holukas / dyco

Normalize lag times across files

License: GNU General Public License v3.0

Python 90.84% TeX 9.16%
python time-series-analysis flux ecosystems eddy-covariance time-delay time-lag

dyco's Introduction


DYCO - Dynamic Lag Compensation

The detection of the time lag between the turbulent departures of measured wind and the scalar of interest is a central step in the calculation of eddy covariance (EC) ecosystem fluxes. When covariance maximization fails to detect a clear peak in the covariance function between the wind and the scalar, current flux calculation software can apply a constant default (nominal) time lag to the respective scalar. However, both the detection of clear covariance peaks in a pre-defined time window and the definition of a reliable default time lag are challenging for compounds that are often characterized by a low signal-to-noise ratio (SNR), such as N2O. In addition, applying one static default time lag may produce inaccurate results if systematic time shifts are present in the raw data.

DYCO is meant to assist current flux processing software in the calculation of fluxes for compounds with low SNR. In the context of current flux processing schemes, the unique features offered as part of the DYCO package include:

  • (i) the dynamic application of progressively smaller time windows during lag search for a reference compound (e.g. CO2),
  • (ii) the calculation of default time lags on a daily scale for the reference compound,
  • (iii) the application of daily default reference time lags to one or more target compounds (e.g. N2O),
  • (iv) the dynamic normalization of time lags across raw data files,
  • (v) the automatic correction of systematic time shifts in target raw data time series, e.g. due to failed synchronization of instrument clocks, and
  • (vi) the application of instantaneous reference time lags, calculated from lag-normalized files, to one or more target compounds.

As DYCO aims to complement current flux processing schemes, it produces final lag-removed files that can be used directly in current flux calculation software.

Scientific background

In ecosystem research, the EC method is widely used to quantify the biosphere-atmosphere exchange of greenhouse gases (GHGs) and energy (Aubinet et al., 2012; Baldocchi et al., 1988). The raw ecosystem flux (i.e. net exchange) is calculated as the covariance between the turbulent vertical wind component measured by a sonic anemometer and the entity of interest, e.g. CO2, measured by a gas analyzer. Because two different instruments are used, wind and gas are not recorded at exactly the same time, resulting in a time lag between the two time series. For the calculation of ecosystem fluxes this time delay has to be quantified and corrected for, otherwise fluxes are systematically biased. Time lags for each averaging interval can be estimated by finding the maximum absolute covariance between the two turbulent time series at different time steps within a pre-defined window of physically possible time lags (e.g., McMillen, 1988; Moncrieff et al., 1997). Lag detection works well when processing fluxes for compounds with high SNR, which is typically the case for e.g. CO2. In contrast, for compounds with low SNR (e.g., N2O, CH4) the cross-covariance function with the turbulent wind component yields noisier results and calculated fluxes are biased towards larger absolute flux values (Langford et al., 2015), which in turn makes the accurate calculation of yearly ecosystem GHG budgets more difficult and less reliable.
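
This lag detection by covariance maximization can be illustrated with a minimal Python sketch. It is not DYCO's internal API: the search window, step size and sign convention are simplifying assumptions (in this sketch, negative shifts mean the scalar arrived later than the wind signal, matching the convention in Figure 2 below).

    import pandas as pd

    def find_max_cov_lag(wind: pd.Series, scalar: pd.Series,
                         win=(-1000, 1000), step=10):
        """Shift the scalar against the wind record by record and return the
        shift (lag in records) with the maximum absolute covariance."""
        w = wind - wind.mean()        # turbulent departures of the vertical wind
        s = scalar - scalar.mean()    # turbulent departures of the scalar
        cov_by_shift = {}
        for shift in range(win[0], win[1] + 1, step):
            cov_by_shift[shift] = w.cov(s.shift(shift))   # NaNs from shifting are ignored
        cov_by_shift = pd.Series(cov_by_shift)
        best_shift = cov_by_shift.abs().idxmax()          # covariance maximization
        return best_shift, cov_by_shift

For 20 Hz raw data, a detected shift of -400 records would correspond to a time lag of 20 s (cf. Figure 2).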

One suggestion for adequately calculating fluxes for compounds with low SNR is to first calculate the time lag for a reference compound with high SNR (e.g. CO2) and then apply the same time lag to the target compound of interest (e.g. N2O), with both compounds being recorded by the same analyzer (Nemitz et al., 2018). DYCO follows up on this suggestion by facilitating dynamic lag detection between the turbulent wind data and a reference compound, and the subsequent application of the detected reference time lags to one or more target compounds.

Processing chain

DYCO processing chain

Figure 1. The DYCO processing chain.

DYCO uses eddy covariance raw data files as input and produces lag-compensated raw data files as output.

The full DYCO processing chain comprises four phases and several iterations during which reference lags are refined in progressively smaller search windows (Figure 1). Generally, the reference lag search is facilitated by a prior normalization of default (nominal) time lags across files. This is achieved by compensating reference and target time series data for daily default reference lags, calculated from high-quality reference lags available around the respective date (Figure 2). Due to this normalization, reference lags fall into a specific, pre-defined and therefore known time range, which in turn allows the application of increasingly narrower time windows during the lag search. This approach has the advantage that reference lags can be calculated from a reference signal that shows clear peaks in the cross-covariance analysis with the wind data and thus yields unambiguous time lags due to its high SNR. If the lag search fails to detect a clear time delay for the reference variable (e.g. during the night), the respective daily default reference lag is used instead. Reference lags can then be used to compensate target variables with low SNR for the detected reference time delays.
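
The idea of progressively narrower search windows can be sketched as follows, reusing find_max_cov_lag from the sketch above. The column names, the number of iterations and the window-shrinking rule are placeholders, not DYCO's exact settings.

    import numpy as np

    def iterative_lag_search(file_data, win=(-1000, 1000), num_iter=3):
        """In each iteration, search all raw files for the reference lag, then
        centre a smaller search window on the median of the lags found."""
        for _ in range(num_iter):
            lags = [find_max_cov_lag(df['w'], df['co2'], win=win)[0]
                    for df in file_data]                  # one reference lag per file
            center = int(np.nanmedian(lags))              # typical reference lag so far
            half = max((win[1] - win[0]) // 4, 10)        # shrink the window each pass
            win = (center - half, center + half)
        return win, lags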

A description of the different Phases along with output examples can be found in the Wiki: Processing Chain.

Normalization example

Figure 2. Example showing the normalization of default reference time lags across files. Shown are the found instantaneous time lags (red) between turbulent wind data and turbulent CO2 mixing ratios, the calculated daily reference default lags (yellow bars), the normalization correction (blue arrows) and the daily default reference lag after normalization (green bar). Negative lag values mean that the CO2 signal lagged behind the wind data, e.g. -400 means that the instantaneous CO2 records arrived 400 records (corresponding to 20 s in this example) later at the analyzer than the wind data. Daily default reference lags were calculated as the 3-day median time lag from a selection of high-quality time lags, i.e. lags for which the cross-covariance analysis yielded a clear covariance peak. The normalization correction is applied dynamically to shift the CO2 data so that the default time lag is found close to zero across files. Note the systematic shift in time lags starting after 27 Oct 2016.
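
The daily default reference lags and the normalization correction could be computed along the following lines. This is a hedged sketch based on the description in Figure 2: the column names, the quality flag and the exact aggregation are assumptions, not DYCO's implementation.

    import pandas as pd

    def daily_default_lags(lags: pd.DataFrame) -> pd.Series:
        """3-day median of high-quality instantaneous reference lags.
        Expects a datetime-indexed DataFrame with columns 'lag' (records) and
        'clear_peak' (True if the covariance peak was unambiguous)."""
        good = lags.loc[lags['clear_peak'], 'lag']
        daily = good.resample('D').median()
        return daily.rolling(window=3, center=True, min_periods=1).median()

    def normalize_file(df: pd.DataFrame, default_lag: int) -> pd.DataFrame:
        """Shift the scalar columns by the daily default reference lag so that
        the remaining instantaneous lag is found close to zero (negative lags
        mean the scalars arrived later than the wind, as in Figure 2)."""
        out = df.copy()
        out[['co2', 'n2o']] = out[['co2', 'n2o']].shift(int(default_lag))
        return out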

Installation

DYCO can be installed via pip:

pip install dyco

Usage

DYCO is run from the command line interface (CLI).

usage: dyco.py [-h] [-i INDIR] [-o OUTDIR] [-fnd FILENAMEDATEFORMAT] [-fnp FILENAMEPATTERN] [-flim LIMITNUMFILES] [-fgr FILEGENRES] [-fdur FILEDURATION] [-dtf DATATIMESTAMPFORMAT] [-dres DATANOMINALTIMERES] [-lss LSSEGMENTDURATION] [-lsw LSWINSIZE] [-lsi LSNUMITER] [-lsf {0,1}] [-lsp LSPERCTHRES] [-lt TARGETLAG] [-del {0,1}] var_reference var_lagged var_target [var_target ...]

  • Example usage with full arguments can be found here: Example
  • For an overview of arguments see here: Usage
  • DYCO creates a range of output folders which are described here: Results Output Folders
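
The following is a hypothetical invocation pieced together from the usage string above. The directories, filename settings and lag-search settings are placeholders; the wind and CO2 variable names are taken from the package's test data, the N2O variable name is assumed, and the interpretation of the positional arguments (reference = wind, lagged = CO2, target = N2O) follows the test code.

    dyco.py -i ./input -o ./output -fnd %Y%m%d%H%M%S -fnp "*.csv" -lsw 1000 -lsi 3 w_ms-1_rot_turb co2_ppb_qcl_turb n2o_ppb_qcl_turb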

Documentation

Please refer to the Wiki for documentation and additional examples.

Real-world examples

The ICOS Class 1 site Davos (CH-Dav), a subalpine forest ecosystem station in the east of Switzerland, provides one of the longest continuous time series (24 years and running) of ecosystem fluxes globally. Since 2016, measurements of the strong GHG N2O have been recorded by a closed-path gas analyzer that also records CO2. To calculate fluxes using the EC method, wind data from the sonic anemometer are combined with instantaneous gas measurements from the gas analyzer. However, the air sampled by the gas analyzer needs a certain amount of time to travel from the tube inlet to the measurement cell in the analyzer and is thus lagged behind the wind signal. The lag between the two signals needs to be compensated for by detecting and then removing the time lag at which the cross-covariance between the turbulent wind and the turbulent gas signal reaches its maximum absolute value. This generally works well when using CO2 (high SNR) but is challenging for N2O (low SNR). Using covariance maximization to search for the lag between wind and N2O mostly fails to accurately detect the time lag between the two variables (noisy cross-correlation function), resulting in relatively noisy fluxes. However, since N2O has similar adsorption/desorption characteristics to CO2, it is valid to assume that both compounds need approximately the same time to travel through the tube to the analyzer, i.e. the time lag for both compounds in relation to the wind is similar. Therefore, DYCO can be applied (i) to calculate time lags across files for CO2 (the reference compound), and then (ii) to remove the found CO2 time delays from the N2O signal (the target compound). The lag-compensated files produced by DYCO can then be used to calculate N2O fluxes. Since DYCO normalizes time lags across files and compensates the N2O signal for instantaneous CO2 lags, the true lag between wind and N2O can be found close to zero, which in turn facilitates the application of a small time window for the final lag search during flux calculations.
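
Step (ii) for this use case could look roughly like the following sketch, reusing find_max_cov_lag from the Scientific background section. The column names are assumptions, and the narrow search window reflects that lags are close to zero after normalization.

    def compensate_target(df, wind_col='w', ref_col='co2', target_col='n2o',
                          win=(-100, 100)):
        """Detect the instantaneous CO2 lag in a lag-normalized file and remove
        the same lag from the N2O signal before flux calculation."""
        lag, _ = find_max_cov_lag(df[wind_col], df[ref_col], win=win, step=1)
        out = df.copy()
        out[target_col] = out[target_col].shift(int(lag))   # same sign convention as above
        return out, lag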

Another application example is managed grasslands, where the biosphere-atmosphere exchange of N2O is often characterized by sporadic high-emission events (e.g., Hörtnagl et al., 2018; Merbold et al., 2014). While high N2O quantities can be emitted during and after management events such as fertilizer application and ploughing, fluxes between those events typically remain low and often below the limit of detection of the applied analyzer. In this case, calculating N2O fluxes works well during the high-emission periods (high SNR) but is challenging during the rest of the year (low SNR). Here, DYCO can be used to first calculate time lags for a reference gas measured in the same analyzer (e.g. CO2, CO, CH4) and then remove the reference time lags from the N2O data.

Contributing

All contributions in the form of code, bug reports, comments or general feedback are always welcome and greatly appreciated! Credit will always be given.

  • Pull requests: If you added new functionality or made the DYCO code run faster (always welcome), please create a fork in GitHub, make the contribution public in your repo and then issue a pull request. Please include tests in your pull requests.
  • Issues: If you experience any issue, please use the issue tracker to submit it as an issue ticket with the label 'bug'. Please also include a minimal code example that produces the issue.
  • Feature request: If there is a feature that you would like to see in a later version, please use the issue tracker to submit it as an issue ticket with the label 'feature request'.
  • Contact details: For direct questions or enquiries the maintainer of this repository can be contacted directly by writing an email with the title "DYCO" to: [email protected]

Acknowledgements

This work was supported by the Swiss National Science Foundation SNF (ICOS CH, grant nos. 20FI21_148992, 20FI20_173691) and the EU project Readiness of ICOS for Necessities of integrated Global Observations RINGO (grant no. 730944).

References

Aubinet, M., Vesala, T., Papale, D. (Eds.), 2012. Eddy Covariance: A Practical Guide to Measurement and Data Analysis. Springer Netherlands, Dordrecht. https://doi.org/10.1007/978-94-007-2351-1

Baldocchi, D.D., Hincks, B.B., Meyers, T.P., 1988. Measuring Biosphere-Atmosphere Exchanges of Biologically Related Gases with Micrometeorological Methods. Ecology 69, 1331–1340. https://doi.org/10.2307/1941631

Hörtnagl, L., Barthel, M., Buchmann, N., Eugster, W., Butterbach-Bahl, K., Díaz-Pinés, E., Zeeman, M., Klumpp, K., Kiese, R., Bahn, M., Hammerle, A., Lu, H., Ladreiter-Knauss, T., Burri, S., Merbold, L., 2018. Greenhouse gas fluxes over managed grasslands in Central Europe. Glob. Change Biol. 24, 1843–1872. https://doi.org/10.1111/gcb.14079

Langford, B., Acton, W., Ammann, C., Valach, A., Nemitz, E., 2015. Eddy-covariance data with low signal-to-noise ratio: time-lag determination, uncertainties and limit of detection. Atmospheric Meas. Tech. 8, 4197–4213. https://doi.org/10.5194/amt-8-4197-2015

McMillen, R.T., 1988. An eddy correlation technique with extended applicability to non-simple terrain. Bound.-Layer Meteorol. 43, 231–245. https://doi.org/10.1007/BF00128405

Merbold, L., Eugster, W., Stieger, J., Zahniser, M., Nelson, D., Buchmann, N., 2014. Greenhouse gas budget (CO2, CH4 and N2O) of intensively managed grassland following restoration. Glob. Change Biol. 20, 1913–1928. https://doi.org/10.1111/gcb.12518

Moncrieff, J.B., Massheder, J.M., de Bruin, H., Elbers, J., Friborg, T., Heusinkveld, B., Kabat, P., Scott, S., Soegaard, H., Verhoef, A., 1997. A system to measure surface fluxes of momentum, sensible heat, water vapour and carbon dioxide. J. Hydrol. 188–189, 589–611. https://doi.org/10.1016/S0022-1694(96)03194-0

Nemitz, E., Mammarella, I., Ibrom, A., Aurela, M., Burba, G.G., Dengel, S., Gielen, B., Grelle, A., Heinesch, B., Herbst, M., Hörtnagl, L., Klemedtsson, L., Lindroth, A., Lohila, A., McDermitt, D.K., Meier, P., Merbold, L., Nelson, D., Nicolini, G., Nilsson, M.B., Peltola, O., Rinne, J., Zahniser, M., 2018. Standardisation of eddy-covariance flux measurements of methane and nitrous oxide. Int. Agrophysics 32, 517–549. https://doi.org/10.1515/intag-2017-0042

dyco's People

Contributors: danielskatz, holukas

dyco's Issues

Software Paper comments

State of the Field

  • Many other packages relate to eddy-covariance data. Does DYCO perform a unique function? Is it intended to be used alongside other packages? Independently? I would love a statement mapping out how/if it relates to other eddy covariance packages.

Documentation comments

Statement of need

  • Your About heading is pretty technical. Your Scientific Background section sets up why DYCO is important really well. I'd switch the order of those two paragraphs and rename the "About" heading, or provide a short summary of your scientific background in your About section.

  • Who is your target audience?

Community Guidelines
DYCO could use a quick statement outlining how to:

  • Report issues or problems with DYCO

  • Seek support if DYCO doesn't work properly

  • Because DYCO is/was tied to an ETH account, there should be a paragraph explaining how non-ETH members can/should interact with DYCO when contributing, reporting issues or seeking help. Possibly this won't be an issue if the wiki link is changed. Thanks again for migrating @holukas

Example Usage
Note: I'm currently not able to view the text of the wiki, so the below is from my notes. I can link to exact lines once the wiki link is updated.

  • The text for the Results from Phase 4 figure has the sentence:

"In this example, the covariance corresponds to the raw ecosystem flux of CO2, with negative values translating to CO2-uptake by the ecosystem (CO2 sink) and positive values translating to CO2-emission (CO2 source)."

This phrasing makes the axes on the Phase 4 diagrams difficult to interpret. Is this the covariance of CO2 measurements? Does the sentence mean that negative "covariance" values should be interpreted literally as fluxes? Clearer units on the axes could clear this up.

  • The Scientific Background talks about using Dyco for N2O fluxes, but the examples are only of CO2 fluxes. Could the example show how N2O monitoring is improved?

Example

A few notes on the Example.

20161024100000.csv 20161024110000.csv 20161024120000.csv 20161024130000.csv 20161024140000.csv
20161024103000.csv 20161024113000.csv 20161024123000.csv 20161024133000.csv 20161024143000.csv

Please clarify that these files are available in example/input_data.

Test commented out

I see in test_dyco.py that at least one test is commented out:


    # def test_detect_covariance_peaks(self):
    #     """Test peak detection only"""
    #     filepath = 'test_data/test_raw_data/20161020113000.csv'
    #     segment_df = files.read_raw_data(filepath=filepath, data_timestamp_format='%Y-%m-%d %H:%M:%S.%f')
    #     lagsearch_df = lag.LagSearch.setup_lagsearch_df(win_lagsearch=[-1000, 1000],
    #                                                     shift_stepsize=10,
    #                                                     segment_name='20161031230000_iter1')
    #     lagsearch_df = \
    #         lag.LagSearch.find_max_cov_peak(segment_df=segment_df,
    #                                         lagsearch_df=lagsearch_df,
    #                                         ref_sig='w_ms-1_rot_turb',
    #                                         lagged_sig='co2_ppb_qcl_turb')
    #     lagsearch_df, props_peak_auto = lag.LagSearch.find_peak_auto(df=lagsearch_df)
    #
    #     self.assertEqual(lagsearch_df.loc[lagsearch_df['flag_peak_max_cov_abs'] == 1, 'shift'].values[0], -290)
    #     self.assertEqual(lagsearch_df.loc[lagsearch_df['flag_peak_auto'] == 1, 'shift'].values[0], -290)
    #     self.assertEqual(lagsearch_df.loc[lagsearch_df['flag_peak_max_cov_abs'] == 1, 'cov_abs'].values[0],
    #                      223.13887346667508)

Why is this? Please either fix the test or remove the dead code.

openjournals/joss-reviews/issues/2575

Friendlier wiki

The link to the wiki from the README drops onto a page from which it's not obvious where the user should go. I think it would be better if the README dropped the user onto a friendly landing page in the wiki from which it would be clear where to start.

In particular the page named "DYCO Wiki" appears to just duplicate the existing landing page.

openjournals/joss-reviews/issues/2575

Inputs/outputs

I feel like it's a little unclear to me from the README what the inputs and outputs of the software are.

Is it possible to summarize very briefly what it does? I feel like there's currently a wall-of-text issue.

openjournals/joss-reviews/issues/2575

Command-line usage

The description here indicates that DYCO creates a number of folders and output files.

I think this isn't really appropriate for a tool that's expected to be used within Python. If this is a library, its outputs should be variables that can be used programmatically. If this isn't a library, then there should be a command-line way to interact with the script.

openjournals/joss-reviews/issues/2575

'Wiki' link now behind ETH login

The "Wiki" link is now behind an ETH login.

When the account was hosted by ETH I was able to access the Wiki, the documentation and the installation instructions, so I know they are there, but now I can't. Hopefully the link can just be updated?

Once the wiki content is available again, I would consider the Installation instructions and most of Example Usage check marks complete.
