ouranosinc / xclim

Library of derived climate variables, i.e. climate indicators, based on xarray.

Home Page: https://xclim.readthedocs.io/en/stable/

License: Apache License 2.0

climate-analysis climate-science python xarray icclim netcdf4 xclim anuclim dask

xclim's Introduction

xclim: Climate services library

Project status: Active – the project has reached a stable, usable state and is being actively developed.

xclim is an operational Python library for climate services, providing numerous climate-related indicator tools with an extensible framework for constructing custom climate indicators, statistical downscaling and bias adjustment of climate model simulations, as well as climate model ensemble analysis tools.

xclim is built using xarray and can seamlessly benefit from the parallelization handling provided by dask. Its objective is to make it as simple as possible for users to perform typical climate services data treatment workflows. Leveraging xarray and dask, users can easily bias-adjust climate simulations over large spatial domains or compute indices from large climate datasets.

For example, the following would compute monthly mean temperature from daily mean temperature:

import xclim
import xarray as xr

ds = xr.open_dataset(filename)  # "filename": path to a daily dataset with a "tas" variable
tg = xclim.atmos.tg_mean(ds.tas, freq="MS")

For applications where metadata and missing values are important to get right, xclim provides a class for each index that validates inputs, checks for missing values, converts units and assigns metadata attributes to the output. This also provides a mechanism for users to customize the indices to their own specifications and preferences. xclim currently provides over 150 indices related to mean, minimum and maximum daily temperature, daily precipitation, streamflow and sea ice concentration, numerous bias-adjustment algorithms, as well as a dedicated module for ensemble analysis.

Quick Install

xclim can be installed from PyPI:

$ pip install xclim

or from Anaconda (conda-forge):

$ conda install -c conda-forge xclim

Documentation

The official documentation is at https://xclim.readthedocs.io/

How to make the most of xclim: Basic Usage Examples and In-Depth Examples.

Conventions

To provide a coherent interface, xclim follows several sets of conventions. In particular, input data should follow the CF conventions for variable attributes whenever possible, and variable names are usually those used in CMIP6, when they exist.

However, xclim will always assume the temporal coordinate is named "time". If your data uses another name (for example: "T"), you can rename the variable with:

ds = ds.rename(T="time")

Contributing to xclim

xclim is in active development and is being used in production by climate services specialists around the world.

  • If you're interested in participating in the development of xclim by suggesting new features, new indices, or reporting bugs, please leave us a message on the issue tracker.
    • If you have a support/usage question or would like to translate xclim to a new language, be sure to check out the existing discussions first!
  • If you would like to contribute code or documentation (which is greatly appreciated!), check out the Contributing Guidelines before you begin!

How to cite this library

If you wish to cite xclim in a research publication, we kindly ask that you refer to our article published in The Journal of Open Source Software (JOSS): https://doi.org/10.21105/joss.05415

To cite a specific version of xclim, the bibliographical reference information can be found through Zenodo.

License

This is free software: you can redistribute it and/or modify it under the terms of the Apache License 2.0. A copy of this license is provided in the code repository (LICENSE).

Credits

xclim development is funded through Ouranos, Environment and Climate Change Canada (ECCC), the Fonds vert and the Fonds d'électrification et de changements climatiques (FECC), the Canadian Foundation for Innovation (CFI), and the Fonds de recherche du Québec (FRQ).

This package was created with Cookiecutter and the audreyfeldroy/cookiecutter-pypackage project template.

xclim's People

Contributors

aaronspring, agstephens, aulemahal, balinus, beauprel, bzah, cehbrecht, coxipi, davidcaron, dependabot[bot], dougiesquire, hem-w, huard, jamiejquinn, javierdiezsierra, jeremyfyke, juliettelavoie, lamadr, marielabonte, pre-commit-ci[bot], profesorpaiche, raquelalegre, rondeaug, sarahg-579462, saschahofmann, sbiner, thomasjkeel, tlogan2000, vindelico, zeitsperre


xclim's Issues

Enforcing PEP8 Conventions

Flake8 is a library for examining PEP8 violations within a code base. For those who don't know, PEP8 is a body of code-composition rules that keeps code readable and maintainable, and helps gradually remove old/unused/hazardous conventions from Python code (like imports that can invisibly rebind base functions; e.g. F403: "from library import *").

After the changes introduced to builds in #25, we now have a Python 3.6 test build with Tox-Travis that evaluates whether a PR (both within the xclim and tests folders) is consistent with PEP8. As presently set, any violation of PEP8 fails the build, and a failed Travis build prevents the code coverage checks (i.e., if the build is failing, why bother seeing if coverage has decreased). This is a little too strict and probably not optimal at this point.

For that reason, we need to do the following:

  • Write code like we normally do.
  • Within a terminal (or via a test configuration in PyCharm - I can help set it up) run $ flake8 xclim and $ flake8 tests to find PEP8 violations.
  • Fix the PEP8 violations and/or add specific PEP8 conventions to ignore.

If we're all using PyCharm, I can create a configuration library that highlights or marks as errors the conventions that we decide to enforce (we should strive to be strict with conventions, but I get it).

TL;DR: I would like to know if there are any PEP8 conventions that we don't want to follow. We can opt to ease up on PEP8 violations for now and be more strict in the future. The xclim library is fully PEP8 compliant but the tests as they are written break the following conventions:

  • F403: Wildcard import (from library import *)
  • F405: Function may be undefined or defined from a * import
  • F841: Unassigned variable

Once the code base is either 100% compliant or we allow some wiggle room, builds will start reporting code coverage and whether the test suite passes on Python 2.7, 3.4, 3.5, and 3.6.
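If we go the wiggle-room route, the ignores can live in the flake8 configuration; a possible setup.cfg fragment (illustrative values):

```ini
[flake8]
# Relax the conventions the test suite currently breaks;
# tighten this list as the code base is cleaned up.
ignore = F403, F405, F841
max-line-length = 120
exclude = .tox, build, docs
```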

Benchmark Processing Libraries for Comparisons

Libraries we've identified as being useful for grid manipulation:

Feel free to edit this post to add or remove libraries we are certain we will or will not look into.

Benchmarking libraries (see https://perf.readthedocs.io/en/latest/)

  • performance: the Python benchmark suite which uses perf
  • Airspeed Velocity: A simple Python benchmarking tool with web-based reporting
  • pytest-benchmark
  • boltons.statsutils

Compare outputs with icclim

Description

Include a comparison with ICCLIM results in the tests.

In particular, check how ICCLIM handles spells that last longer than the resampling period.

Indicators for OPG

Check the box when an index is coded, includes a docstring, and is accompanied by unit tests.

| Indicator | Description | Units | Done |
| --- | --- | --- | --- |
| Heat wave index | Days that are part of a heat wave (5 or more consecutive days with Tmax > 25 °C) | days/year | |
| Cold spell index | Days that are part of a cold spell (5 or more consecutive days with Tavg < −10 °C) | days/year | |
| Freeze/thaw cycles | Days when Tmax > 0 °C and Tmin < 0 °C | days/year | |
| Very hot days | Days when Tmax > 30 °C | days/year | |
| Cooling degree days | Air conditioning indicator: number of degrees a given day's Tavg is > 18 °C | sum of degree-days/year | |
| Heating degree days | Heating indicator: number of degrees a given day's Tavg is < 18 °C | sum of degree-days/year | |
| Freshet start | First day of year that is followed by > 5 consecutive days each having Tavg > 0 °C | day of year | |
| Winter rain | Ratio of rainfall to total precipitation during December/January/February (rainfall assumed when Tavg > 0 °C) | ratio | |
| Possible rain-on-snow events | Days with rainfall that follow 7 or more days of Tavg < 0 °C (rainfall assumed when Tavg > 0 °C) | days/year | |
| Maximum 3-day spring precipitation | Maximum 3-day total precipitation during spring (March–May) | mm | |
| Mean monthly temperature | Mean monthly temperature | °C | |
| Max monthly temperature | Max monthly temperature (daily max of month) | °C | |
| Min monthly temperature | Min monthly temperature (daily min of month) | °C | |
| Total monthly precipitation | Sum of all precipitation during a month | mm | |
| Frost days | Number of days with Tmin < 0 °C | days | |
| Consecutive dry days | Maximum number of consecutive days with precip < 1 mm | days | |
| Heavy precipitation days index | Number of days when precipitation > 10 mm | days | |
| Heavy precipitation days | Number of days when precipitation > 20 mm | days | |
| Monthly temperature variability | Standard deviation of monthly temperature (over e.g. a 30-year period) | °C | |
| Monthly precipitation variability | Coefficient of variation of monthly precipitation (over e.g. a 30-year period) | – | |
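Many of these reduce to a threshold-plus-resample pattern; as a sketch in plain xarray (not the eventual xclim API), "Very hot days" on synthetic data:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two years of synthetic daily maximum temperature (Kelvin).
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
tasmax = xr.DataArray(
    290 + 15 * np.sin(2 * np.pi * (time.dayofyear / 365.25)),
    coords={"time": time}, dims="time", name="tasmax",
)

# "Very hot days": count of days per year with Tmax > 30 degC (303.15 K).
very_hot_days = (tasmax > 30.0 + 273.15).resample(time="YS").sum(dim="time")
very_hot_days.attrs.update(units="days", long_name="Days with Tmax > 30C")
```

The same skeleton, with a different comparison and reduction, covers frost days, heavy precipitation days, and most of the other counting indicators above.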

Decorators - Daily time step and Units conversion

Description

XCLIM (currently) assumes a daily time step:

  • Create decorator to test monotonic daily time

All functions should also be decorated with a units converter decorator:

  • Temp units to Kelvin (from C and F(?))
  • Pr units to 'mm' (from kg m-2 s-1 and mm s-1, others?)

In my mind, these replace the various 'check_valid' decorators (which currently only check and do not convert) and are potentially too strict on CF compliance. CF checking could eventually be done on the WPS/PAVICS side for non-expert users (i.e., we assume that internal XCLIM users know what they are doing).
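As a rough sketch of what such a decorator could look like (hypothetical helper names; not xclim's actual implementation), a wrapper can normalise temperature inputs to Kelvin before calling the index:

```python
import functools

import numpy as np
import pandas as pd
import xarray as xr

def convert_temp_to_kelvin(func):
    """Hypothetical decorator: normalise temperature units to Kelvin
    (from Celsius or Fahrenheit) before calling the wrapped index."""
    @functools.wraps(func)
    def wrapper(da, *args, **kwargs):
        units = da.attrs.get("units", "K")
        if units in ("C", "degC", "celsius"):
            da = da + 273.15
            da.attrs["units"] = "K"
        elif units in ("F", "degF", "fahrenheit"):
            da = (da - 32.0) * 5.0 / 9.0 + 273.15
            da.attrs["units"] = "K"
        return func(da, *args, **kwargs)
    return wrapper

@convert_temp_to_kelvin
def annual_max(tas):
    return tas.resample(time="YS").max(dim="time")

# Celsius input is converted transparently before the index runs.
time = pd.date_range("2000-01-01", periods=365, freq="D")
tas_c = xr.DataArray(np.linspace(-10.0, 25.0, 365),
                     coords={"time": time}, dims="time",
                     attrs={"units": "C"})
out = annual_max(tas_c)
```

A companion decorator for the monotonic-daily-time check would follow the same shape, raising instead of converting.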

Get the autodoc to work on readthedocs

Description

Make sure the API docs build on readthedocs.
May need to set a conda environment for readthedocs.

See bird-house for an example of how this is done.

readthedocs.yml

build:
    image: latest
python:
    version: 3.6
conda:
    file: environment.yml

test_365_day broken on master

For some reason, TestDailyDownsampler::test_365_day is broken even though it appeared to be building fine on the PR. The merge to master this morning is erroring out on all branches except Py34. What I'm seeing:

self = <test_utils.TestDailyDownsampler object at 0x7f589280ae48>

    def test_365_day(self):
    
        # 365_day calendar
        # generate test DataArray
        units = 'days since 2000-01-01 00:00'
        time_365 = cftime.num2date(np.arange(0, 1 * 365), units, '365_day')
        da_365 = xr.DataArray(np.arange(time_365.size), coords=[time_365], dims='time', name='data')
        units = 'days since 2001-01-01 00:00'
        time_std = cftime.num2date(np.arange(0, 1 * 365), units, 'standard')
        da_std = xr.DataArray(np.arange(time_std.size), coords=[time_std], dims='time', name='data')
    
        for freq in 'YS MS QS-DEC'.split():
>           resampler = da_std.resample(time=freq)

@huard any idea what's going on?

Add xclim to PyPI

I'll be parking xclim on PyPI in order to ensure that the name doesn't get taken. This will make it so that anyone running Python will be able to download this library using the following command:

pip install xclim

Resample / Custom resample - mask incomplete periods

Description

In both the native resample and @sbiner's custom 'daily_downsampler', current results keep incomplete periods as valid data.

For now, the main issue is seasonal resampling with QS-DEC. For example, the DJF value of the initial year of a multi-file dataset (containing only Jan/Feb) has valid data and a starting time stamp in December of year-1. Similarly, the last December in a multi-file data array becomes a valid DJF season even when only one month is present.

I suggest that the xclim default should mask (assign NaNs to?) these time steps. XCLIM would have a default behaviour 'mask_incplt=True', which could be changed to 'False' to return to the default xarray resample behaviour if wanted.
I think a mechanism like this will help us long-term when eventually dealing with station data (lots of missing values and potential problems not just at the beginning/end of the calculations).

thoughts?
Ideas on implementing this?
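A minimal sketch of this masking (plain xarray; the proposed flag itself is not implemented here) compares each period's actual day count with the expected count derived from the period edges:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Daily data whose first DJF period only contains Jan/Feb,
# and whose last DJF period only contains December.
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
da = xr.DataArray(np.ones(time.size), coords={"time": time}, dims="time")

freq = "QS-DEC"
mean = da.resample(time=freq).mean(dim="time")
count = da.resample(time=freq).count(dim="time")

# Expected number of days per period, from consecutive period edges.
edges = pd.date_range(mean.time.values[0], periods=mean.time.size + 1, freq=freq)
expected = xr.DataArray(np.diff(edges).astype("timedelta64[D]").astype(int),
                        coords={"time": mean.time}, dims="time")

# Mask (set to NaN) any period with missing days.
masked = mean.where(count == expected)
```

Here the first and last DJF periods are incomplete and come out as NaN, while interior seasons keep their values; the same check works for station data with gaps anywhere in the record.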

Rename this library to "xclim"

HailStorm is regrettably taken on PyPI, so @tlogan2000 and I are thinking of renaming this library to pressure-drop (importing as drp).

Why pressure-drop? Why not?

This should be fairly painless. I'll be uploading a version 0.0.1 to park the name on PyPI so we don't run into this issue again.

Support for GRIB files

Description

Make sure xclim indices work with GRIB files.
The main issue is probably the installation of the dependencies required for xarray to support GRIB (pynio). I've played with it a bit, and it seems the conda-forge version is not compatible with the libraries installed from the default conda channel. My understanding is that we need to specify conda-forge as the default channel to get it to work.

That probably means adding an environment.yml file to xclim.

See http://tech.weatherforce.org/blog/multicore-grib-processing-xarray/index.html for an example.

Fit distributions using L-moments in addition to ML

We should be able to fit these six distributions using both maximum likelihood and the method of L-Moments:

  • Gumbel (right)
  • GEV
  • LogNormal2 (loc=0)
  • LogNormal3
  • Pearson type III
  • Log-Pearson type III

This new functionality would be exposed in indices.generic.fit, fa and frequency_analysis as a new parameter estimator taking the default value ml for the default scipy ML estimator, and lmom for L-moments estimator.
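For the default ML path, scipy already provides the fit; the sketch below shows the proposed estimator switch, with the L-moments branch left to an external package such as lmoments3 (hypothetical wiring, not the final API):

```python
import numpy as np
from scipy import stats

# Synthetic annual maxima drawn from a GEV distribution.
rng = np.random.default_rng(42)
annual_maxima = stats.genextreme.rvs(c=-0.1, loc=100.0, scale=15.0,
                                     size=200, random_state=rng)

def fit(data, dist=stats.genextreme, estimator="ml"):
    """Hypothetical front-end: 'ml' delegates to scipy's maximum-likelihood
    fit; an 'lmom' branch would delegate to an L-moments package."""
    if estimator == "ml":
        return dist.fit(data)
    raise NotImplementedError("L-moment fitting requires an external package.")

shape, loc, scale = fit(annual_maxima, estimator="ml")
```

The same `estimator` keyword would then be threaded through `fa` and `frequency_analysis`.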

Handling Requirements

@huard

In order to get Travis to install properly, I needed to hard code the requirements in setup.py:

xclim/setup.py

Line 17 in 6808fd2

requirements = ['dask', 'numpy', 'scipy', 'xarray', ]

I'm wondering, though, if we should try to reimplement the previous model, seeing as the error that likely caused it was that we didn't specify the working directory like this cookiecutter template does:

https://github.com/wdm0006/cookiecutter-pipproject/blob/master/{{cookiecutter.app_name}}/setup.py

Having a requirements.txt is good for pinning versions of libraries as they're integrated, but having them hard-coded in the setup.py prevents us from just putting a ton of packages in there with pip freeze -r requirements.txt and running into an "everything plus the kitchen sink" scenario.

Thoughts?

Extend the xarray/xclim object class to accept "groupby()" method

Building on the non-standard calendar issue raised in #30, it would be good to see if there are ways of extending the DataArray class to allow us to perform resampling via class methods.

Since this DataArray class is built from xarray and pandas, it might be worthwhile to build out the existing class to include groupby() and potentially other methods. I'll be starting a new branch called class_extension to look into ways of adding these methods. Help appreciated.
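Rather than subclassing DataArray (which xarray discourages), the same effect can be had with xarray's accessor registration; a minimal sketch:

```python
import numpy as np
import pandas as pd
import xarray as xr

@xr.register_dataarray_accessor("clim")
class ClimAccessor:
    """Sketch: attach climate helpers to every DataArray without
    subclassing, via xarray's accessor mechanism."""
    def __init__(self, da):
        self._da = da

    def annual_mean(self):
        return self._da.resample(time="YS").mean(dim="time")

# Any DataArray with a "time" coordinate now exposes .clim methods.
time = pd.date_range("2000-01-01", periods=366, freq="D")
tas = xr.DataArray(np.full(time.size, 280.0),
                   coords={"time": time}, dims="time")
result = tas.clim.annual_mean()
```

This keeps resampling and groupby logic on the accessor rather than on a fragile DataArray subclass.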

Names for indices

Description

I suggest the indices.py file uses long, descriptive names for indices (heating_degree_days) instead of hdd. Then, in the ICCLIM module, we can map the abbreviations to the long names. I think it will make the library more "readable".
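A minimal sketch of the proposed layout (toy function, illustrative only): the descriptive name lives in indices.py and the ICCLIM module maps abbreviations onto it:

```python
# indices.py side: long, descriptive names.
def heating_degree_days(tas, thresh=17.0):
    """Toy stand-in: sum of degrees below the threshold (degC input)."""
    return sum(max(thresh - t, 0.0) for t in tas)

# ICCLIM module side: abbreviation -> descriptive implementation.
ABBREVIATIONS = {
    "HD17": heating_degree_days,
}

hdd = ABBREVIATIONS["HD17"]([15.0, 16.0, 20.0])
```

The abbreviation table is the only ICCLIM-specific code; everything readable stays under the long names.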

Quality control checks

Description

ICCLIM performs quality control checks on input data depending on the variable. This is something we could add to XCLIM.

See chapter 3 of ICCLIM docs

daily precipitation amount RR:

  • must be equal or exceed 0 mm
  • must be less than 300.0 mm
  • must not be repetitive (i.e. exactly the same amount) for 10 days in a row if amount larger than 1.0 mm
  • must not be repetitive (i.e. exactly the same amount) for 5 days in a row if amount larger than 5.0 mm
  • dry periods receive flag = 1 (suspect), if the amount of dry days lies outside a 14·bivariate standard deviation

daily maximum temperature TX:

  • must exceed -90.0 ◦ C
  • must be less than 60.0 ◦ C
  • must exceed or equal daily minimum temperature (if exists)
  • must exceed or equal daily mean temperature (if exists)
  • must not be repetitive (i.e. exactly the same) for 5 days in a row
  • must be less than the long term average daily maximum temperature for that calendar day + 5 times standard deviation (calculated for a 5 day window centered on each calendar day over the whole period)
  • must exceed the long term average daily maximum temperature for that calendar day - 5 times standard deviation (calculated for a 5 day window centered on each calendar day over the whole period)

Daily minimum temperature TN:

  • must exceed -90.0 ◦ C
  • must be less than 60.0 ◦ C
  • must be less or equal to daily maximum temperature (if exists)
  • must be less or equal to daily mean temperature (if exists)
  • must not be repetitive (i.e. exactly the same) for 5 days in a row
  • must be less than the long term average daily minimum temperature for that calendar day + 5 times standard deviation (calculated for a 5 day window centered on each calendar day over the whole period)
  • must exceed the long term average daily minimum temperature for that calendar day - 5 times standard deviation (calculated for a 5 day window centered on each calendar day over the whole period)

Daily mean temperature TG:

  • must exceed -90.0 ◦ C
  • must be less than 60.0 ◦ C
  • must exceed or equal daily minimum temperature (if exists)
  • must be less or equal to daily maximum temperature (if exists)
  • must not be repetitive (i.e. exactly the same) for 5 days in a row
  • must be less than the long term average daily mean temperature for that calendar day + 5 times standard deviation (calculated for a 5 day window centered on each calendar day over the whole period)
  • must exceed the long term average daily mean temperature for that calendar day - 5 times standard deviation (calculated for a 5 day window centered on each calendar day over the whole period)
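The range and repetition checks above translate directly into code; a minimal sketch for daily precipitation (plain Python, with only the range check and the 10-day/1 mm repetition rule from the list above):

```python
def qc_daily_precip(rr):
    """Sketch of ICCLIM-style QC for daily precipitation (mm):
    flags out-of-range values and suspicious repetitions.
    Returns the sorted list of flagged day indices."""
    flagged = set()
    # Range check: 0 mm <= RR < 300 mm.
    for i, v in enumerate(rr):
        if v < 0.0 or v >= 300.0:
            flagged.add(i)
    # Repetition check: identical amounts > 1.0 mm for 10 days in a row.
    run_start, run_len = 0, 1
    for i in range(1, len(rr)):
        if rr[i] == rr[i - 1]:
            run_len += 1
        else:
            run_start, run_len = i, 1
        if run_len >= 10 and rr[i] > 1.0:
            flagged.update(range(run_start, i + 1))
    return sorted(flagged)

# One negative value, then ten identical 2.5 mm days: all eleven flagged.
flags = qc_daily_precip([-1.0] + [2.5] * 10 + [0.0, 5.0])
```

The temperature checks (TX/TN/TG ordering, climatological ±5σ bounds) would slot into the same pattern, each returning flags rather than raising, in line with ICCLIM's suspect-value flags.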

[PyOpenSci] Submit xclim to the Journal of Open Source Software

https://joss.theoj.org/about

The Journal of Open Source Software is an initiative for Research Software Engineers (RSEs) who primarily develop tools to facilitate research to be able to showcase their accomplishments. It also provides an easy method to inform other researchers about Ouranos and xclim (or possibly other work we may do).

The approach I'm suggesting is that once we've finalized enough of the library/docs to merit a 1.0 tagged release, we create a one or two page document describing the library and submit this to the journal. They perform a peer-review process and suggest changes that augment the quality of the docs and utility of the library and, just like a typical journal, if accepted we're provided a DOI, a link on their site, and author credit.

Absolutely not a priority for now but I wanted to open a ticket here so that this is on our radars for maybe Q2 2019.

varia : calendars, time_bnds, ...

Here are a few notes/considerations related to xclim:

  • we should take care to pass xarray.DataArray to the different indices and not xarray.Dataset. It may work, but there are problems downstream.

  • Nothing gets done with xarray.resample until we actually read the data. To keep in mind for the benchmark.

  • xarray.resample does not support 365_day and 360_day calendars. For a reason I don't yet understand (I raised an issue about it), using xr.open_dataset() on a 365_day netCDF works (but it should not).

  • if we aim at producing standard netCDF files, we will eventually have to take care of the time_bnds. That implies that we have to compute them and maybe/probably attach them as coordinates to the output xarray.DataArray.

Feel free to comment

Agree on a versioning schema

As we start adding features and indices, it would be good to use a semantic versioning (i.e., major.minor.patch-build) release system where:

Major = the finished product with a stable API (i.e., Milestone v1.0)
Minor = additions of modules, or changes to indices that likely break the API
Patch = bug fixes and fixes to algorithms that break tests
Build = either "dev" or "rc" (release candidate: the version we send up to PyPI)

Bumpversion is a way of managing the version releases almost automatically. @huard have you worked with this tool? AFAIK, the versions are hard coded in the following places:

  • setup.cfg
  • setup.py
  • __init__.py

This shouldn't be too cumbersome to manage. We'll simply need to add the method we agree on to the contributing.rst and bump when needed.
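For reference, a possible bumpversion configuration covering those three files (illustrative setup.cfg fragment, assuming the tool reads its config from there):

```ini
[bumpversion]
current_version = 0.1.0
commit = True
tag = True

[bumpversion:file:setup.py]
[bumpversion:file:setup.cfg]
[bumpversion:file:xclim/__init__.py]
```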

Indices organization

At the moment there are indices in hydro, icclim and xclim, which is going to be confusing.

I suggest we put all indices in xclim.py in alphabetical order, and create "fake" modules like icclim to aggregate them into thematic packages.

To be clear, icclim.py would only import indices from xclim.

Developing (with Python 2/3 Compatibility in mind)

@tlogan2000 and I were talking about this and we're wondering how best to proceed.

As mentioned previously, Python2 will be end-of-life soon. Having the library compatible with both for the time being is feasible but that shouldn't mean that we shy away from using Python3 advancements (e.g. F-strings and AsyncIO).

At the moment, we're aiming to use a fair amount of abstraction libraries (xarray, numpy) that will handle most of the errors that can be introduced by Python2/3 math issues (integer division) but we should consider using Python2/3 compatible syntax where possible. So long as we follow good practices and make use of libraries that can convert code base from Py2->2/3 and Py3->2/3, we should be okay.

Suggestion for Contribution guidelines:

  • When adding a new module or set of algorithms, create a branch of the library, open a Pull Request, and ask someone to review changes before merging to master;
  • if the additions are written in Python 2, install and run python-futurize (not python-modernize, as we want to make use of more Python 3 features) or;
  • if the additions are written in Python 3, install and run python-pasteurize;
  • try installing/running the library in conda environments for both Py2 and Py3 (Travis CI is set to automatically build the Pull Request when created, so you can opt to simply push changes to your branch after running the Python 2/3 compatibility library);
  • Comment/document any known errors in the code or on GitHub so others can jump where needed.

Heat wave frequency seems ambiguous

Our current implementation of heat wave frequency counts the number of events. So if we have a heat wave of 20 consecutive days, it returns 1, while if there is a cold day in between, it will return 2. I think there is high potential for confusion. So my question is: where does this index come from? Is this the way it has been implemented?

Indice Checklist for xclim 1.0 Milestone

In reference to this spreadsheet of ICCLIM indicators and building off of #1

For short term indicator development tracking: Indicator Development Priorities

When a function has satisfied the following parameters, please check them off. An indicator needs:

  • To be coded and placed within an appropriate file (indices.py or others);
  • With a clear docstring that contains parameters;
  • Containing tests (pytests or unittests) that test for desired and undesired behaviour (e.g. negative degree days).

Indicators are as follows:

  • CDD
  • CFD
  • CSDI
  • CSU
  • CW
  • CWD
  • ETR
  • FD
  • GD4
  • HD17
  • ID
  • R10mm
  • R20mm
  • R75p
  • R75pTOT
  • R95p
  • R95pTOT
  • R99p
  • R99pTOT
  • RR1
  • RX1day
  • RX5day
  • SD1
  • SD50cm
  • SD5cm
  • SDII
  • SU
  • TG10p
  • TG90p
  • TN10p
  • TN90p
  • TNn
  • TNx
  • TR
  • TX10p
  • TX90p
  • TXn
  • TXx
  • vDTR
  • WD
  • WSDI
  • WW
  • CD
  • DTR
  • PRCPTOT
  • SD
  • TG
  • TN
  • TX
  • GSL

Please check the boxes or remove them as progress and priorities shift for xclim 1.0.

Create a dictionary of cell methods for indices

Within the variable attributes of NetCDF data, cell methods are a CF-standard convention that helps tools and users better understand pixel values. AFAIK, incorrect cell methods constitute a critical error for PAVICS, among other platforms reading these datasets.

Indices should be set to update the cell methods for each transformation/calculation/aggregation/etc. performed. e.g.:

  • "time: mean over days" --> "time: mean over season" / "time: mean over year"
  • "time: sum over hour" --> "time: mean over year", etc.

One possible solution is to create dictionary of fields that can be set to update when algorithms are called. These definitions could exist in a separate file (e.g. attrs.py).

Solutions for this issue should ideally be combined with #43
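A minimal sketch of such a dictionary (the cell_methods strings below are illustrative, not a vetted CF mapping):

```python
# Sketch of a lookup that could live in a separate file (e.g. attrs.py):
# each index name maps to the cell_methods string its output should carry.
CELL_METHODS = {
    "tg_mean": "time: mean within days time: mean over days",
    "tx_max": "time: maximum within days time: maximum over days",
    "prcptot": "time: sum within days time: sum over days",
}

def update_cell_methods(attrs, index_name):
    """Return a copy of an attrs dict with cell_methods set for the index."""
    out = dict(attrs)
    out["cell_methods"] = CELL_METHODS[index_name]
    return out

attrs = update_cell_methods({"units": "K"}, "tg_mean")
```

Each indicator would call `update_cell_methods` on its output attrs, keeping the strings in one reviewable place.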

Documentation

For each index, I think we'll want a short textual description, a longer explanation including the mathematical formula, an example, and references to papers either defining the indicator or using it.

Percentile threshold indices : Frequencies?

For indices such as 'R95p' (with the following definition taken from icclim)

  • Days with RR > 95th percentile of daily amounts (very wet days)
    (days)

  • Let RRwj be the daily precipitation amount at wet day w (RR ≥ 1.0 mm)
    of period j and let RRwn95 be the 95th percentile of precipitation at wet
    days in the 1961–1990 period. Then counted is the number of days where:

  • RRwj > RRwn95

Does the definition of RRwn95 (the percentile threshold value calculated from the reference period) change with the frequency? I.e., if the R95p resampling value is seasonal, does the RRwn95 value get calculated seasonally, or is the percentile threshold always annual? The ICCLIM docs are unclear on this.
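For reference, computing RRwn95 both ways is straightforward in xarray (synthetic data; this sketch does not settle which definition ICCLIM intends):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily precipitation (mm) over the 1961-1990 reference period.
rng = np.random.default_rng(0)
time = pd.date_range("1961-01-01", "1990-12-31", freq="D")
pr = xr.DataArray(rng.gamma(0.8, 4.0, time.size),
                  coords={"time": time}, dims="time")

wet = pr.where(pr >= 1.0)  # wet days only (RR >= 1.0 mm)

# One annual threshold over the whole reference period...
rrwn95_annual = wet.quantile(0.95, dim="time")

# ...versus one threshold per season over the same reference period.
rrwn95_seasonal = wet.groupby("time.season").quantile(0.95, dim="time")
```

Comparing the two thresholds on real data would show how much the choice matters before we commit to either reading of the definition.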

xr.to_netcdf export changes 'dtype' after summer days calculation

I was creating test data sets for the CCSC portal when I came across this. Can someone confirm they get the same error?

description:

  • Calculate summer days (su) (Note using the daily downsampler for now as test datasets have different calendars)
  • Manually add attributes before export
  • data type of 'su' before export is float64 : figure / map of values ok
  • export using to_netcdf
  • 'su' data type of newly exported file is 'timedelta64[ns]'
  • values are large integers (e.g. days since or similar)

Code to reproduce using nrcan daily test data

from xclim.utils import daily_downsampler as dds
import xarray as xr
import os
import numpy as np

xr.set_options(enable_cftimeindex=True)

infile = '/home/travis/PAVICSdev/XCLIMdev/github/tests/testdata/NRCANdaily/nrcan_canada_daily_tasmax_1990.nc'

freqs = ['MS']  # , 'QS-DEC', 'YS']
ds = xr.open_dataset(infile)
su = (ds.tasmax > 25.0+273.15) * 1.0
for f in freqs:

    grouper = dds(su, f)
    output = grouper.sum(dim='time')
    time1 = dds(ds.time, freq=f).first()

    ds.coords['counter'] = ds.time
    ds.counter.values = np.ones(ds.time.shape)

    output.coords['time'] = ('tags', time1.values)
    output = output.swap_dims({'tags': 'time'})
    output = output.sortby('time')
    output.attrs = ds.tasmax.attrs
    output.attrs['units'] = 'days'
    output.attrs['standard_name'] = 'summer_days'
    output.attrs['long_name'] = 'summer days'
    output.attrs['description'] = 'Number of days where daily maximum temperature exceeds 25℃'
    output = output.drop(['tags'])

    # use original file as template
    ds1 = xr.open_dataset(infile, drop_variables=['tasmax', 'time','time_vectors','ts'])
    ds1.coords['time'] = output.time.values
    ds1['su'] = output

    comp = dict(zlib=True, complevel=5)
    encoding = {var: comp for var in ds1.data_vars}
    print(os.path.basename(infile).replace('.nc', '_SummerDays-' + f) + ' : writing ' + f + ' to netcdf')
    # with dask.config.set(pool=ThreadPool(4)):

    ds1.to_netcdf('~/testNRCANsummerdays.nc', format='NETCDF4', encoding=encoding)

    ds2 = xr.open_dataset('~/testNRCANsummerdays.nc')
    print(ds1)
    print(ds2)
    print(ds1.su.max())
    print(ds2.su.max())
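The dtype change is xarray's CF decoding at work: on read, a variable whose units attribute is a time unit like "days" is decoded as a timedelta. A small sketch reproducing this without file I/O (decode_timedelta is a real xarray option, though the fix chosen for xclim may differ):

```python
import xarray as xr

# A float count variable whose units attribute is "days".
raw = xr.Dataset({"su": ("time", [5.0, 7.0, 3.0], {"units": "days"})})

# CF decoding (what happens on open_dataset) turns it into timedelta64.
decoded = xr.decode_cf(raw, decode_timedelta=True)

# Disabling timedelta decoding keeps the counts numeric.
kept = xr.decode_cf(raw, decode_timedelta=False)
```

So either re-open the exported file with `decode_timedelta=False`, or store the count with a units string that CF does not treat as a time unit.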

Testing Procedures

I stumbled upon a much better way to perform tests on local machines. I didn't realize that tox could be run locally. tox is a virtualenv-based testing environment that the Travis CI runs to verify build success. By ensuring that you have tox installed in your system python (for Ubuntu: apt-get install tox) as well as within your conda environment (conda install tox), running tox as a Python Testing configuration (in PyCharm) will trigger the build-tests on all available Python environments for your machine. An example:


Error reports are divided by build, so you'll be able to see where your errors are and which builds they are on. For reference, the configured builds are py27, py34, py35, py36, flake8, and docs.

Document development steps

Add to the documentation the steps to follow in climate indicator development:

  1. GitHub clone, branch
  2. Create unit tests
  3. Push, open a pull request

Standard_name check

Description

Make sure that standard_name for our output is on the official list.

Update NetCDF History information

One of the useful things that shell utilities like ncks and cdo do is append to the history attribute of output netCDF files the processes that have been performed on them. For example:

:Conventions = "CF-1.6" ;
		:history = "Sun Jul 29 07:19:00 2018: cdo -O -f nc sellonlatbox,-96,-72,39,52 /<servername>/smith/scripts/eccc_07132018/crcm5_data/bbw/psl/psl_bbw_199701_se.nc /<servername>/smith/scripts/eccc_07132018/cdo_subset/bbw/psl/psl_bbw_199701_se_ECCCsubset.nc\n"

It would be good to add a function that on exit or on write success adds a history entry that specifies:

  • The date that the file was modified
  • The xclim function performed
  • The filepaths (I/O)
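A minimal sketch of such a helper (hypothetical function name and entry format, loosely imitating the nco/cdo style above):

```python
import datetime

def update_history(attrs, operation, infile, outfile):
    """Sketch: prepend an nco/cdo-style entry to a netCDF history attribute,
    keeping any previous history below it."""
    stamp = datetime.datetime.now().strftime("%a %b %d %H:%M:%S %Y")
    entry = f"{stamp}: xclim {operation} {infile} {outfile}"
    previous = attrs.get("history", "")
    out = dict(attrs)
    out["history"] = entry + ("\n" + previous if previous else "")
    return out

attrs = update_history({"history": "created"}, "tg_mean", "in.nc", "out.nc")
```

Called just before `to_netcdf`, this covers the date, the xclim function, and the I/O paths in one line per operation.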

DataArray attributes

Description

When coding index functions using the daily_downsampler, make sure to use keep_attrs=True or to create appropriate attrs (minimally suggested: units, long_name, standard_name) as an ordered dictionary.

Example:
from xclim.utils import daily_downsampler as dds

grouper = dds(da, freq=freq)
output = grouper.max(dim='time', keep_attrs=True)

In cases where the index calculation results in a change of units (or both units and name attrs), they should be modified (as an ordered dictionary) before returning the DataArray.

E.g. Calculation of growing season length from temperature data would change all attributes (units, long_name, standard_name)

Required units for index computations

Description

To support unit conversion (with ocgis or not), we'll need functions to be able to tell which units they are expecting (a required_units attribute).

This underlines again the tension between having simple basic index functions and all the bells and whistles (output metadata, CF compliance, unit checks) that are necessary to integrate these functions into WPS processes and scientific workflows.

So far, we've been able to put all this into decorators, but I'm not sure it will scale well.
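The simplest version of this is a decorator that just attaches the expected units to the function for callers to inspect; a sketch (hypothetical names, not xclim's API):

```python
def declare_units(required_units):
    """Hypothetical decorator attaching a required_units attribute that a
    workflow engine (or a units-conversion wrapper) could inspect before
    calling the index function."""
    def decorator(func):
        func.required_units = required_units
        return func
    return decorator

@declare_units({"pr": "mm/day"})
def max_1day_precip(pr):
    """Toy index: maximum daily precipitation."""
    return max(pr)

units = max_1day_precip.required_units
```

Because the metadata rides on the function object, the basic index stays a plain callable and the scaling question moves to whoever reads `required_units`.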
