
esmvaltool_sample_data's People

Contributors

bouweandela, stefsmeets

esmvaltool_sample_data's Issues

Support tests for multimodel statistics preprocessor function

To support testing the multimodel statistics preprocessor function with the data from this repository, the following still needs to be done:

Importable data loader #3:

  • A function for loading the data conveniently, e.g. load_timeseries() which returns a cubelist (use cube_helper.load for loading the data?)
  • Make installable module esmvaltool_sample_data

Download script in #4:

  • Replace the . in the paths with a / for all subdirectories of esmvaltool_sample_data/data/timeseries
  • Download all timeseries datasets, but only save small ones (e.g. skip AWI data for now, because they save one file per year and this becomes big even with subsetting)
  • Improve the download script so that subsetting selects the same geographical regions and vertical levels for every dataset, instead of just slicing the first 2 steps
  • Also select an ocean variable for subsetting (item 2 in #1), e.g. with a daily frequency (if multimodel statistics supports it); this needs proper support from iris first
  • Clean up the script for downloading data (now in esmvaltool_sample_data/sample_data.py)
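
The first item in this list, replacing the . separators with /, could be sketched roughly as follows. This assumes the subdirectory names are dot-separated dataset identifiers, with one facet per dot; that naming scheme and the helper name dataset_id_to_path are assumptions for illustration and are not taken from the download script.

```python
from pathlib import PurePosixPath


def dataset_id_to_path(dataset_id: str) -> PurePosixPath:
    """Turn a dot-separated dataset id into a nested relative path.

    Hypothetical helper: assumes that no facet contains a dot itself.
    """
    return PurePosixPath(*dataset_id.split("."))


# Assumed example id, built from facets that appear in this repository.
example = "CMIP6.CMIP.CCCma.CanESM5.historical.r1i1p1f1.Amon.ta.gn.v20190429"
print(dataset_id_to_path(example))
```

If a facet can contain a dot, a plain split would break the path apart in the wrong place, so that case would need separate handling.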

Add metadata #11:

  • Add license information
  • Improve README
  • Improve CONTRIBUTING
  • Update setup.py

Other:

  • #9 Rename download script to download_sample_data.py and move it to root

Populate repository with data

Here is the list of datasets I think would be nice to have. All are from CMIP6, because it has a good license, and should include all models that provide the variable:

  • Timeseries data (#4)

    • ta / Amon / historical / r1i1p1f1, any grid, 1850 onwards, all dimensions reduced to a few steps except for the time dimension
    • some other variable / ocean, probably a different frequency, similar number of timesteps, other dimensions reduced
  • Map data

    • a 4D atmospheric variable, all dimensions reduced to a few steps except the horizontal dimension(s)
    • same for an ocean variable
  • Profile data

    • a 4D atmospheric variable, all dimensions reduced to a few steps except the vertical dimension(s)
    • same for an ocean variable

To get a lot of variation, it might be nice to use different variables/experiments/frequencies for all 6 sets listed above.

Some datasets have deviating plev coordinates

A few datasets have deviating vertical coordinates of [100000.00000001 92500.00000001]:

./esmvaltool_sample_data/data/timeseries/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r1i1p1f1/Amon/ta/gn/v20191115       
./esmvaltool_sample_data/data/timeseries/CMIP6/CMIP/CSIRO-ARCCSS/ACCESS-CM2/historical/r1i1p1f1/Amon/ta/gn/v20191108

and [100000.00000001 85000.00000001]:

./esmvaltool_sample_data/data/timeseries/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r1i1p1f1/day/ta/gn/v20191115
./esmvaltool_sample_data/data/timeseries/CMIP6/CMIP/CSIRO-ARCCSS/ACCESS-CM2/historical/r1i1p1f1/day/ta/gn/v20191108

Addressing this issue is needed for:
ESMValGroup/ESMValCore#956
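
One possible way to handle such near-identical levels is to snap each value to a nominal level when the two agree within a small tolerance. The sketch below is only an illustration in plain Python: the function name snap_levels, the list of nominal levels, and the tolerance are all assumptions, and this is not necessarily how the issue was or should be resolved.

```python
import math


def snap_levels(levels, nominal, rel_tol=1e-9):
    """Replace each level by the nominal value it matches within rel_tol."""
    snapped = []
    for level in levels:
        match = next(
            (n for n in nominal if math.isclose(level, n, rel_tol=rel_tol)),
            level,  # values without a nominal match are left untouched
        )
        snapped.append(match)
    return snapped


# Hypothetical nominal pressure levels in Pa.
NOMINAL = [100000.0, 92500.0, 85000.0]
print(snap_levels([100000.00000001, 92500.00000001], NOMINAL))
```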

Some datasets have multiple versions

Some datasets have multiple versions, like the one below.

│   │               ├── CCCma
│   │               │   └── CanESM5
│   │               │       └── historical
│   │               │           └── r1i1p1f1
│   │               │               ├── Amon
│   │               │               │   └── ta
│   │               │               │       └── gn
│   │               │               │           ├── v20190306
│   │               │               │           └── v20190429
│   │               │               └── day
│   │               │                   └── ta
│   │               │                       └── gn
│   │               │                           ├── v20190306
│   │               │                           └── v20190429
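
If only one version per dataset should be kept, one option is to select the most recent version directory. The sketch below assumes that this is the desired behaviour and that version names follow the vYYYYMMDD pattern shown in the tree; the function name latest_versions is hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def latest_versions(version_dirs):
    """Keep only the newest version directory of each dataset."""
    by_dataset = defaultdict(list)
    for path in map(PurePosixPath, version_dirs):
        # Everything above the version directory identifies the dataset.
        by_dataset[path.parent].append(path.name)
    # Names like v20190429 have a fixed width, so the lexicographic
    # maximum is also the most recent date.
    return sorted(
        str(parent / max(names)) for parent, names in by_dataset.items()
    )


dirs = [
    "CCCma/CanESM5/historical/r1i1p1f1/Amon/ta/gn/v20190306",
    "CCCma/CanESM5/historical/r1i1p1f1/Amon/ta/gn/v20190429",
    "CCCma/CanESM5/historical/r1i1p1f1/day/ta/gn/v20190306",
    "CCCma/CanESM5/historical/r1i1p1f1/day/ta/gn/v20190429",
]
print(latest_versions(dirs))
```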

Add unit tests for loader functions

I use this little snippet for testing the data in the __init__.py file. It would be a good idea to build some basic unit tests around it.

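    import iris.analysis

    VERBOSE = True

    # load_timeseries_cubes is assumed to be available in this module.
    for mip_table in (
            'Amon',
            'day',
    ):
        print()
        print(f'Loading `{mip_table}`')
        ts = load_timeseries_cubes(mip_table)

        # Regrid every cube onto the grid of the first cube. The result is
        # discarded; the loop only checks that regridding does not fail.
        first_cube = ts[0]
        for i, cube in enumerate(ts):
            print(i)
            cube.regrid(grid=first_cube, scheme=iris.analysis.Linear())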

How to use this dataset in ESMValTool projects?

The README gives some hints on using this dataset in your ESMValTool projects by adding the rootpath to config-user.yml.

The instructions are unclear and incomplete. The drs specification seems to be BADC / ETHZ, but I am unable to get it working following the instructions.

Dataset contains very large unmasked value

One of the datasets has a very large unmasked value (5.813931e+36), which causes issues with the multimodel statistics.

# Loading #15: esmvaltool_sample_data/data/timeseries/CMIP6/CMIP/E3SM-Project/E3SM-1-1/historical/r1i1p1f1/Amon/ta/gr/v20191211
#     cube.shape=(780, 2, 2, 2) cube.data.min()=235.27495 cube.data.max()=5.813931e+36
#     cube.coord("time").units.calendar='365_day'
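
To illustrate why a single value like this matters, and one way it might be handled, the plain-Python sketch below masks values above a plausibility limit before averaging. The limit of 1e20 and both function names are assumptions made for this example; whether the value is a missing fill value or a genuine data error is not established here.

```python
from statistics import mean


def mask_implausible(values, limit=1e20):
    """Replace values whose magnitude exceeds limit by None."""
    return [None if abs(v) > limit else v for v in values]


def masked_mean(values):
    """Mean of the entries that are not None, or None if there are none."""
    valid = [v for v in values if v is not None]
    return mean(valid) if valid else None


# Minimum and maximum reported above, plus one ordinary value.
raw = [235.27495, 240.0, 5.813931e+36]
print(mean(raw))                           # dominated by the outlier
print(masked_mean(mask_implausible(raw)))  # outlier ignored
```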

Datasets have different calendars / shapes

The datasets have different lengths of the time axis, different calendars, and different horizontal axes. This makes them difficult to group and work with.

  • CMIP6/CMIP/NIMS-KMA/KACE-1-0-G/historical/r1i1p1f1/Amon/ta/gr/v20191028
    shape: (1980, 2, 2, 1), calendar: 360_day
  • CMIP6/CMIP/KIOST/KIOST-ESM/historical/r1i1p1f1/Amon/ta/gr1/v20191104
    shape: (1980, 2, 2, 1), calendar: 365_day
  • CMIP6/CMIP/NCAR/CESM2-WACCM/historical/r1i1p1f1/Amon/ta/gn/v20190227
    shape: (1980, 2, 3, 2), calendar: 365_day
  • CMIP6/CMIP/NASA-GISS/GISS-E2-1-H/historical/r1i1p1f1/Amon/ta/gn/v20190403
    shape: (1368, 2, 1, 1), calendar: 365_day
  • CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r1i1p1f1/Amon/ta/gn/v20191115
    shape: (780, 2, 2, 1), calendar: proleptic_gregorian
  • CMIP6/CMIP/NOAA-GFDL/GFDL-ESM4/historical/r1i1p1f1/Amon/ta/gr1/v20190726
    shape: (780, 2, 2, 2), calendar: 365_day
  • CMIP6/CMIP/CMCC/CMCC-CM2-SR5/historical/r1i1p1f1/Amon/ta/gn/v20200616
    shape: (780, 2, 3, 2), calendar: 365_day
  • CMIP6/CMIP/CCCma/CanESM5/historical/r1i1p1f1/Amon/ta/gn/v20190429
    shape: (1980, 2, 2, 1), calendar: 365_day
  • CMIP6/CMIP/BCC/BCC-CSM2-MR/historical/r1i1p1f1/Amon/ta/gn/v20181126
    shape: (1020, 2, 2, 2), calendar: 365_day
  • CMIP6/CMIP/INM/INM-CM4-8/historical/r1i1p1f1/Amon/ta/gr1/v20190605
    shape: (780, 2, 1, 2), calendar: 365_day
  • CMIP6/CMIP/NCC/NorESM2-MM/historical/r1i1p1f1/Amon/ta/gn/v20191108
    shape: (780, 2, 3, 2), calendar: 365_day
  • CMIP6/CMIP/E3SM-Project/E3SM-1-1/historical/r1i1p1f1/Amon/ta/gr/v20191211
    shape: (780, 2, 2, 2), calendar: 365_day
  • CMIP6/CMIP/CAS/FGOALS-f3-L/historical/r1i1p1f1/Amon/ta/gr/v20190927
    shape: (780, 2, 2, 2), calendar: 365_day
  • CMIP6/CMIP/NASA-GISS/GISS-E2-1-G/historical/r1i1p1f1/Amon/ta/gn/v20180827
    shape: (1368, 2, 1, 1), calendar: 365_day
  • CMIP6/CMIP/MPI-M/MPI-ESM1-2-HR/historical/r1i1p1f1/Amon/ta/gn/v20190710
    shape: (780, 2, 2, 3), calendar: proleptic_gregorian
  • CMIP6/CMIP/AS-RCEC/TaiESM1/historical/r1i1p1f1/Amon/ta/gn/v20200623
    shape: (1980, 2, 3, 2), calendar: 365_day
  • CMIP6/CMIP/INM/INM-CM5-0/historical/r1i1p1f1/Amon/ta/gr1/v20190610
    shape: (780, 2, 1, 2), calendar: 365_day
  • CMIP6/CMIP/NASA-GISS/GISS-E2-1-G-CC/historical/r1i1p1f1/Amon/ta/gn/v20190815
    shape: (1368, 2, 1, 1), calendar: 365_day
  • CMIP6/CMIP/CAS/FGOALS-g3/historical/r1i1p1f1/Amon/ta/gn/v20190818
    shape: (804, 2, 1, 2), calendar: 365_day
  • CMIP6/CMIP/NUIST/NESM3/historical/r1i1p1f1/Amon/ta/gn/v20190630
    shape: (1980, 2, 1, 2), calendar: gregorian
  • CMIP6/CMIP/CAMS/CAMS-CSM1-0/historical/r1i1p1f1/Amon/ta/gn/v20190708
    shape: (900, 2, 2, 2), calendar: 365_day
  • CMIP6/CMIP/CMCC/CMCC-CM2-HR4/historical/r1i1p1f1/Amon/ta/gn/v20200904
    shape: (780, 2, 3, 2), calendar: 365_day
  • CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/ta/gn/v20190308
    shape: (1980, 2, 3, 2), calendar: 365_day
  • CMIP6/CMIP/CCCR-IITM/IITM-ESM/historical/r1i1p1f1/Amon/ta/gn/v20191226
    shape: (780, 2, 1, 2), calendar: julian
  • CMIP6/CMIP/E3SM-Project/E3SM-1-0/historical/r1i1p1f1/Amon/ta/gr/v20191220
    shape: (780, 2, 2, 2), calendar: 365_day
  • CMIP6/CMIP/AWI/AWI-CM-1-1-MR/historical/r1i1p1f1/Amon/ta/gn/v20181218
    shape: (780, 2, 2, 3), calendar: proleptic_gregorian
  • CMIP6/CMIP/THU/CIESM/historical/r1i1p1f1/Amon/ta/gr/v20200417
    shape: (1980, 2, 3, 2), calendar: 365_day
  • CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/historical/r1i1p1f1/Amon/ta/gr1/v20180701
    shape: (780, 2, 2, 2), calendar: 365_day
  • CMIP6/CMIP/FIO-QLNM/FIO-ESM-2-0/historical/r1i1p1f1/Amon/ta/gn/v20191204
    shape: (780, 2, 3, 2), calendar: 365_day
  • CMIP6/CMIP/CCCma/CanESM5/historical/r1i1p1f1/Amon/ta/gn/v20190306
    shape: (1980, 2, 2, 1), calendar: 365_day
  • CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/Amon/ta/gr/v20180803
    shape: (1980, 2, 2, 1), calendar: gregorian
  • CMIP6/CMIP/NCC/NorCPM1/historical/r1i1p1f1/Amon/ta/gn/v20200724
    shape: (1980, 2, 2, 1), calendar: 365_day
  • CMIP6/CMIP/HAMMOZ-Consortium/MPI-ESM-1-2-HAM/historical/r1i1p1f1/Amon/ta/gn/v20190627
    shape: (780, 2, 1, 2), calendar: proleptic_gregorian
  • CMIP6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Amon/ta/gn/v20190710
    shape: (780, 2, 1, 2), calendar: proleptic_gregorian
  • CMIP6/CMIP/CAS/CAS-ESM2-0/historical/r1i1p1f1/Amon/ta/gn/v20200502
    shape: (1980, 2, 2, 2), calendar: 365_day
  • CMIP6/CMIP/MIROC/MIROC6/historical/r1i1p1f1/Amon/ta/gn/v20190311
    shape: (780, 2, 1, 2), calendar: gregorian
  • CMIP6/CMIP/MRI/MRI-ESM2-0/historical/r1i1p1f1/Amon/ta/gn/v20190308
    shape: (780, 2, 2, 2), calendar: proleptic_gregorian
  • CMIP6/CMIP/SNU/SAM0-UNICON/historical/r1i1p1f1/Amon/ta/gn/v20190323
    shape: (780, 2, 3, 2), calendar: 365_day
  • CMIP6/CMIP/NCC/NorESM2-LM/historical/r1i1p1f1/Amon/ta/gn/v20190815
    shape: (780, 2, 2, 1), calendar: 365_day
  • CMIP6/CMIP/E3SM-Project/E3SM-1-1-ECA/historical/r1i1p1f1/Amon/ta/gr/v20200624
    shape: (780, 2, 2, 2), calendar: 365_day
  • CMIP6/CMIP/BCC/BCC-ESM1/historical/r1i1p1f1/Amon/ta/gn/v20181217
    shape: (1980, 2, 2, 1), calendar: 365_day
  • CMIP6/CMIP/CSIRO-ARCCSS/ACCESS-CM2/historical/r1i1p1f1/Amon/ta/gn/v20191108
    shape: (780, 2, 2, 1), calendar: proleptic_gregorian
  • CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3/historical/r1i1p1f1/Amon/ta/gr/v20200310
    shape: (780, 2, 3, 3), calendar: proleptic_gregorian
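
A quick way to see which datasets can be combined directly is to group them by shape and calendar. The sketch below uses a few entries copied from the list above; the record layout and the function name group_by_shape_and_calendar are assumptions for illustration.

```python
from collections import defaultdict


def group_by_shape_and_calendar(records):
    """Map each (shape, calendar) pair to the datasets that share it."""
    groups = defaultdict(list)
    for name, shape, calendar in records:
        groups[(shape, calendar)].append(name)
    return dict(groups)


# A handful of entries taken from the list above.
records = [
    ("KACE-1-0-G", (1980, 2, 2, 1), "360_day"),
    ("CanESM5", (1980, 2, 2, 1), "365_day"),
    ("BCC-ESM1", (1980, 2, 2, 1), "365_day"),
    ("IPSL-CM6A-LR", (1980, 2, 2, 1), "gregorian"),
]

for key, names in group_by_shape_and_calendar(records).items():
    print(key, names)
```

Datasets that end up in the same group share both the array shape and the calendar, although their coordinate values may still differ.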
