CMIP6 DREQ interfaces

djq

djq is the DREQ JSON Query tool. It lets you query the DREQ for variable mappings, and it also provides an interface for exploring algorithms which implement such mappings. You hand it requests specifying MIPs and experiments within those MIPs, and it replies with lists of variables. You should be able to install it from its setup.py. It requires the dreqPy package from the DREQ (but is not fussy about the version) and nose to run tests. It also needs access to an SVN checkout of the DREQ itself: you will almost certainly need to tell it where this is.

Because the mapping from MIPs and experiments to variables is not very well-defined by the DREQ, djq also provides a simple interface which allows you to define your own mapping function. This function doesn't need to know anything about djq other than how it is called, and it can be loaded dynamically from a module at run time, so no modifications to djq are needed in order to provide an alternative mapping function. You can specify which function or module to use both from the Python API and from the command line. Multiple such functions / modules can exist concurrently. These functions / modules are called 'implementations' in the code.
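The general technique of loading a function from a module at run time can be sketched as below. This is only an illustration: `load_mapping_function` and its arguments are hypothetical names, not part of djq, and the way djq actually locates and calls implementations may well differ.

```python
import importlib


def load_mapping_function(module_name, function_name):
    """Import module_name and return the callable called function_name.

    This loader and its argument names are hypothetical: they show the
    general technique, not how djq itself finds implementations.
    """
    module = importlib.import_module(module_name)
    function = getattr(module, function_name, None)
    if not callable(function):
        raise TypeError("%s.%s is not callable" % (module_name, function_name))
    return function


# A standard-library function stands in for a real mapping function here.
stand_in = load_mapping_function("posixpath", "basename")
print(stand_in("/some/dreq/checkout"))
```

Because the module is named by a string, the caller can swap in a different module without any change to the code that calls the function.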

Once the set of variables is computed, it needs to be elaborated in various ways before being turned into JSON. This is done by a 'JSONifier', and these are also components which can be plugged in to djq dynamically.

A tool is included, cci, which allows you to directly compare implementations.

There are four command line programs associated with djq.

  • djq is the main thing: you can use it to make queries and get answers;
  • cci compares implementations -- you can give it the names of one or two implementation modules and it will tell you whether they differ in the variables they compute, giving a metric of similarity going from 0.0 (completely different) to 1.0 (identical);
  • all-requests is a little program which reads the DREQ, and then generates a request for every experiment in every MIP, which can be fed to djq to run a really comprehensive set of queries;
  • scatter-replies is another little program which will read a set of replies from djq and spit them out into files named after the MIPs and experiments.

As an example, cci can compare the two implementations bundled with djq. These correspond to what Martin described in his document and to what he used to generate his spreadsheets, both as they stood when djq was originally written. The results vary from 0.0 to 1.0 for different pairs of MIP and experiment.
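The exact metric cci uses is not described here. As a rough sketch of one metric with the same end points, the Jaccard index of two sets of variable names is 0.0 when they share nothing and 1.0 when they are identical. The function name and the variable names below are made up for illustration, and this is an assumption rather than cci's actual calculation.

```python
def similarity(variables_a, variables_b):
    """Jaccard index of two collections of variable names.

    An assumed stand-in for cci's metric, not a copy of it.
    """
    a, b = set(variables_a), set(variables_b)
    if not a and not b:
        # Two empty results are treated as identical.
        return 1.0
    return len(a & b) / float(len(a | b))


print(similarity(["tas", "pr"], ["tas", "pr"]))  # identical sets
print(similarity(["tas", "pr"], ["tas"]))        # partial overlap
print(similarity(["tas"], ["pr"]))               # nothing in common
```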

dqi

dqi is the DREQ Query Interface. You do not need this to use djq, although some djq back ends may need it (they do not currently).

small

small contains some small, more-or-less ad-hoc programs which are related to the CMIP6 DREQ. These are almost entirely undocumented and may or may not work.

See the change log, which contains at least an entry for each release, and often also any changes which matter since the most recent release.

Pointers

References

Builds and tests

If you use Travis CI, you should be able to persuade it to run the tests: look at .travis.yml. The tests previously worked, but Travis CI needs access to the repo to run them, and it doesn't have access to the Met Office repo (nor should it, I think).

The tests should pass unless there are serious bugs in djq itself. The test suite formerly included a hairy sanity test which compared what djq computes against spreadsheets included with the DREQ. Unfortunately the DREQ is so unstable that this comparison essentially never passed, so it is no longer included in the Travis CI tests.

Browsing the documentation

All the documentation is in Markdown (and specifically GitHub flavoured Markdown): this should be fairly readable as plain text. README.md files are the entry points, and any extended documentation is in subdirectories called doc.

You can use grip to view the pretty version of the documentation locally. It can be installed with pip:

$ pip install grip
[...]
$ grip
 * Running on http://localhost:6419/ (Press CTRL+C to quit)

grip -b is also useful (opens a browser tab).

See its documentation. Note that grip works by using GitHub's API to format the markdown files, and so sends their content to GitHub: it's not suitable for sensitive data.


© British Crown Copyright 2016, 2018, Met Office. See LICENSE.md for license details.

cmip6-dreq-interface's Issues

Publishing administrivia

  • Sort out with MO people if I can / should.
  • Decide where it should live (my repo? probably not):
    • Rehost it if need be.
    • Sort out MO membership & groups
  • License and copyright:
    • suitable copyright headers;
    • decide on a license;
    • add it to the top-level with suitable pointers.

JSONify sorting should be more clever

Currently the output of a JSONifier gets sorted based on a 'label' in the dicts it returns. But alternative JSONifiers might not return suitable dicts. So one of the following should be done:

  • it should be documented that they must do so;
  • the thing should check and not sort if it can't find a label field;
  • there should be a way of specifying what the key is and whether to sort.

I think one of the first or second options is probably best: the third would be a DTRT choice but would make the implementation stupidly hairy.
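The second option in the list above could look something like the sketch below, which sorts only when every entry is a dict with the key in question and otherwise returns the entries in their original order. `sort_if_labelled` is a hypothetical helper, not a function from djq, and the entries are invented.

```python
def sort_if_labelled(entries, key="label"):
    """Sort entries by key if every entry is a dict with that key.

    Otherwise return them unsorted, in their original order.
    Hypothetical helper: not part of djq.
    """
    entries = list(entries)
    if all(isinstance(entry, dict) and key in entry for entry in entries):
        return sorted(entries, key=lambda entry: entry[key])
    return entries


labelled = [{"label": "tas"}, {"label": "pr"}]
mixed = [{"label": "tas"}, {"name": "pr"}]
print(sort_if_labelled(labelled))  # sorted by label
print(sort_if_labelled(mixed))     # left alone: one entry has no label
```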

Simplify / reduce the use of fluids (dynamic scope) in djq

The program makes quite a lot of use of things it calls 'fluids', which implement dynamically-scoped bindings. While they have all the advantages of dynamically-scoped bindings over globals, they also have all the disadvantages of them over lexically-scoped bindings. I need to understand which ones are needed and simplify / remove the ones that aren't. I suspect that the root / tag / path (new) fluids should go.

This is quite a big change.
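For readers unfamiliar with the idea, a dynamically-scoped binding can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `Fluid` class, its methods and the `verbosity` example are all hypothetical, this is not djq's fluid implementation, and the sketch is not thread-safe.

```python
from contextlib import contextmanager


class Fluid(object):
    """A value which can be rebound for the duration of a with block."""

    def __init__(self, default):
        self._stack = [default]

    def get(self):
        # The innermost active binding wins.
        return self._stack[-1]

    @contextmanager
    def bound_to(self, value):
        self._stack.append(value)
        try:
            yield
        finally:
            # Restore the previous binding even if the body raised.
            self._stack.pop()


verbosity = Fluid(0)


def report():
    # report() sees whatever binding its caller established.
    return verbosity.get()


with verbosity.bound_to(2):
    inside = report()
outside = report()
print(inside, outside)
```

The disadvantage mentioned above is visible even here: nothing in the signature of `report` says that its result depends on `verbosity`.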

Initial release preparation meta

Things that need to be done

  • Clean up branches at MO: dqi-development needs to go or be recreated at least.
  • Release notes
  • Merge to master and push master to GH and hence home.
  • Decide on a release tagging scheme: if it's not YYYYMMDD then this needs to be justified.
  • Tag it.
  • Push.

Some of these mean that the release can't happen until I can fix things on the MO repo: so it will need to be Monday, not Friday.

Tools for generating flat files

There need to be two scripts to do this:

  • a script to create a request which includes every experiment of every MIP;
  • a script which will take a single big response to such a request and spit it out into flat files, one per single request.
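A rough sketch of both scripts follows. It assumes, purely for illustration, that a request is a dict with `mip` and `experiment` keys and that each reply carries the same two keys. The function names, the `variables` key, the file naming scheme and the sample MIP and experiment names are all hypothetical, and the real request and reply formats may differ.

```python
import json
import os
import tempfile


def all_requests(experiments_by_mip):
    """Return one request dict per (MIP, experiment) pair, in sorted order."""
    return [{"mip": mip, "experiment": experiment}
            for mip in sorted(experiments_by_mip)
            for experiment in sorted(experiments_by_mip[mip])]


def scatter(replies, directory):
    """Write each reply to its own JSON file and return the paths written."""
    paths = []
    for reply in replies:
        # Assumed naming scheme: <mip>-<experiment>.json
        name = "%s-%s.json" % (reply["mip"], reply["experiment"])
        path = os.path.join(directory, name)
        with open(path, "w") as out:
            json.dump(reply, out, indent=2, sort_keys=True)
        paths.append(path)
    return paths


requests = all_requests({"DECK": ["piControl", "amip"]})
# Pretend each reply simply echoes its request plus an empty variable list.
replies = [dict(request, variables=[]) for request in requests]
written = scatter(replies, tempfile.mkdtemp())
print(written)
```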

Documentation for initial release

This is everything that needs to be done so that other people could (perhaps) understand it.

  • Document the sorting behaviour of JSONifiers (this is #2).
  • Provide provenance for the implementations.
  • Documentation (at least some) for command line tools.
  • Python sample documentation
  • Read it reasonably carefully.

Additionally any other cleanups that are needed.

JSONify checks should be per implementation

Currently there's just one big check tree for JSONifiers, which means that you can't write simple ones which don't pass the checks.

This should be fixed by making checkers per-implementation.
