Using Python on JASMIN: Webinar Example Scripts

A selection of scripts to serve as an example of some of the things that you can do on JASMIN.

These scripts have been tested against the example paths given below and will work with them. Because every dataset is structured differently, you will need to modify the code before using these scripts on other data.

They are intended as a starting point which you can adapt to your own work.

Creating the environment

These scripts need a newer version of Xarray, so to run them you will need to create a Python 3 virtual environment. For convenience, the repo includes a create-env.sh script which should make this process easy.

./create-env.sh

NOTE: you only need to run the above script once.

Setting the environment

Each time you log in to a new session and want to run any of the scripts, you will need to set up the environment with:

module load jaspy
source venv/bin/activate
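
If you are unsure whether the environment is active in the current session, a quick check from Python can help. This is a minimal sketch using only the standard library; the function name in_virtualenv is made up for illustration and is not part of the repo.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If this prints False, run the two commands above again before starting any of the scripts.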

Using Pandas to process CSV files

Pandas is a powerful library for creating and manipulating data tables. With Pandas you can easily read CSV files, process them and visualise the results.

This example uses rainfall data from the UK Met Office MIDAS Open dataset.

The headers are ignored and the data is read into a Pandas DataFrame.

Example path: /badc/ukmo-midas-open/data/uk-hourly-rain-obs/dataset-version-201908/oxfordshire/00605_brize-norton/qc-version-1

Usage:

python csv_pandas.py /badc/ukmo-midas-open/data/uk-hourly-rain-obs/dataset-version-201908/oxfordshire/00605_brize-norton/qc-version-1
usage: csv_pandas.py [-h] [-o OUTPUT] directory

Generate a plot of yearly, max, mean and min from a series of csv files in the midas open precipitation timeseries

positional arguments:
  directory             Directory containing csv files

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Directory to output the graph, defaults to the run directory. Default: [.]
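
The kind of processing described above can be sketched as follows. This is not the code from csv_pandas.py. It is a minimal illustration on a small synthetic CSV held in memory, and the column names (ob_end_time, prcp_amt), the number of header lines and the helper name read_rain_csv are all assumptions made for the example.

```python
import io

import pandas as pd

# Synthetic stand-in for one CSV file: three metadata lines, a column row,
# four data rows and a one-line footer. Column names are assumptions.
raw = """Conventions,G,BADC-CSV,1
title,G,synthetic example
data
ob_end_time,prcp_amt
2017-01-01 01:00:00,0.2
2017-06-01 01:00:00,1.4
2018-01-01 01:00:00,0.0
2018-06-01 01:00:00,3.0
end data"""


def read_rain_csv(handle, header_lines):
    """Read a CSV, skipping the metadata header lines and the one-line footer."""
    return pd.read_csv(
        handle,
        skiprows=header_lines,   # ignore the metadata block
        skipfooter=1,            # ignore the trailing "end data" line
        engine="python",         # skipfooter needs the Python parser
        parse_dates=["ob_end_time"],
        index_col="ob_end_time",
    )


df = read_rain_csv(io.StringIO(raw), header_lines=3)

# Yearly maximum, mean and minimum of the (assumed) rainfall column.
yearly = df["prcp_amt"].groupby(df.index.year).agg(["max", "mean", "min"])
print(yearly)
```

Calling yearly.plot() on the result would produce a line graph of the three statistics, provided matplotlib is installed in the environment.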

Using Xarray to extract timeseries from netCDF

Xarray can use Dask as a backend to parallelise operations and speed up the workflow. You can use Xarray to open NetCDF files, extract specific regions and process the data.

This example uses Xarray to read a timeseries of NetCDF files, extract the UK region and calculate the annual mean temperature for each grid box. The result is then written to a new NetCDF file.

Example path: /badc/cmip5/data/cmip5/output1/BCC/bcc-csm1-1/amip/mon/atmos/Amon/r1i1p1/latest/tas

Usage:

python netcdf_xarray.py /badc/cmip5/data/cmip5/output1/BCC/bcc-csm1-1/amip/mon/atmos/Amon/r1i1p1/latest/tas
usage: netcdf_xarray.py [-h] [-o OUTPUT] directory

Extract a time series of annual surface temperature over the UK

positional arguments:
  directory             Directory containing source files

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Directory to output the netcdf file, defaults to the run directory. Default [.]
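
The region extraction and annual mean can be sketched with a small synthetic array, so the example runs without access to the archive. This is not the code from netcdf_xarray.py. The variable name tas matches the example path, but the grid, the coordinate names (lat, lon) and the bounding box are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly field on a coarse global grid, covering two years.
time = pd.date_range("2000-01-01", periods=24, freq="MS")
lat = np.arange(-90.0, 91.0, 10.0)
lon = np.arange(0.0, 360.0, 10.0)
data = np.ones((len(time), lat.size, lon.size))
data[12:] = 2.0  # make the second year uniformly warmer
tas = xr.DataArray(
    data,
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="tas",
)

# Hypothetical bounding box. Longitudes here run 0..360, so a box that
# crosses the Greenwich meridian (as the UK does) needs extra handling.
box = tas.sel(lat=slice(40, 70), lon=slice(0, 20))

# Annual mean for each grid box in the selected region.
annual = box.groupby("time.year").mean("time")
print(annual)
```

With real data, xr.open_mfdataset can open a set of NetCDF files as a single dataset, and the to_netcdf method writes the result back to disk.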

Using Xarray and matplotlib to plot data

Xarray can also be used with matplotlib to plot data directly. This can be used to visualise the data during analysis or as an output.

This example uses Xarray to extract a region from a dataset at a specific timestep and plot the wind variable.
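
As a rough illustration of plotting directly from Xarray, the sketch below builds a small random field and draws it with the DataArray plot method. The data, the coordinate names (latitude, longitude) and the variable name wind_speed are invented for the example and do not come from data_visualisation.py.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, suitable for a headless server
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Synthetic 2D field standing in for one timestep of a wind variable.
lat = np.linspace(70, 40, 7)
lon = np.linspace(-20, 20, 9)
values = np.random.default_rng(0).random((lat.size, lon.size)) * 15
speed = xr.DataArray(
    values,
    coords={"latitude": lat, "longitude": lon},
    dims=("latitude", "longitude"),
    name="wind_speed",
)

fig, ax = plt.subplots()
speed.plot(ax=ax)  # a 2D DataArray is drawn as a colour mesh with a colour bar
ax.set_title("Synthetic wind speed")

# Write the figure to memory here; pass a filename instead to save to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```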

Example path: /badc/ecmwf-era-interim/data/wa/as/2017/04/04

Usage:

python data_visualisation.py /badc/ecmwf-era-interim/data/wa/as/2017/04/04 --bbox 70 40 20 -20
usage: data_visualisation.py [-h] [--timestep TIMESTEP]
                             [--bbox COORDINATE COORDINATE COORDINATE COORDINATE]
                             directory

Extract and area and timestamp and plot

positional arguments:
  directory             Directory containing source files

optional arguments:
  -h, --help            show this help message and exit
  --timestep TIMESTEP   Options: 0000 0600 1200 1800
  --bbox COORDINATE COORDINATE COORDINATE COORDINATE
                        Format: N,S,E,W
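
A command-line interface of this shape can be built with argparse. The sketch below is an assumption about how such options might be declared and turned into a selection, not the script's actual source. The helper names, the description text and the coordinate names (latitude, longitude) are hypothetical.

```python
import argparse


def build_parser():
    """Declare a directory argument plus optional timestep and bounding box."""
    parser = argparse.ArgumentParser(description="Extract an area and timestep")
    parser.add_argument("directory", help="Directory containing source files")
    parser.add_argument(
        "--timestep",
        choices=["0000", "0600", "1200", "1800"],
        default=None,  # assumption: no timestep selected unless requested
    )
    parser.add_argument(
        "--bbox",
        nargs=4,
        type=float,
        metavar="COORDINATE",
        help="Format: N S E W",
    )
    return parser


def bbox_to_slices(bbox):
    """Convert [N, S, E, W] into slices for ascending coordinates.

    If the latitude coordinate is stored from north to south, the
    latitude slice needs to be reversed to slice(north, south).
    """
    north, south, east, west = bbox
    return {"latitude": slice(south, north), "longitude": slice(west, east)}


# Negative values such as -20 are accepted as coordinates, not as flags.
args = build_parser().parse_args(["/some/dir", "--bbox", "70", "40", "20", "-20"])
region = bbox_to_slices(args.bbox)
print(region)
```

The resulting dictionary could be passed to an Xarray selection as dataset.sel(**region), assuming the dataset uses those coordinate names.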

Using Python to get a list of files which match your requirements

Python's standard library includes a suite of useful filepath manipulation tools, such as os and glob.

The directory hierarchy on JASMIN encodes useful metadata about the files at the end of it. For example, the path /neodc/esacci/sea_ice/data/sea_ice_thickness/L2P/envisat/v2.0/NH/2012/01 contains useful metadata and has the format:

/neodc/esacci/sea_ice/data/<variable>/L2P/envisat/v2.0/<hemisphere>/<year>/<month>/*.nc
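
Because the fields sit at fixed positions in the path, they can be recovered by splitting it. The following is a minimal sketch using pathlib; the function name parse_path and the choice of returned keys are made up for illustration.

```python
from pathlib import PurePosixPath

ROOT = PurePosixPath("/neodc/esacci/sea_ice/data")


def parse_path(path):
    """Split a directory path below ROOT into the fields named in the layout."""
    parts = PurePosixPath(path).relative_to(ROOT).parts
    # Layout: <variable>/L2P/envisat/v2.0/<hemisphere>/<year>/<month>
    variable, _level, _sensor, _version, hemisphere, year, month = parts[:7]
    return {
        "variable": variable,
        "hemisphere": hemisphere,
        "year": int(year),
        "month": int(month),
    }


info = parse_path(
    "/neodc/esacci/sea_ice/data/sea_ice_thickness/L2P/envisat/v2.0/NH/2012/01"
)
print(info)
```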

This example script starts in the supplied directory and then, at each level, offers you a choice of which subdirectory to descend into next, or all of them. You can then either write the output to a file or print it to the terminal. Before listing your files, the script displays the glob pattern you could use to get the same files with a Linux command.
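
The idea of turning a series of choices into a glob pattern can be sketched as below. This is not the code from file_listing.py. It builds a small temporary directory tree so that it runs anywhere, and the helper name build_pattern and the convention that None means "all" are assumptions for the example.

```python
import glob
import os
import tempfile


def build_pattern(root, choices):
    """Join one choice per directory level; None means 'all' and becomes '*'."""
    levels = [choice if choice is not None else "*" for choice in choices]
    return os.path.join(root, *levels, "*.nc")


with tempfile.TemporaryDirectory() as root:
    # Create a tiny <hemisphere>/<year>/<month> tree with one file per month.
    for hemisphere in ("NH", "SH"):
        for month in ("01", "02"):
            folder = os.path.join(root, hemisphere, "2012", month)
            os.makedirs(folder)
            open(os.path.join(folder, "example.nc"), "w").close()

    # Northern hemisphere, 2012, every month.
    pattern = build_pattern(root, ["NH", "2012", None])
    matches = sorted(glob.glob(pattern))
    relative = [os.path.relpath(match, root) for match in matches]

print(relative)
```

The same pattern string can be passed to ls on the command line to list the matching files.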

Example path: /neodc/esacci/sea_ice/data/

Usage:

python file_listing.py /neodc/esacci/sea_ice/data/
usage: file_listing.py [-h] [-o OUTPUT] directory

Extract and area and timestamp and plot

positional arguments:
  directory             Start directory

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output list of desired files

Contributors

agstephens
