
old_climsim_feedstock's Introduction

proto_feedstock

The prototype (and future template) of a LEAP-Pangeo feedstock.

Setup

Use this template

  • Click on the button on the top left to use this repository as a template for your new feedstock

Important

  • Make the repo public
  • Make sure to create the repo under the leap-stc GitHub organization, not your personal account!
  • Name your feedstock after your data: <your_data>_feedstock.

If you make a mistake here, it is not a huge problem. All of these settings can be changed after you create the repo.

  • Now you can check out the repository locally.

Note

The instructions below are specific to testing recipes locally while downloading and writing data to GCS cloud buckets. If you are running the recipes fully locally, you have to slightly modify some of the steps, as noted below.

Build and test your recipe locally on the LEAP-Pangeo Jupyterhub

  • Edit feedstock/recipe.py to build your pangeo-forge recipe. If you are new to pangeo-forge, the pangeo-forge docs are a great starting point.
  • Make sure to also edit the other files in the /feedstock/ directory. More info on feedstock structure can be found here

Test your recipe locally

Before we run your recipe on LEAP's Dataflow runner, you should test it locally.

You can do that on the LEAP-Pangeo Jupyterhub or your own computer.

  1. Set up an environment with mamba or conda:
mamba create -n runner0102 python=3.11 -y
conda activate runner0102
pip install pangeo-forge-runner==0.10.2 --no-cache-dir
  2. You can now use pangeo-forge-runner in the shell from the root directory of a checked-out copy of this repository:
pangeo-forge-runner bake \
  --repo=./ \
  --Bake.recipe_id=<recipe_id> \
  -f configs/config_local_hub.py

Note

Make sure to replace <recipe_id> with the id defined in your feedstock/meta.yaml file.

If you created multiple recipes, you have to run a command like the one above for each of them.

To run this fully locally (e.g. on your laptop), replace config_local_hub.py with config_local.py.
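For a feedstock with several recipes, the repeated calls can be wrapped in a small shell function. This is a minimal sketch, not part of the template: the function names `run` and `bake_all`, the `DRY_RUN` and `CONFIG` variables, and the recipe ids in the usage note are all hypothetical.

```shell
# Print the command instead of running it when DRY_RUN is set (hypothetical helper).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Bake every recipe id passed as an argument, stopping at the first failure.
# CONFIG defaults to the hub config; set CONFIG=configs/config_local.py
# to run fully locally.
bake_all() {
  config="${CONFIG:-configs/config_local_hub.py}"
  for recipe_id in "$@"; do
    run pangeo-forge-runner bake \
      --repo=./ \
      --Bake.recipe_id="$recipe_id" \
      -f "$config" || return 1
  done
}
```

Called as `bake_all recipe_a recipe_b` from the repository root, this runs one bake per id. Setting `DRY_RUN=1` first prints the commands without executing them, which is a cheap way to check the ids and the config path.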

โš ๏ธ This will save the cache and output to a subfolder of the location you are executing this from.. Make sure do delete them once you are done with testing.

  3. Check the output! If something looks off, edit your recipe.

Tip

By default, the above command will 'prune' the recipe, meaning it only uses two of the input files you provided, to avoid creating an overly large output. Keep that in mind when you check the output for correctness.

Once you are happy with the output, it is time to commit your work to git, push to GitHub, and get this recipe set up for ingestion using Google Dataflow.

Activate the linting CI and clean up your repo

pre-commit linting is already configured in this repository. To run the checks locally, simply run:

pip install pre-commit
pre-commit install
pre-commit run --all-files

Then create a new branch and commit those fixes (along with any others that could not be fixed automatically). From now on, pre-commit will run its checks on every commit.

Alternatively (or additionally), you can use the pre-commit CI GitHub App to run these checks as part of every PR. To proceed with this step, you will need assistance from a member of the LEAP Data and Computation Team. Please open an issue on this repository, tag @leap-stc/data-and-compute, and ask for this repository to be added to the pre-commit.ci app.
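Creating the branch and committing the fixes might look like the sketch below. The function name `commit_lint_fixes`, the default branch name, and the commit message are all hypothetical; use whatever fits your workflow.

```shell
# Create a branch and commit the lint fixes on it.
# The default branch name and the commit message are placeholders.
commit_lint_fixes() {
  branch="${1:-pre-commit-fixes}"
  git checkout -b "$branch" &&
    git add --all &&
    git commit -m "Apply pre-commit fixes"
}
```

Because `pre-commit install` registers a git hook, the commit itself triggers the checks. If a hook modifies files, the commit is aborted; stage the changes again with `git add --all` and repeat the `git commit`.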

Deploy your recipe to LEAP's Google Dataflow

Warning

To proceed with this step, you will need to have certain repository secrets set up. For security reasons, this should be done by a member of the LEAP Data and Computation Team. Please open an issue on this repository and tag @leap-stc/data-and-compute to get assistance.

To deploy a recipe to Google Dataflow, you have to trigger the "Deploy Recipes to Google Dataflow" workflow with a single recipe_id as input.
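If you prefer the command line over the GitHub web interface, the trigger can be sketched with the GitHub CLI. This assumes that `gh` is installed and authenticated, that the deployment is a GitHub Actions workflow that can be triggered manually, and that its input is literally named `recipe_id`. The helper names `run` and `deploy_recipe` and the `DRY_RUN` variable are hypothetical.

```shell
# Print the command instead of running it when DRY_RUN is set (hypothetical helper).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Trigger the deployment workflow for exactly one recipe id.
deploy_recipe() {
  if [ "$#" -ne 1 ]; then
    echo "usage: deploy_recipe <recipe_id>" >&2
    return 2
  fi
  # Assumption: the workflow input is named recipe_id.
  run gh workflow run "Deploy Recipes to Google Dataflow" -f recipe_id="$1"
}
```

Run `deploy_recipe <recipe_id>` from inside your checked-out feedstock so that `gh` targets the right repository. The function rejects anything other than a single id, matching the one-recipe-per-trigger rule above.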

Add your dataset to the LEAP-Pangeo Catalog

Now that your awesome dataset is available as an ARCO zarr store, you should make sure that everyone else at LEAP can check this dataset out easily.

TBW: Instructions how to edit feedstock/catalog.yaml

old_climsim_feedstock's People

Contributors

jbusecke
