Beaker Climate Data Utility

About

Beaker Climate Data Utility is a Beaker module designed to handle geographical/climate data. It provides a set of tools for detecting the resolution of a dataset, which can be particularly useful when regridding a dataset or when the resolution is unknown.
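
As a rough illustration of what resolution detection involves, the spacing of the spatial coordinates can be inspected directly with xarray. This is a minimal sketch, not the module's own tooling; the file path and the lat/lon coordinate names are assumptions.

import numpy as np
import xarray as xr

# Hypothetical file path; substitute your own NetCDF file.
ds = xr.open_dataset("example.nc")

# Estimate resolution as the median spacing between consecutive coordinate values.
lat_res = float(np.median(np.abs(np.diff(ds["lat"].values))))
lon_res = float(np.median(np.abs(np.diff(ds["lon"].values))))
print(f"Approximate resolution: {lat_res:.3f} deg lat x {lon_res:.3f} deg lon")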

Before and after regridding operations: [before_regridding image] [after_regridding image]

Startup

To use Beaker Climate Data Utility, follow these steps:

  1. Navigate to the project directory: cd beaker-climate-data-utility
  2. Create a .env file and fill it out with the necessary environment variables. Refer to the envfile.sample file for the required variables.
  3. Launch the container using Docker Compose: docker-compose up -d
  4. Access the Beaker Climate Data Utility application at http://localhost:8888/dev_ui?

You can view the logs with docker logs -f beaker-climate-data-utility-jupyter-1

Usage

Once in the user interface, you can use the notebook like a regular Jupyter notebook. There is also an LLM assistant you can use to access special features in the notebook.

The kernel can interface with dataset storage on the HMI server through custom messages.

To retrieve a dataset from the HMI server, send a custom message named download_dataset_request with the payload:

{
    "uuid":"<your id here>",
    "filename":"<your filename here>"
}

This dataset will be loaded into the Jupyter notebook environment as an xarray dataset named dataset.
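
Once the message has been handled, the injected dataset variable can be inspected like any other xarray object, for example:

# `dataset` is injected into the notebook namespace by the kernel.
print(dataset)                   # summary of dimensions, coordinates, and data variables
print(list(dataset.data_vars))   # names of the data variables
print(dataset.dims)              # dimension sizes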

To save a dataset to the HMI server, send a custom message named save_dataset_request, providing the name of the xarray dataset variable you want to save, with a payload:

{
    "dataset":<your dataset variable name here>,
    "filename":"<your chosen filename for the persisted data here>"
}

The HMI server will return a UUID for your newly saved dataset.
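
If you want to send these custom messages programmatically rather than through the dev UI, something along the lines of the following sketch may work. It assumes the Beaker kernel accepts these message types on the shell channel and that you have the running kernel's connection file; the file path and filename are placeholders, and this is illustrative only, not a documented API.

from jupyter_client import BlockingKernelClient

# Hypothetical connection file path; adjust for your running kernel.
kc = BlockingKernelClient(connection_file="kernel-1234.json")
kc.load_connection_file()
kc.start_channels()

# Build and send a custom save_dataset_request message (assumes the Beaker kernel
# routes custom message types over the shell channel).
msg = kc.session.msg(
    "save_dataset_request",
    content={"dataset": "dataset", "filename": "regridded_output.nc"},
)
kc.shell_channel.send(msg)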

You can ask the LLM to provide plotting code to preview NetCDF files. The LLM will ask for the name of the dataset variable in the notebook and, optionally, for any particular geographical column names, a data variable name, and a time slice index; the latter three values have defaults.
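
For reference, the kind of preview plot the assistant produces can also be written by hand. A minimal sketch, assuming a variable named dataset with a time dimension and lat/lon coordinates (the choice of variable and time index are assumptions):

import matplotlib.pyplot as plt

# Pick a data variable and a time slice to preview.
var_name = list(dataset.data_vars)[0]
preview = dataset[var_name].isel(time=0)

# xarray's built-in plotting draws a lat/lon heatmap for a 2D slice.
preview.plot()
plt.title(f"{var_name} at time index 0")
plt.show()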

You can also ask the LLM to provide code for regridding a NetCDF dataset.
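
The exact regridding code the assistant generates is not specified here; as one hedged illustration, coarsening to a lower resolution with a mean aggregation can be done with xarray alone, assuming lat/lon dimension names and a target resolution that is an integer multiple of the original:

# Coarsen by a factor of 4 in each spatial dimension, aggregating with the mean.
# The factor and dimension names are assumptions; adjust for your dataset.
coarse = dataset.coarsen(lat=4, lon=4, boundary="trim").mean()
print(coarse.dims)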

beaker-climate-data-utility's Issues

Support for downscaling

This issue is to migrate this repo/context into darpa-askem/beaker-kernel as its own context. The Dockerfile might need to be updated slightly to support the relevant dependencies, but otherwise it should be broadly compatible in that environment.

Then we need to update the regridding function to be more flexible, based on this thread.

We also need to ensure that the regridding/downscaling tool checks for _bnds-type variables and excludes or drops them from the process, as they seem to cause problems.
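
A minimal sketch of that filtering, assuming the usual CF-style naming where bounds variables end in _bnds:

# Drop bounds variables (e.g. lat_bnds, lon_bnds, time_bnds) before regridding.
bnds_vars = [name for name in dataset.variables if name.endswith("_bnds")]
cleaned = dataset.drop_vars(bnds_vars)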

Basic testing

  1. Set up this repo based on the README
  2. Navigate to localhost:8888
  3. Select the climate data utility Beaker context and load it
  4. Send a custom message of type download_dataset_request with the payload {"uuid": "149efd94-dc91-4673-9245-443ad61276ea", "filename": "cmip6-6e565455-9589-4d7c-8960-18c47ed6b9b7.nc"}
  5. Watch the logs and check that dataset is an xarray variable in the notebook (see the sketch after this list)
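
A quick sanity check for step 5, run inside the notebook:

import xarray as xr

# Confirm the kernel injected the downloaded dataset as an xarray Dataset.
assert isinstance(dataset, xr.Dataset), "dataset was not loaded as an xarray Dataset"
print(dataset)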

After migrating this context to DARPA-ASKEM/beaker-kernel, you should be able to follow the same steps if the migration worked as expected.

Next steps

  1. Try regridding the dataset to 2-degree resolution with a mean aggregation and plot the before/after results. The resulting plot should be coarser-grained than the "before" plot.
  2. Try downscaling to 0.1-degree resolution via interpolation, and check the before/after by plotting as well (a sketch follows this list).
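
A hedged sketch of step 2, downscaling to a 0.1-degree grid via interpolation with xarray; the lat/lon coordinate names and the linear interpolation method are assumptions:

import numpy as np

# Build a finer 0.1-degree target grid spanning the original coordinate range.
new_lat = np.arange(float(dataset.lat.min()), float(dataset.lat.max()), 0.1)
new_lon = np.arange(float(dataset.lon.min()), float(dataset.lon.max()), 0.1)

# Linear interpolation onto the finer grid; plot before/after slices to compare.
downscaled = dataset.interp(lat=new_lat, lon=new_lon, method="linear")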

Determine data I/O needs with Uncharted

Assess how data should enter the environment and be saved. Should they be stored as datasets in TDS? If dataset A is regridded, do we now have a dataset B, or is it a modification of dataset A with a new file created and saved to S3?
