scottwales / climtas
Climate Extremes Timeseries Analysis
Home Page: https://climtas.readthedocs.io/
Since the latest update to the stable conda module, we have been getting two errors when climtas writes files; here's one:
Traceback (most recent call last):
File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.04/bin/era5grib", line 10, in <module>
sys.exit(main())
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/era5grib/era5grib.py", line 341, in main
func(**dargs)
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/era5grib/era5grib.py", line 177, in era5grib_wrf
save_grib(ds, output, format=format)
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/era5grib/era5grib.py", line 55, in save_grib
climtas.io.to_netcdf_throttled(ds, tmp_compressed)
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/climtas/io.py", line 89, in to_netcdf_throttled
f = ds.to_netcdf(str(path), encoding=encoding, compute=False)
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/xarray/core/dataset.py", line 1901, in to_netcdf
return to_netcdf(
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/xarray/backends/api.py", line 1072, in to_netcdf
dump_to_store(
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/xarray/backends/api.py", line 1119, in dump_to_store
store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/xarray/backends/common.py", line 265, in store
self.set_variables(
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/xarray/backends/common.py", line 303, in set_variables
target, source = self.prepare_variable(
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 481, in prepare_variable
encoding = _extract_nc4_variable_encoding(
File "/g/data/hh5/public/apps/miniconda3/envs/analysis3-22.04/lib/python3.9/site-packages/xarray/backends/netCDF4_.py", line 277, in _extract_nc4_variable_encoding
raise ValueError(
ValueError: unexpected encoding parameters for 'netCDF4' backend: ['szip', 'zstd', 'bzip2', 'blosc']. Valid encodings are: {'fletcher32', 'chunksizes', 'complevel', 'least_significant_digit', 'shuffle', 'contiguous', 'zlib', '_FillValue', 'dtype'}
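One possible workaround, sketched below under the assumption that the offending keys come from the input file's per-variable encoding: filter each variable's encoding down to the set of keys the error message lists as valid before calling `to_netcdf`. The name `clean_encoding` is hypothetical, not part of climtas or xarray.

```python
# Keys the netCDF4 backend accepts, taken from the error message above.
VALID_NC4_KEYS = {
    "fletcher32", "chunksizes", "complevel", "least_significant_digit",
    "shuffle", "contiguous", "zlib", "_FillValue", "dtype",
}

def clean_encoding(encoding):
    """Drop encoding keys the netCDF4 backend rejects (szip, zstd, ...)."""
    return {
        var: {k: v for k, v in enc.items() if k in VALID_NC4_KEYS}
        for var, enc in encoding.items()
    }

# Example: an encoding carrying the new compression flags that trigger the error.
encoding = {"temp": {"zlib": True, "complevel": 4, "szip": False, "blosc": False}}
clean = clean_encoding(encoding)
assert clean == {"temp": {"zlib": True, "complevel": 4}}
```

The cleaned dict can then be passed as `encoding=` to `ds.to_netcdf` (or applied to each variable's `.encoding` before calling `to_netcdf_throttled`).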
Saving into one file per year should allow for more parallelisation
Add support for cftime calendars to groupby
Add a function that selects values between some longitude range, wrapping the source data if the meridian is within the range
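A minimal sketch of the wrapping logic such a function would need, using plain NumPy; `wrap_lon_mask` is a hypothetical name, and a 0-360 longitude convention is assumed.

```python
import numpy as np

def wrap_lon_mask(lons, start, end):
    """Boolean mask for longitudes in [start, end] degrees, wrapping at 0/360."""
    lons = np.asarray(lons) % 360
    start, end = start % 360, end % 360
    if start <= end:
        return (lons >= start) & (lons <= end)
    # Range crosses the seam, e.g. 350..100: take either side of it.
    return (lons >= start) | (lons <= end)

lons = np.arange(0, 360, 45)
mask = wrap_lon_mask(lons, 350, 100)
assert list(lons[mask]) == [0, 45, 90]
```

With xarray data the same mask could be applied via `da.where(..., drop=True)` or `da.isel(lon=mask)`.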
Blocked_resample should be able to check the time axis and work out the correct interval for daily data
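The interval could plausibly be inferred from the time coordinate itself; a small sketch, assuming a regular sub-daily axis:

```python
import numpy as np

# Build a 6-hourly time axis and infer samples-per-day from its spacing.
time = np.arange("2000-01-01T00", "2000-01-02T00",
                 np.timedelta64(6, "h"), dtype="datetime64[h]")
steps = np.diff(time)
assert (steps == steps[0]).all()  # axis must be regular for this to be valid
samples_per_day = int(np.timedelta64(1, "D") / steps[0])
assert samples_per_day == 4
```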
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-7caa70c0c441> in <module>
      1 ds = air_daily.to_dataset()
      2 # Write to netcdf
----> 3 climtas.io.to_netcdf_throttled(ds, "GSWP3_365_Tair.nc")
      4 #ds.to_netcdf("GSWP3_365_Tair.nc",encoding={"temp":{"chunksizes":(360,720,100)}})

/g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/climtas/io.py in to_netcdf_throttled(ds, path, complevel, max_tasks, show_progress)
    101     # list
    102     for k, v in old_graph.items():
--> 103         if v[0] == dask.array.core.store_chunk:
    104             store_keys.append(k)
    105             new_graph[k] = None  # Mark the task done in new_graph

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
See /g/data/w35/ccc561/temp/Lina_942/climdata/GSWP3-forScott.ipynb
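The error arises because `==` on a task head that happens to be a NumPy array returns an element-wise array, which is ambiguous in a boolean context. A hedged sketch of one possible fix (using a stand-in for `dask.array.core.store_chunk`): compare with `is`, which checks identity and never invokes array `__eq__`.

```python
import numpy as np

def store_chunk():
    """Stand-in for dask.array.core.store_chunk in this sketch."""

tasks = {
    "a": (store_chunk, 1),
    "b": (np.arange(3), 2),  # a task whose head is an array, not a callable
}

store_keys = []
for k, v in tasks.items():
    if v[0] is store_chunk:  # `==` here would raise on the array-headed task
        store_keys.append(k)

assert store_keys == ["a"]
```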
Currently, events extending past the last timestep are marked as NaT in event_coords(), since it doesn't keep track of how long each timestep is.
If the time axis is regular we can calculate the actual event duration.
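A small sketch of that suggestion: when the axis is regular, the step can be read off the coordinate and used to give a finite duration to events that run past the final timestep.

```python
import numpy as np

time = np.arange("2000-01-01", "2000-01-06", dtype="datetime64[D]")
steps = np.diff(time)
assert (steps == steps[0]).all()  # regular axis, so the step is well defined
step = steps[0]

# Hypothetical event: starts at the 4th timestep, spans 2 steps, so it runs
# past the end of the axis. With a known step we can still report a duration.
length = 2
duration = length * step
assert duration == np.timedelta64(2, "D")
```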
Is there, or does there need to be, a way of closing climtas once a client has been started and run?
E.g. I get a message such as:
2023-01-16 09:41:30,695 - distributed.diskutils - INFO - Found stale lock file and directory '/local/p66/rb4844/tmp/dask-worker-space/worker-9nmgxv02', purging
which implies I have an old client still running. Do I need to tell climtas to shut down once a job is completed?
Thanks, Roger
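A hedged sketch of the usual pattern, assuming `climtas.nci.GadiClient` returns a standard `dask.distributed` Client (a plain local Client stands in below): closing the client at the end of the job should release the worker-space directories that otherwise leave stale lock files behind.

```python
from dask.distributed import Client

# Stand-in for climtas.nci.GadiClient(); an in-process client for this sketch.
client = Client(processes=False, dashboard_address=None)
try:
    pass  # ... run the analysis ...
finally:
    client.close()  # stop this client; client.shutdown() also stops the cluster

assert client.status == "closed"
```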
Hi @ScottWales
@navidcy just brought my attention to this video, which might help reduce excess memory use in Dask: https://youtu.be/nwR6iGR0mb0
Would something like environ={"MALLOC_TRIM_THRESHOLD_": "65536"} (see 4:40 in the video) be good to include in climtas.nci.GadiClient?
Disclaimer: I'm often hitting memory limits in my calculations, but I haven't tested this fix myself.
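One way this could be wired up (untested on Gadi, and assuming GadiClient starts its workers through dask's nanny): set the variable through dask's configuration before creating the client, since the nanny exports `distributed.nanny.environ` entries into each worker's environment.

```python
import dask

# Apply the malloc-trim threshold suggested in the video to all dask workers.
dask.config.set({"distributed.nanny.environ": {"MALLOC_TRIM_THRESHOLD_": "65536"}})

# then, hypothetically: client = climtas.nci.GadiClient()
assert dask.config.get("distributed.nanny.environ")["MALLOC_TRIM_THRESHOLD_"] == "65536"
```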