Comments (6)
From the stack trace, it appears that the scale_factor metadata attribute of the streamflow variable in one of the NWM's channel_rt output files is no longer being deserialized as a collection (list, array, etc.) and is instead a scalar (int, float, etc.). This may have been caused by a change in an upstream dependency (xarray, h5netcdf).
from hydrotools.
I was able to resolve this issue by removing the index on the scale_factor attribute.
line 274 python/nwm_client/src/hydrotools/nwm_client/gcp.py
# Extract scale factor
scale_factor = ds['streamflow'].scale_factor[0]
# fixed with
scale_factor = ds['streamflow'].scale_factor
I am assuming that the metadata layout of NWM channel route link files is pretty static over time as we've not seen this issue before. I assume this is a deserialization issue propagating from, if I had to guess, xarray.
It might be best if we push a hot fix that guards and type checks the scale_factor field while we track down and figure out what is causing this and determine a long term solution.
Found the issue. It is propagating from h5netcdf. Today they released 0.14.0, which introduced the following per their changelog:
Return items from 0-dim and one-element 1-dim array attributes. Return multi-element attributes as lists. Return string attributes as Python strings decoded from their respective encoding (utf-8, ascii). By Kai Mühlbauer.
I verified that rolling the version back to 0.13.0 resolved this issue.
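To illustrate why the existing indexing breaks, here is a rough pure-Python emulation of the two behaviors described in the changelog entry above. The function names are hypothetical, this is not h5netcdf's actual code, and plain Python floats stand in for numpy values.

```python
def read_attr_old(raw):
    """Emulate the pre-0.14.0 behavior seen in this thread: always a sequence."""
    return list(raw)


def read_attr_new(raw):
    """Emulate the 0.14.0 changelog rule: unwrap one-element attributes,
    return multi-element attributes as lists."""
    values = list(raw)
    return values[0] if len(values) == 1 else values


stored = [0.01]  # hypothetical one-element scale_factor attribute

# Indexing works when the attribute comes back as a sequence.
old_value = read_attr_old(stored)[0]

# Indexing fails once the attribute comes back as a bare scalar.
try:
    new_value = read_attr_new(stored)[0]
except TypeError as error:
    new_value = None
    failure = type(error).__name__
```

With a numpy scalar the exception should be an IndexError rather than a TypeError, but the cause is the same: the code indexes a value that is no longer a sequence.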
Now as to how we should proceed. I know previously I said:
It might be best if we push a hot fix that guards and type checks the scale_factor field while we track down and figure out what is causing this and determine a long term solution.
In this case, I think it makes sense to just type check ds.streamflow.scale_factor and handle the case where a scalar is returned. I don't want to force others to comply with a version pin of h5netcdf. Thoughts @jarq6c?
Proposed solution:
import numpy as np

streamflow = ds['streamflow']
# h5netcdf <= 0.13.0 always deserializes numeric attributes to numpy arrays,
# even if there is only one item in the array.
if isinstance(streamflow.scale_factor, np.ndarray):
    scale_factor = streamflow.scale_factor[0]
# h5netcdf > 0.13.0 deserializes numeric attributes to numpy arrays only if the
# attribute holds more than one element; otherwise, a scalar numpy value is returned.
else:
    scale_factor = streamflow.scale_factor
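The same guard could also be written as a small helper that does not depend on numpy or on the installed h5netcdf version. This is only a sketch: first_or_scalar is a hypothetical name, not part of nwm_client, and it assumes the attribute is numeric (a string would be indexed into rather than returned whole).

```python
def first_or_scalar(value):
    """Return the first element of a sequence-like value, or the value itself
    if it cannot be indexed. Hypothetical helper, not part of nwm_client."""
    try:
        return value[0]
    except (TypeError, IndexError):
        # Plain Python numbers raise TypeError when indexed; numpy scalars and
        # 0-dim arrays raise IndexError. In both cases treat the value as a scalar.
        return value
```

In the snippet above it would be used as scale_factor = first_or_scalar(ds['streamflow'].scale_factor). Note that an empty sequence is returned unchanged, so callers that might see one would need an extra check.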
If the source attribute was a single scalar all along and was only returned in a list because of some conceit of h5netcdf, I'm inclined to just drop the index and leave it at that. Is there a good reason to continue supporting h5netcdf <= 0.13.0?
After talking with @jarq6c offline, we came to a solution (please correct me where necessary @jarq6c). Given that h5netcdf==0.14.0 was released on 2022-02-25, we will pin the current version of nwm_client (5.0.1) to h5netcdf <= 0.13.0 and publish it as a post-release of 5.0.1. Subsequently, nwm_client==5.0.2 will be released and pin h5netcdf >= 0.14.0. 5.0.2 will include a patch that complies with h5netcdf >= 0.14.0.