Comments (6)

aaraney commented on September 24, 2024

From the stack trace, it appears that the metadata field scale_factor on the streamflow variable in one of the NWM's channel_rt output files is no longer being deserialized as a collection (list, etc.) and is instead a scalar (int, float, etc.). This may have been caused by an upstream change in a dependency (xarray, h5netcdf).

from hydrotools.

aaraney commented on September 24, 2024

I was able to resolve this issue by removing the index on the scale_factor attribute.

Line 274 of python/nwm_client/src/hydrotools/nwm_client/gcp.py:

            # Extract scale factor (original; fails when the attribute is a scalar)
            scale_factor = ds['streamflow'].scale_factor[0]

            # Fixed by dropping the index
            scale_factor = ds['streamflow'].scale_factor

I am assuming that the metadata layout of NWM channel route link files is pretty static over time as we've not seen this issue before. I assume this is a deserialization issue propagating from, if I had to guess, xarray.

It might be best if we push a hot fix that guards and type checks the scale_factor field while we track down and figure out what is causing this and determine a long term solution.

aaraney commented on September 24, 2024

Found the issue. It is propagating from h5netcdf. Today they pushed 0.14.0, which introduced the following change, per their change log:

Return items from 0-dim and one-element 1-dim array attributes. Return multi-element attributes as lists. Return string attributes as Python strings decoded from their respective encoding (utf-8, ascii). By Kai Mühlbauer.

I verified that rolling the version back to 0.13.0 resolved this issue.
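
To make the failure mode concrete, here is a minimal sketch of why the original `scale_factor[0]` access breaks once the attribute comes back as a scalar. The values below are stand-ins built directly with numpy, not attributes read from a real NWM file, and `index_first` is a hypothetical helper used only for illustration.

```python
import numpy as np

# Stand-in values only; neither is read through h5netcdf or xarray.
array_attr = np.array([0.01], dtype=np.float32)  # one-element array, the shape the old code expected
scalar_attr = np.float32(0.01)                   # bare scalar, the shape reported after the upgrade


def index_first(value):
    """Mimic the original `scale_factor[0]` access.

    Returns the first element when indexing works, otherwise the name
    of the exception that the index operation raised.
    """
    try:
        return value[0]
    except (IndexError, TypeError) as exc:
        return type(exc).__name__


print(index_first(array_attr))   # indexing a one-element array succeeds
print(index_first(scalar_attr))  # indexing a numpy scalar raises
print(index_first(0.01))         # indexing a plain Python float raises
```

Indexing a numpy scalar raises IndexError and indexing a plain Python float raises TypeError, so either scalar form would break code that assumes a one-element array.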

aaraney commented on September 24, 2024

Now as to how we should proceed. I know previously I said:

It might be best if we push a hot fix that guards and type checks the scale_factor field while we track down and figure out what is causing this and determine a long term solution.

In this case, I think it makes sense to just type check ds.streamflow.scale_factor and handle the case where a scalar is returned. I don't want to force others to comply with a version pin on h5netcdf. Thoughts @jarq6c?

proposed solution

import numpy as np

streamflow = ds['streamflow']

if isinstance(streamflow.scale_factor, np.ndarray):
    # h5netcdf <= 0.13.0 always deserializes numeric attributes to numpy
    # arrays, even when the array holds only one item.
    scale_factor = streamflow.scale_factor[0]
else:
    # h5netcdf >= 0.14.0 returns one-element numeric attributes as scalars,
    # so no indexing is needed.
    scale_factor = streamflow.scale_factor
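
As an alternative to branching on the type, the same guard could be sketched as a small normalizing helper. This is only an illustration under assumptions: `attribute_as_scalar` is a hypothetical name that does not exist in hydrotools, and the sketch assumes the attribute always holds exactly one numeric value.

```python
import numpy as np


def attribute_as_scalar(value):
    """Return a single-valued attribute as a plain Python scalar.

    Accepts a one-element array or list, a 0-dim array, a numpy scalar,
    or a plain Python number. Raises ValueError for anything that does
    not hold exactly one element.
    """
    arr = np.asarray(value)
    if arr.size != 1:
        raise ValueError(f"expected exactly one element, got {arr.size}")
    # ndarray.item() converts the single element to a Python scalar.
    return arr.item()


# Stand-in inputs covering the shapes discussed in this thread.
for candidate in (np.array([0.5], dtype=np.float32), np.float32(0.5), [0.5], 0.5):
    print(type(candidate).__name__, attribute_as_scalar(candidate))
```

The trade-off is that this hides which h5netcdf behavior was encountered, while the explicit isinstance check documents it in place.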

jarq6c commented on September 24, 2024

If the source attribute was a single scalar all along and was only returned in a list because of some conceit of h5netcdf, I'm inclined to just drop the index and leave it at that. Is there a good reason to continue supporting h5netcdf <= 0.13.0?

aaraney commented on September 24, 2024

After talking with @jarq6c offline, we came to a solution (please correct me where necessary @jarq6c). Given that h5netcdf==0.14.0 was released on 2022-02-25, we will pin the current version of nwm_client (5.0.1) to h5netcdf <= 0.13.0 and publish that as a post-release of 5.0.1. Subsequently, nwm_client==5.0.2 will be released with a pin of h5netcdf >= 0.14.0 and will include a patch that makes it compatible with h5netcdf >= 0.14.0.
