Losing time coordinate (kerchunk, CLOSED)

fsspec commented on July 30, 2024
Losing time coordinate

from kerchunk.

Comments (22)

martindurant commented on July 30, 2024

Do you mind trying with main branch? We've been trying to nail down the thorny time types.

abkfenris commented on July 30, 2024

I'll give it a shot.

lsterzinger commented on July 30, 2024

I just tested this on main on my system and got the same result as @abkfenris

abkfenris commented on July 30, 2024

Yep, still does the same thing on main.

lsterzinger commented on July 30, 2024

So the date that should be 2021-08-01 is 2065-03-01, which is an offset of 15918 days.

The time units in the original .nc file are days since 1978-01-01 12:00:00. Coincidentally, that date is also 15918 days from 2021-08-01.

So it seems like MultiZarrToZarr is taking the day offset and applying it to the first datetime in the first data file, instead of applying it to 1978-01-01
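
The arithmetic behind this observation can be checked with the standard library alone. This is only a sketch of the date math, not of what MultiZarrToZarr does internally; the noon time on the first step is an assumption made to match the reference date.

```python
from datetime import datetime

# Reference date taken from the file's units: "days since 1978-01-01 12:00:00"
epoch = datetime(1978, 1, 1, 12)
# Expected first time step (noon is assumed here, to match the epoch)
first_step = datetime(2021, 8, 1, 12)

offset = first_step - epoch   # the number stored in the file, as a timedelta
correct = epoch + offset      # offset applied to the reference date
wrong = first_step + offset   # offset applied to the first timestamp instead

print(offset.days)     # 15918
print(correct.date())  # 2021-08-01
print(wrong.date())    # 2065-03-01
```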

martindurant commented on July 30, 2024

@lsterzinger do you have time/interest to debug? First, I would set debug logging (e.g., fsspec.utils.setup_logging(logger_name="reference-combine")) and then second set a breakpoint in _build_output where the cftime stuff is to see how the numbers get manipulated.
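
For reference, a rough standard-library equivalent of that logging setup might look like the following. It assumes only that the combine code logs under the name "reference-combine", as the suggested call implies; the handler and format string are illustrative choices.

```python
import logging

# Logger name taken from the suggestion above; everything else is illustrative.
logger = logging.getLogger("reference-combine")
logger.setLevel(logging.DEBUG)

# Send records to stderr with a timestamp so the order of operations is visible.
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
logger.addHandler(handler)

logger.debug("debug logging enabled")
```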

lsterzinger commented on July 30, 2024

I do have interest in debugging, but my time is a bit limited these days. I might have time this afternoon/weekend to play around with things and see what's going on.

lsterzinger commented on July 30, 2024

Turns out fixing this is much more fun than anything else I have to do today 😉

There was a missing calendar attribute that caused the datetime building to fail. I tested it on my end and it seems to work. @abkfenris can you try out the change in #75 and see if that works for you?

abkfenris commented on July 30, 2024

Hmm, I still appear to be seeing that offset when trying with your branch.

lsterzinger commented on July 30, 2024

Did you regenerate the .json files? I copy pasted your generation script directly and re-generated the files (I had to comment out the *_preliminary.nc files because of a 404 error). I attached a zip of my combined.json for reference.

Make sure you're actually running on the branch:

import fsspec_reference_maker
print(fsspec_reference_maker.__version__)

Should result in 0.0.2+3.gcdb6528

combined.zip

abkfenris commented on July 30, 2024

Yours does open correctly.

I did regenerate the json after rebuilding my environment on the branch and it still gives dates in 2065.

The 20210819 file is no longer preliminary, which is what caused that failure, so removing _preliminary from that URL should include it.

>>> import fsspec_reference_maker
>>> fsspec_reference_maker.__version__
'0+untagged.174.gcdb6528'

Here's the generated combined.json.zip and the Dockerfile, environment.yml, and test script: environment_and_test_script.zip

lsterzinger commented on July 30, 2024

@abkfenris I don't think it will make a difference, but I did push another change to that branch. Can you try again?

abkfenris commented on July 30, 2024

I'm still getting the same result with 5072d61

lsterzinger commented on July 30, 2024

That's super weird. I'm also on 5072d61, and I cloned your environment directly, and I get this: [screenshot not preserved]

abkfenris commented on July 30, 2024

Aha, I didn't have cftime in my environment, so it wasn't executing the code your branch changed; it was instead executing https://github.com/intake/fsspec-reference-maker/pull/75/files#diff-850b631beff65d5bd4abca60a56ef8308e345e5626bbb8a526f15d31c33a752bL192-L194.

And now with cftime, comparing your branch to the released version, your branch does fix the time offset.
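
A quick way to rule out this kind of environment problem is to check whether the optional dependency is importable before generating references. A minimal sketch using only the standard library (the `has_module` helper is a hypothetical name, not part of the package):

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


# Report whether the optional time-decoding dependency is present.
print("cftime available:", has_module("cftime"))
```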

Awesome, thank you!

lsterzinger commented on July 30, 2024

Great to hear!

@martindurant It doesn't seem good that this fails silently when cftime is not installed, which means this code does not parse the dates correctly:
https://github.com/intake/fsspec-reference-maker/blob/5072d614cbb6cfa0f497dece422a953c7c4812ab/fsspec_reference_maker/combine.py#L205-L207

Thoughts?

martindurant commented on July 30, 2024

It's a fair point, but I can't think of another way to say "see if this converts as times" (because most coordinates are not time, but it just so happens that all our examples are time series).

Note that @rabernat says we should just rely on xarray, but I haven't figured out yet how (because we have zarr arrays, not xarray datasets).

abkfenris commented on July 30, 2024

Is that Exception handler catching both the ImportError when cftime is not available and, it looks like, the ValueError when num2pydate can't convert a date?

Maybe throw a warning in the first case, and use the existing handling otherwise?
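
A sketch of what separating the two cases could look like. The function `try_decode_times` and its `importer` parameter are hypothetical, introduced here only for illustration; this is not the code in combine.py. The only library call assumed is `cftime.num2pydate(times, units, calendar=...)`.

```python
import importlib
import warnings


def try_decode_times(values, units, calendar="standard",
                     importer=importlib.import_module):
    """Hypothetical helper: decode numeric times, or return them unchanged."""
    try:
        cftime = importer("cftime")
    except ImportError:
        # Case 1: the optional dependency is missing. Warn instead of
        # silently skipping the decoding step.
        warnings.warn(
            "cftime is not installed; time coordinates will not be decoded",
            UserWarning,
            stacklevel=2,
        )
        return values
    try:
        return cftime.num2pydate(values, units, calendar=calendar)
    except ValueError:
        # Case 2: the values do not convert as times (most coordinates are
        # not times), so fall back to the existing handling quietly.
        return values
```

The `importer` argument exists only so the two branches can be exercised without changing the environment; in normal use it would be left at its default.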

martindurant commented on July 30, 2024

@abkfenris, that would be OK, except it may get annoying for those who have no idea what cftime is :)

abkfenris commented on July 30, 2024

If I'm understanding warning filters right, warnings.simplefilter('ignore:::fsspec_reference_maker,default') would help quiet things down in that case.
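
Note that the filter string quoted here follows the `action:message:category:module` form used by the `-W` option and the `PYTHONWARNINGS` environment variable, whereas `warnings.simplefilter` takes a bare action such as "ignore" as its first argument. A minimal sketch of an in-code equivalent, assuming the warning would be issued from a module whose name starts with `fsspec_reference_maker` (the `emit` helper is hypothetical and only simulates such a warning):

```python
import warnings


def emit(module_name):
    """Hypothetical stand-in for a warning raised inside the named module."""
    warnings.warn_explicit(
        "cftime is not installed; time coordinates will not be decoded",
        category=UserWarning,
        filename=module_name.replace(".", "/") + ".py",
        lineno=1,
        module=module_name,
    )


with warnings.catch_warnings(record=True) as caught:
    # Show warnings by default...
    warnings.simplefilter("default")
    # ...but ignore anything issued from modules matching this regex.
    warnings.filterwarnings("ignore", module=r"fsspec_reference_maker")

    emit("fsspec_reference_maker.combine")  # suppressed by the filter
    emit("some_other_package")              # still recorded

print([w.filename for w in caught])
```

`filterwarnings` puts the new rule at the front of the filter list, so the module-specific "ignore" takes precedence over the general "default" action.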

rabernat commented on July 30, 2024

> Note that @rabernat says we should just rely on xarray, but I haven't figured out yet how (because we have zarr arrays, not xarray datasets).

I have added comments to try to help with this: #70 (comment)

The path we are on will end up re-implementing all of Xarray's coding machinery in fsspec-reference-maker. This is not sustainable. I would suggest refactoring and removing these special-case workarounds as soon as possible, before the technical debt piles up.
