dblodgett-usgs / hygeo

Home Page: https://dblodgett-usgs.github.io/hygeo/dev/

License: Creative Commons Zero v1.0 Universal

R 98.86% Dockerfile 1.14%

hygeo's Issues

Inconsistent application of COMID keys in crosswalk

An entity with multiple COMIDs looks like:

"cat-X": {
    "COMID": ["000000", "11111111", "2222222"]
}

whereas one with only a single COMID is:

"cat-Y": {
    "COMID": "3333333"
}

In this latter case, the COMID should be a singleton list to keep a consistent schema, i.e.

"cat-Y": {
    "COMID": ["3333333"]
}
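
Until the crosswalk is written consistently, a reader can normalize it on load. This is a minimal Python sketch of that workaround, assuming the crosswalk has already been parsed into a dict; `normalize_comids` is a hypothetical helper name, not part of hygeo.

```python
from typing import Any, Dict


def normalize_comids(crosswalk: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Return a copy of the crosswalk in which every COMID value is a list."""
    normalized = {}
    for local_id, entry in crosswalk.items():
        entry = dict(entry)  # shallow copy so the caller's data is left untouched
        comid = entry.get("COMID")
        if comid is not None and not isinstance(comid, list):
            entry["COMID"] = [comid]  # wrap a bare value in a singleton list
        normalized[local_id] = entry
    return normalized


crosswalk = {
    "cat-X": {"COMID": ["000000", "11111111", "2222222"]},
    "cat-Y": {"COMID": "3333333"},
}
normalized = normalize_comids(crosswalk)
```

With this in place, downstream code can always iterate over `entry["COMID"]` without checking its type first.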

Add file I/O functions.

This is all done in the vignettes right now. Should move that code into standardized I/O functions.

Clarify distinction between flowpaths and waterbodies.

Currently, the documentation of hygeo describes waterbodies in some detail but doesn't really get into the flowpath catchment realization.

I'm now realizing that this is a major oversight that needs to be corrected.

Initially, we have a 1:1 correspondence between flowpaths and waterbodies -- we could even drop waterbodies from the mix altogether if hydrologic routing attributes can be attached to flowpaths.

As we go forward, we will want to use flowpaths as a way to handle hydrologic locations along waterbodies providing the tie points between the linear catchment-realization and hydrodynamic model representations of waterbodies.

I/O points for catchment networks

This kind of functionality will be required for encapsulation of models that implement some set of pre-existing hydrofabric catchments. In this case, where a model implements a collection of hydrofabric catchments, we can refer to the model as a catchmentNetwork.

In this case, we may have I/O locations either for model output such as streamflow predictions or for model input for interbasin transfers or data assimilation.

To account for this, any model could include a list of pre-existing nexuses that it can provide output to or expects to receive input from.

As a near-term test case, we could run an NHDPlusV2 discretization of the Sugar Creek domain as a single model. The obvious candidate for this would be WRF-Hydro-NWM. Another option would be to use the T-Shirt model at the NHDPlusV2 discretization, reporting flow to the refactored discretization.

I open this issue to provoke discussion as much as anything. Not sure we actually want to tackle this now.

Catchment geojson contains invalid geometry

For the sugar creek release files in catchment_data.geojson, cat-66 contains an invalid geometry when read by Shapely:

       ID  area_sqkm    toID                                           geometry  valid
6  cat-66   2.205631  nex-45  MULTIPOLYGON (((-80.76158 35.21440, -80.76163 ...  False

This prevents the geometry from being used for intersection:
shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. Likely cause is invalidity of the geometry <shapely.geometry.multipolygon.MultiPolygon object at 0x11774f550>

Make edge list an edge map

The *_edge_list.json files generated are lists of edge mappings:

[
  {
    "id": "cat-27",
    "toid": "nex-26"
  },
  {
    "id": "cat-52",
    "toid": "nex-34"
  },
  {
    "id": "cat-67",
    "toid": "nex-68"
  }
]

This is really just a map, and would likely be better suited for look ups if structured as one:

{
    "cat-27": "nex-26",
    "cat-52": "nex-34",
    "cat-67": "nex-68"
}
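
For readers who need the map form today, the conversion is small. A minimal Python sketch, assuming each `id` appears at most once in the list (which is what makes a plain map sufficient); `edge_list_to_map` is a hypothetical helper name.

```python
import json


def edge_list_to_map(edges):
    """Turn [{"id": ..., "toid": ...}, ...] into {id: toid}."""
    edge_map = {}
    for edge in edges:
        if edge["id"] in edge_map:
            # A plain map cannot hold two downstream targets for one id.
            raise ValueError(f"duplicate edge id: {edge['id']}")
        edge_map[edge["id"]] = edge["toid"]
    return edge_map


edge_list = json.loads("""
[
  {"id": "cat-27", "toid": "nex-26"},
  {"id": "cat-52", "toid": "nex-34"},
  {"id": "cat-67", "toid": "nex-68"}
]
""")
edge_map = edge_list_to_map(edge_list)
```

The duplicate check is there because silently overwriting an entry would hide a non-dendritic connection rather than report it.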

refactor crosswalk local_id

Somewhat related to #15, and maybe a way to address some of the concerns raised there, should the crosswalk simply key by local_id instead of the local_id being embedded?

Currently the crosswalk is a list of objects like
[{"local_id":"cat-1","COMID":"9731278"},...]

After working with this in a couple places, I think it makes sense to factor out the common local_id and simply key by that. Then even if we end up in a 1:many scenario, or different crosswalk definitions for waterbodies than catchments, there is still a unique mapping. Propose changing crosswalk to something like

{
    "cat-1": {
            "COMID":"9731278",
            "site_no":"123456789",
            "other_ref":"point to something"
    },
    "cat-2": {
            "COMID":"00000000",
            "site_no":"987654321",
            "other_ref":"point to something"
    }
}
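
As an illustration of the proposed shape, here is a minimal Python sketch that re-keys the current list form by `local_id`. It assumes each `local_id` appears only once in the list; `key_crosswalk_by_local_id` is a hypothetical name.

```python
def key_crosswalk_by_local_id(records):
    """Turn [{"local_id": ..., ...}, ...] into {local_id: {...}}."""
    keyed = {}
    for record in records:
        record = dict(record)  # copy so the input records are not modified
        local_id = record.pop("local_id")
        if local_id in keyed:
            raise ValueError(f"duplicate local_id: {local_id}")
        keyed[local_id] = record  # everything except local_id becomes the value
    return keyed


records = [
    {"local_id": "cat-1", "COMID": "9731278"},
    {"local_id": "cat-2", "COMID": "00000000"},
]
keyed = key_crosswalk_by_local_id(records)
```

Lookups then become `keyed["cat-1"]["COMID"]` instead of a scan over the list.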

Add topology to geometry tables?

Currently, the geometry tables don't include "toID" information.

There's been a request to make those tables serially complete so we don't need a standalone edge list.

Solution is to add the dendritic connections in a toID property of the GeoJSON.
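
A rough Python sketch of what that join looks like from the consumer side, assuming the features carry an `ID` property and the connections are available as an id-to-toid map. `add_toid` and its parameters are hypothetical, and the geometries are left as null placeholders.

```python
import copy


def add_toid(feature_collection, edge_map, id_key="ID", toid_key="toID"):
    """Return a copy of the FeatureCollection with a toID property on each feature."""
    result = copy.deepcopy(feature_collection)
    for feature in result["features"]:
        props = feature["properties"]
        # Features with no downstream connection get None (null in JSON).
        props[toid_key] = edge_map.get(props[id_key])
    return result


catchments = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"ID": "cat-27"}, "geometry": None},
        {"type": "Feature", "properties": {"ID": "cat-52"}, "geometry": None},
    ],
}
edge_map = {"cat-27": "nex-26", "cat-52": "nex-34"}
with_topology = add_toid(catchments, edge_map)
```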

More crosswalk updates

This release updates the crosswalk file to contain structured linkages to catchments (i.e. cat-X: {...}).

https://github.com/dblodgett-usgs/hygeo/releases/tag/v0.5.3

It would be useful to have similar crosswalk references for flowpaths, i.e.

    `fp-1`:
    {
        "COMID": ["0000000", "00000001"],
        "outlet_comid": 0000000
    }

Since flowpaths are a realization of the catchment, I imagine it is also possible to connect the flowpath to the NHD segments it coincides with, much like the catchment, and this might also be useful.

This definitely repeats information since a flowpath is keyed to a realized_catchment, but having these explicitly in the crosswalk provides a convenient and semantically consistent way to reference catchment realizations to NHD features, especially in cases where a flowpath exists but an area realization doesn't.

Add cat-* id to crosswalk

Consider adding catchment id to the crosswalk file to explicitly link catchments to "reaches" via COMID.

Get a nexus crosswalk for a set of hydrologic locations.

Given an hygeo waterbody network and a set of related hydrologic locations, we need to be able to generate a crosswalk.

The use case here is where we have a set of hydrologic locations that are at the outlet of each NHDPlusHR flowline. We need to know where they land along the waterbody network of an hygeo object.

This should be implemented as a function that takes the waterbody data of an hygeo object and a set of locations. The locations should have a known main-id that is in the same identifier space as the main-id of the hygeo waterbodies.

The response should include the hygeo waterbody id that each input node should be associated to.

This function may also include NHDPlus data attributes to allow a tie back to reachcode/measure linear referencing. Potential to mock reachcode/measure attributes such that the nhdplusTools get_flowline_index() function can be used.

Rename waterbody_edge_list.json

Since implementing flowpath_data.geojson, waterbody_edge_list.json should semantically be called flowpath_edge_list.json.

Document how parameter sets will hang-off hygeo objects

Need a JSON structure that keys off catchment identifier.

The core schema of the objects should include a model formulation type.

Other details of the object are dictated by the formulation type.

How this relates to a catchment network is an open question. Most likely, it comes down to specifying inflow and outflow locations to the catchment. These are ostensibly contracted nodes.

There is a two-tier system here: backbone contracted nodes, and model I/O points that may or may not be coincident with contracted nodes. The I/O points would be represented as hydrologic locations along mainstems.

GeoJSON output Identity management

GeoJSON specifies a Feature as having an optional id. In the GeoJSON outputs, we have an ID that is a Property of the feature.

This makes using a GeoJSON parser a little strange when trying to assign the identity of the Feature to the object it is being deserialized into.

Does this ID need to be a property for some reason, or can we bump that up a level?
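
For reference, this is roughly what "bumping it up a level" would look like if done on the reader side. A minimal Python sketch, assuming the identifier lives in `properties.ID`; `promote_feature_id` is a hypothetical helper.

```python
import copy


def promote_feature_id(feature_collection, key="ID"):
    """Move properties[key] to the Feature-level "id" member."""
    result = copy.deepcopy(feature_collection)
    for feature in result["features"]:
        props = feature.get("properties") or {}
        if key in props:
            # GeoJSON allows an optional top-level "id" on each Feature.
            feature["id"] = props.pop(key)
    return result


catchments = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"ID": "cat-66", "area_sqkm": 2.205631},
            "geometry": None,
        }
    ],
}
promoted = promote_feature_id(catchments)
```

A parser that honors the Feature-level `id` can then assign identity directly, without reaching into the properties.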

Incorrect mapping of gage site_no when a COMID is split

image

Looking at this small subset of the coarse data, you can see an NWIS site that appears to be at the outlet of cat-67. In the crosswalk, however, you see

  "cat-67": {
    "COMID": "9731286.1",
    "outlet_COMID": 9731286
  }

This shows that the comid 9731286 was split, and indeed cat-68 shows the rest of that comid:

  "cat-68": {
    "COMID": "9731286.2",
    "site_no": "02146562",
    "outlet_COMID": 9731286
  }

I think this is an artifact of splitting at the gage to create the upstream catchment, but then the gage isn't logically mapped to that catchment.
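
To find other places where this might occur, one option is to list every COMID that was split across catchments along with the part that currently carries the gage. This Python sketch only flags candidates for review; it does not decide which catchment the gage belongs to. It assumes the crosswalk is keyed by catchment id as shown above, and `split_comid_groups` is a hypothetical name.

```python
from collections import defaultdict


def split_comid_groups(crosswalk):
    """Map each outlet_COMID shared by several catchments to those catchment ids."""
    groups = defaultdict(list)
    for local_id, entry in crosswalk.items():
        if "outlet_COMID" in entry:
            groups[entry["outlet_COMID"]].append(local_id)
    # Keep only COMIDs that appear under more than one catchment, i.e. were split.
    return {comid: ids for comid, ids in groups.items() if len(ids) > 1}


crosswalk = {
    "cat-67": {"COMID": "9731286.1", "outlet_COMID": 9731286},
    "cat-68": {"COMID": "9731286.2", "site_no": "02146562", "outlet_COMID": 9731286},
}
split = split_comid_groups(crosswalk)

# For each split COMID, record which parts carry a gage.
gaged = {
    comid: [cid for cid in ids if "site_no" in crosswalk[cid]]
    for comid, ids in split.items()
}
```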

Support independent graph of waterbodies and catchments.

This may not be strictly required for the Sugar Creek Basin. If not, will defer this for later, but should plan it architecturally.

Nexuses should support interfaces between catchment areas and waterbodies, just catchment areas, or just waterbodies. Noting that since all the junctions are modeled as nexuses, there are a lot of catchments that are only realized as flowpaths with a catchment-aggregate area.

Near term, 1:many catchment area to waterbody will be needed. Longer term, 1:many waterbody to catchment area may be needed.

If we keep get_nexus() based on flowlines (read: waterbodies) and allow catchment area to be 1:many with flowlines, the current implementation basically works. The main change is that get_catchment_edges() would take catchments that interact with a subset of the waterbody nexuses.

For 1:many waterbody to catchment area, there are nexuses between the catchments that don't break up the waterbody. In that case, the nexus hydrologic location is a landmark along the waterbody feature. Handling hydrologic locations along waterbodies is for future work and will not be handled in this task.

In implementation, this will rely on the input data representing all possible nexuses when calling get_nexus(), while get_catchment() and get_waterbodies() specify only the nexuses that break up the waterbody network.

The outcome of this task should be a basic implementation that allows 1:many catchment area to waterbody.

v0.5.0 release artifacts contain build path and html

The zip files in the release artifacts for v0.5.0 contain the build path in the zip, i.e. docs/build/default/*.json, as well as an index.html. These are intended to be flat zips containing just the data files, *.json and *.geojson.
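
To make the intended layout concrete, here is a Python sketch that writes a flat zip from a build directory. It is not how the release is actually built; `write_flat_zip` is a hypothetical helper and the file names in the demo are placeholders. It assumes file names are unique across subdirectories, since flattening would otherwise produce duplicate entries.

```python
import tempfile
import zipfile
from pathlib import Path


def write_flat_zip(source_dir, zip_path, suffixes=(".json", ".geojson")):
    """Zip only the data files, stored by bare file name with no directories."""
    source_dir = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file() and path.suffix in suffixes:
                # arcname drops the build path so the entry sits at the zip root.
                archive.write(path, arcname=path.name)


# Demo on a throwaway directory that mimics the nested build path.
with tempfile.TemporaryDirectory() as tmp:
    build = Path(tmp) / "docs" / "build" / "default"
    build.mkdir(parents=True)
    (build / "crosswalk.json").write_text("{}")
    (build / "catchment_data.geojson").write_text("{}")
    (build / "index.html").write_text("<html></html>")

    zip_path = Path(tmp) / "release.zip"
    write_flat_zip(build, zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        names = sorted(archive.namelist())
```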

GeoJSON output keys in lowercase

Is there a specific reason upper case keys are used in the properties, i.e. ID? Could these be made lowercase? It isn't a huge deal, but a consistent key case makes parsing easier.
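
In the meantime, a parser can fold the keys itself. A minimal Python sketch, assuming no two property keys differ only by case; `lowercase_property_keys` is a hypothetical helper.

```python
def lowercase_property_keys(feature):
    """Return a copy of a GeoJSON Feature with all property keys lowercased."""
    props = feature.get("properties") or {}
    lowered = {}
    for key, value in props.items():
        low = key.lower()
        if low in lowered:
            # e.g. "ID" and "id" both present: refuse rather than drop one.
            raise ValueError(f"keys collide when lowercased: {low}")
        lowered[low] = value
    return {**feature, "properties": lowered}


feature = {
    "type": "Feature",
    "properties": {"ID": "cat-66", "area_sqkm": 2.205631, "toID": "nex-45"},
    "geometry": None,
}
lowered_feature = lowercase_property_keys(feature)
```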