dblodgett-usgs / hygeo
Home Page: https://dblodgett-usgs.github.io/hygeo/dev/
License: Creative Commons Zero v1.0 Universal
One at NHDPlusV2, one at reference resolution, one heavily aggregated.
An entity with multiple COMIDs looks like:
"cat-X": {
"COMID": ["000000", "11111111", "2222222"]
}
whereas one with only a single COMID is:
"cat-Y": {
"COMID": "3333333"
}
In this latter case, the COMID should be a singleton list to keep a consistent schema, i.e.
"cat-Y": {
"COMID": ["3333333"]
}
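A minimal sketch of that normalization on the consumer side, assuming the crosswalk has already been loaded into a dict keyed by catchment id. The helper name normalize_comids is hypothetical and not part of hygeo:

```python
def normalize_comids(crosswalk):
    """Return a copy of the crosswalk in which every COMID value is a list."""
    normalized = {}
    for local_id, refs in crosswalk.items():
        refs = dict(refs)  # shallow copy so the input is not mutated
        comid = refs.get("COMID")
        if comid is not None and not isinstance(comid, list):
            refs["COMID"] = [comid]  # wrap a bare COMID in a singleton list
        normalized[local_id] = refs
    return normalized


crosswalk = {
    "cat-X": {"COMID": ["000000", "11111111", "2222222"]},
    "cat-Y": {"COMID": "3333333"},
}
normalized = normalize_comids(crosswalk)
```

With this in place a reader can always iterate over COMID without checking its type first.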
This is all done in the vignettes right now. Should move that code into standardized I/O functions.
The crosswalk.json existed in v0.3.0 but is no longer in the release artifact.
Currently https://github.com/dblodgett-usgs/hygeo/blob/master/R/functions.R#L20 gets the toNode.
The least invasive would be to create a new function that can create nodes based on an edge list.
create_nexuses(fline, nexus_prefix = "...") would return nexuses the same as get_nexuses(), such that it could just be called by get_nexuses().
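To make the idea concrete, here is a rough Python sketch of deriving nexuses from an edge list. It is not the hygeo implementation: the (id, toid) tuple shape, the rule "one nexus per distinct downstream id", and the "nex-" prefix are all assumptions for illustration.

```python
def create_nexuses(edges, nexus_prefix="nex-"):
    """Derive one nexus id per distinct downstream id in an edge list.

    edges is assumed to be an iterable of (id, toid) pairs.
    """
    # dict.fromkeys de-duplicates while keeping first-seen order
    nexus_ids = dict.fromkeys(f"{nexus_prefix}{toid}" for _, toid in edges)
    return list(nexus_ids)


# Hypothetical flowline edges: 1 and 2 both drain to 3, and 3 drains to 0.
edges = [(1, 3), (2, 3), (3, 0)]
nexuses = create_nexuses(edges)
```

Two flowlines sharing a downstream id end up sharing a single nexus, which is the behavior a confluence needs.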
Currently, the documentation of hygeo describes waterbodies in some detail but doesn't really get into the flowpath catchment realization.
I'm now realizing that this is a major oversight that needs to be corrected.
Initially, we have a 1:1 correspondence between flowpaths and waterbodies -- we could even drop waterbodies from the mix altogether if hydrologic routing attributes can be attached to flowpaths.
As we go forward, we will want to use flowpaths as a way to handle hydrologic locations along waterbodies providing the tie points between the linear catchment-realization and hydrodynamic model representations of waterbodies.
This kind of functionality will be required for encapsulation of models that implement some set of pre-existing hydrofabric catchments. In this case, where a model implements a collection of hydrofabric catchments, we can refer to the model as a catchmentNetwork.
In this case, we may have I/O locations either for model output such as streamflow predictions or for model input for interbasin transfers or data assimilation.
To account for this, any model could include a list of pre-existing nexuses that it can provide output to or expects to receive input from.
As a near-term test case, we could run an NHDPlusV2 discretization of the Sugar Creek domain as a single model. The obvious candidate for this would be WRF-Hydro-NWM. Another option would be to use the T-Shirt model at the NHDPlusV2 discretization, reporting flow to the refactored discretization.
I open this issue to provoke discussion as much as anything. Not sure we actually want to tackle this now.
In catchment_data.geojson from the Sugar Creek release files, cat-66 contains an invalid geometry when read by Shapely:
   ID      area_sqkm  toID    geometry                                            valid
6  cat-66  2.205631   nex-45  MULTIPOLYGON (((-80.76158 35.21440, -80.76163 ...  False
This prevents the geometry from being used for intersection:
shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed. Likely cause is invalidity of the geometry <shapely.geometry.multipolygon.MultiPolygon object at 0x11774f550>
The GeoJSON toIDs are NA and the edge list has some odd IDs in it.
The *_edge_list.json files generated are lists of edge mappings:
[
{
"id": "cat-27",
"toid": "nex-26"
},
{
"id": "cat-52",
"toid": "nex-34"
},
{
"id": "cat-67",
"toid": "nex-68"
}
]
This is really just a map, and would likely be better suited for look ups if structured as one:
{
"cat-27": "nex-26",
"cat-52": "nex-34",
"cat-67": "nex-68"
}
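Until the file format changes, a reader can build the map itself. A small sketch, assuming the network is dendritic so each id has exactly one toid. The helper name edge_list_to_map is made up for this example:

```python
import json

edge_list_json = """
[
  {"id": "cat-27", "toid": "nex-26"},
  {"id": "cat-52", "toid": "nex-34"},
  {"id": "cat-67", "toid": "nex-68"}
]
"""


def edge_list_to_map(edges):
    """Collapse a list of {"id", "toid"} records into an id -> toid dict."""
    mapping = {}
    for edge in edges:
        if edge["id"] in mapping:
            # A repeated id would mean the network is not dendritic.
            raise ValueError(f"duplicate id in edge list: {edge['id']}")
        mapping[edge["id"]] = edge["toid"]
    return mapping


edge_map = edge_list_to_map(json.loads(edge_list_json))
```

Lookups then become edge_map["cat-27"] instead of a scan over the list.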
Somewhat related to #15, and maybe a way to address some of the concerns raised there, should the crosswalk simply key by local_id instead of the local_id being embedded?
Currently the crosswalk is a list of objects like
[{"local_id":"cat-1","COMID":"9731278"},...]
After working with this in a couple places, I think it makes sense to factor out the common local_id and simply key by that. Then even if we end up in a 1:many scenario, or different crosswalk definitions for waterbodies than catchments, there is still a unique mapping. Propose changing crosswalk to something like
{
"cat-1": {
"COMID":"9731278",
"site_no":"123456789",
"other_ref":"point to something"
},
"cat-2": {
"COMID":"00000000",
"site_no":"987654321",
"other_ref":"point to something"
}
}
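For reference, the re-keying is a few lines on the reading side. This is only a sketch of the transformation, with a hypothetical helper name, and it assumes that records sharing a local_id should be merged, with later values winning:

```python
def key_by_local_id(records):
    """Turn [{"local_id": ..., ...}, ...] into {local_id: {...}, ...}."""
    keyed = {}
    for record in records:
        refs = dict(record)  # copy so the input record keeps its local_id
        local_id = refs.pop("local_id")
        # Merge into any existing entry; later records overwrite earlier keys.
        keyed.setdefault(local_id, {}).update(refs)
    return keyed


records = [
    {"local_id": "cat-1", "COMID": "9731278"},
    {"local_id": "cat-1", "site_no": "123456789"},
    {"local_id": "cat-2", "COMID": "00000000"},
]
crosswalk = key_by_local_id(records)
```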
Currently, the geometry tables don't include "toID" information.
There's been a request to make those tables serially complete so we don't need a standalone edge list.
Solution is to add the dendritic connections in a toID property of the GeoJSON.
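A sketch of what that looks like from the consumer side, assuming each feature carries its identifier in properties["ID"] and that an id -> toid map is available. The function name add_toid and the sample features are illustrative only:

```python
def add_toid(feature_collection, edge_map):
    """Write the downstream id into each feature's toID property, in place."""
    for feature in feature_collection["features"]:
        props = feature["properties"]
        # Features without a downstream connection get None (JSON null).
        props["toID"] = edge_map.get(props["ID"])
    return feature_collection


catchments = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"ID": "cat-27"}},
        {"type": "Feature", "geometry": None, "properties": {"ID": "cat-99"}},
    ],
}
add_toid(catchments, {"cat-27": "nex-26"})
```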
This release updates the crosswalk file to contain structured linkages to catchments (i.e. cat-X: {...}).
https://github.com/dblodgett-usgs/hygeo/releases/tag/v0.5.3
It would be useful to have similar crosswalk references for flowpaths, i.e.
"fp-1": {
"COMID": ["0000000", "00000001"],
"outlet_comid": 0000000
}
Since flowpaths are a realization of the catchment, I imagine it is also possible to connect the flowpath to the NHD segments it coincides with, much like the catchment, and this might also be useful.
This definitely repeats information since a flowpath is keyed to a realized_catchment, but having these explicitly in the crosswalk provides a convenient and semantically consistent way to reference catchment realizations to NHD features, especially in cases where a flowpath exists but an area realization doesn't.
The big issue is the catchment is currently a container for the catchmentBoundary geometry.
This issue can be closed when there is a simple write up of the hygeo list class and it is reasonably in line with ngen.
Consider adding catchment id to the crosswalk file to explicitly link catchments to "reaches" via COMID.
Given a hygeo waterbody network and a set of related hydrologic locations, we need to be able to generate a crosswalk.
The use case here is where we have a set of hydrologic locations that are at the outlet of each NHDPlusHR flowline. We need to know where they land along the waterbody network of a hygeo object.
This should be implemented as a function that takes the waterbody data of a hygeo object and a set of locations. The locations should have a known main-id that is in the same identifier space as the main-id of the hygeo waterbodies.
The response should include the hygeo waterbody id that each input node should be associated to.
This function may also include NHDPlus data attributes to allow a tie back to reachcode/measure linear referencing. Potential to mock reachcode/measure attributes such that the nhdplusTools get_flowline_index() function can be used.
Currently, NWIS sites are just used to avoid certain catchments. Should be more selective about how gages relate to the network and split catchments with a gage out in the middle.
Will be based on a small but not trivial watershed.
nhd <- nhdplusTools::plot_nhdplus("02146800",
gpkg = src_gpkg,
overwrite = FALSE,
nhdplus_data = src_gpkg,
actually_plot = FALSE)
Since implementing flowpath_data.geojson, waterbody_edge_list.json should semantically be called flowpath_edge_list.json.
Need a JSON structure that keys off catchment identifier.
The core schema of the objects should include a model formulation type.
Other details of the object are dictated by the formulation type.
How this relates to a catchment network is an open question. Most likely, it comes down to specifying inflow and outflow locations to the catchment. These are ostensibly contracted nodes.
There is a two-tier system here -- backbone contracted nodes and model i/o points that may or may not be coincident with contracted nodes would be represented as hydrologic locations along mainstems.
GeoJSON specifies a Feature as having an optional id. In the GeoJSON outputs, we have an ID that is a Property of the feature.
This makes using a GeoJSON parser a little awkward when trying to assign the identity of the Feature to the object it is deserialized into.
Does this ID need to be a property for some reason, or can we bump that up a level?
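For illustration, bumping the identifier up a level is a small transformation on a parsed Feature. This sketch assumes the identifier lives in properties["ID"]. The helper name promote_id is hypothetical:

```python
def promote_id(feature, key="ID"):
    """Return a copy of a GeoJSON Feature with properties[key] moved to "id"."""
    promoted = dict(feature)
    props = dict(feature.get("properties") or {})
    if key in props:
        # Feature-level "id" is the optional identifier member of a Feature.
        promoted["id"] = props.pop(key)
    promoted["properties"] = props
    return promoted


feature = {
    "type": "Feature",
    "geometry": None,
    "properties": {"ID": "cat-66", "area_sqkm": 2.205631, "toID": "nex-45"},
}
promoted = promote_id(feature)
```

A generic GeoJSON parser can then pick up the identity from the Feature itself without knowing about hygeo's property names.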
This will largely be outside the vignette but setting up a convention to propagate important hydrologic locations of type outlet through processing as nexuses may require some special handling in functions or identifier matching. Result should be a lookup table of local nexus IDs and provided hydrologic location ids.
Looking at this small subset of the coarse data you can see an NWIS site that appears to be at the outlet of cat-67. In the crosswalk, however, you see
"cat-67": {
"COMID": "9731286.1",
"outlet_COMID": 9731286
}
This shows that the COMID 9731286 was split, and indeed cat-68 shows the rest of that COMID:
"cat-68": {
"COMID": "9731286.2",
"site_no": "02146562",
"outlet_COMID": 9731286
}
I think this is an artifact of splitting at the gage to create the upstream catchment, but then the gage isn't logically mapped to that catchment.
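A quick way to find every split COMID and see which piece carries the gage, sketched in Python. It assumes a ".N" suffix on a COMID marks one part of a split, as in the two entries above, and it makes no claim about which part is upstream. The function name is made up:

```python
from collections import defaultdict


def find_split_comids(crosswalk):
    """Group split COMIDs: base COMID -> [(part, local_id, site_no), ...]."""
    parts = defaultdict(list)
    for local_id, refs in crosswalk.items():
        comids = refs["COMID"]
        if not isinstance(comids, list):
            comids = [comids]  # tolerate both bare and list-valued COMIDs
        for comid in comids:
            base, sep, part = str(comid).partition(".")
            if sep:  # only COMIDs with a ".N" suffix count as split
                parts[base].append((int(part), local_id, refs.get("site_no")))
    return {base: sorted(items, key=lambda item: item[0])
            for base, items in parts.items()}


crosswalk = {
    "cat-67": {"COMID": "9731286.1", "outlet_COMID": 9731286},
    "cat-68": {"COMID": "9731286.2", "site_no": "02146562",
               "outlet_COMID": 9731286},
}
splits = find_split_comids(crosswalk)
```

Here the site number shows up on part 2 (cat-68) rather than on cat-67, which is the mismatch described above.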
This may not be strictly required for the Sugar Creek Basin. If not, will defer this for later, but should plan it architecturally.
Nexuses should support interfaces between catchment areas and waterbodies, just catchment areas, or just waterbodies. Noting that since all the junctions are modeled as nexuses, there are a lot of catchments that are only realized as flowpaths with a catchment-aggregate area.
Near term, 1:many catchment area to waterbody will be needed. Longer term, 1:many waterbody to catchment area may be needed.
If we keep get_nexus() based on flowlines (read: waterbodies) and allow catchment area to be 1:many with flowlines, the current implementation basically works. The main change is that get_catchment_edges() would take catchments that interact with a subset of the waterbody nexuses.
For 1:many waterbody to catchment area, there are nexuses between the catchments that don't break up the waterbody. In that case, the nexus hydrologic location is a landmark along the waterbody feature. Handling hydrologic locations along waterbodies is for future work and will not be handled in this task.
In implementation, this will all rely on input data representing all the possible nexuses when calling get_nexus(), with get_catchment() and get_waterbodies() only specifying the nexuses that break up the waterbody network.
The outcome of this task should be a basic implementation that allows 1:many catchment area to waterbody.
site_no got dropped out of the crosswalk at some point. Need to add it back.
Should include COMID, NWIS Site, any other identifiers provided to function.
The zip files in the release artifacts for v0.5.0 contain the build path in the zip, i.e. docs/build/default/*.json, as well as an index.html. These are intended to be flat zips containing just the data files, *.json and *.geojson.
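For comparison, this is roughly how a flat archive can be produced with Python's standard zipfile module. It is a sketch of the intended layout, not the project's actual release tooling, and the directory and file names in the demo are placeholders:

```python
import tempfile
import zipfile
from pathlib import Path


def write_flat_zip(source_dir, zip_path, patterns=("*.json", "*.geojson")):
    """Zip only the data files, stored by bare file name (no build path)."""
    source_dir = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for pattern in patterns:
            for path in sorted(source_dir.glob(pattern)):
                # arcname drops the directory part so the archive is flat
                archive.write(path, arcname=path.name)


# Demo in a throwaway directory that mimics the nested build path.
with tempfile.TemporaryDirectory() as tmp:
    build_dir = Path(tmp) / "docs" / "build" / "default"
    build_dir.mkdir(parents=True)
    (build_dir / "crosswalk.json").write_text("{}")
    (build_dir / "catchment_data.geojson").write_text("{}")
    (build_dir / "index.html").write_text("<html></html>")
    zip_path = Path(tmp) / "release.zip"
    write_flat_zip(build_dir, zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
```

The index.html is skipped because it matches neither pattern, and no entry carries a directory prefix.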
Is there a specific reason upper case keys are used in the properties, i.e. ID? Could these be made lowercase? It isn't a huge deal, but a consistent key case makes parsing easier.
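In the meantime a reader can normalize key case on load. A minimal sketch with a hypothetical helper that refuses to silently merge keys differing only by case:

```python
def lowercase_keys(properties):
    """Return a copy of a properties dict with all keys lowercased."""
    lowered = {}
    for key, value in properties.items():
        new_key = key.lower()
        if new_key in lowered:
            # e.g. both "ID" and "id" present: ambiguous, so fail loudly
            raise ValueError(f"keys collide after lowercasing: {new_key}")
        lowered[new_key] = value
    return lowered


properties = {"ID": "cat-66", "area_sqkm": 2.205631, "toID": "nex-45"}
lowered = lowercase_keys(properties)
```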