Hydrologic Simulation Program Python (HSPsquared)
License: GNU Affero General Public License v3.0
Wondering if there is code in this suite for writing WDM files? I still use wdimex, written in Fortran, but if there is a modern implementation in this package I may adopt it.
Hi,
I have a UCI file that runs at a 15-min timestep. A couple of issues have come up running the import checks, but I'm still unable to reproduce HSPF results.
Issues:
The external sources evap record is 15-min with a 0.75 multiplier in HSPF. The MFACTOR comes into HSP2 OK, but needs to be divided by 96 (= 4 * 24). After revising this MFACTOR, the IMPLND and portions of the PERLND are OK.
The PERLND in HSP2 are not reproducing HSPF results. I've traced the source to the groundwater storage bin (AGWS/AGWO). Everything above that is nearly identical. The AGWI is nearly identical for HSP2 vs HSPF.
However, the AGWO, AGWS, and GWVS all vary between HSP2 and HSPF. I checked all the PERLND parameters and states and they are the same. Evap is off on the groundwater bin (AGWET=BASET=0).
Trying to figure out what to try next. KGW appears to be 0 (AGWRC=0.997, DELT60=0.25 for 15min), and perhaps HSP2 is treating this case differently than HSPF? Is something else different about the groundwater routine? Perhaps the 15-min time step may be an issue here as it was with item 1.
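For what it's worth, the two conversions can be sanity-checked in a few lines (my own sketch based on the numbers above; I am assuming HSPF's convention that AGWRC is a daily recession constant, so the per-interval constant is AGWRC**(DELT60/24)):

```python
# 1) A daily MFACTOR applied to a 15-minute record must be divided by 96,
#    the number of 15-minute intervals per day (4 * 24).
intervals_per_day = 4 * 24            # = 96
mfactor_15min = 0.75 / intervals_per_day

# 2) AGWRC is a daily recession constant; the per-interval recession rate
#    (HSPF's KGW) is 1 - AGWRC**(DELT60/24), which is tiny but nonzero
#    for a 15-minute step, so it should not come out exactly 0.
AGWRC, DELT60 = 0.997, 0.25           # DELT60 = hours per interval
kgw = 1.0 - AGWRC ** (DELT60 / 24.0)
```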
Any ideas are appreciated.
Thanks.
As we were working on the testing system (#31), @rheaphy, @steveskrip, @ptomasula and I discussed on a call the challenges of git-tracking Jupyter notebooks, because output cells are updated every time they are run even if the Python code or markdown doesn't change.
Let's work out a good way to save and track Jupyter notebooks.
Here's @ptomasula's HSP2 Potential Solutions for Python Notebooks in GitHub email with background and options:
I did a bit of digging into a potential solution for dealing with merging python notebooks in git. I found three potential solutions, which are outlined below, but I’m personally leaning towards option 2. Option 2 is extremely easy to implement, solves the immediate merging challenges Bob was describing, and while there is a slight potential for issues around resolving merge conflicts between binary files, those can be largely avoided by coordinating our efforts (which we’ve done very well thus far). I’d be curious to hear thoughts from the rest of the group. I’d also note that whichever option we choose isn’t necessarily set in stone. If we find the approach isn’t working for us we can change it down the road. We can also deviate from these options if anyone has a great suggestion for a solution.
Background/Problem
Python notebooks are stored as JSON, which provides for source tracking. However, when a notebook cell is run, certain cell attributes (output, execution count, etc.) are updated. This causes a number of impacts, including:
- Added difficulty managing merge conflicts (a manual line-by-line process)
- Larger and somewhat unruly commits
- More difficult to review. Important changes to code and less critical changes resulting from running a cell (trivial for source-tracking purposes) both appear in a diff.
Option 1 - Strip output block out prior to commit
Python notebooks are stored as JSON. This makes it fairly easy to read and programmatically strip out the pieces of the document that are causing the issues described above. We could either write or use an existing script to accomplish this.
- Pros
- Fairly easy to implement. There already appear to be a number of tools that do this (https://pypi.org/project/nbstripout/ or https://github.com/toobaz/ipynb_output_filter). It would also not be a big lift to develop something to do this if we need to.
- Still allows us to source track code changes in the cells contents (‘source’ attribute).
- Cons
- We lose the ability to share the output directly in the notebook, which may be of some value to users.
- Extra step when committing code. Need to run a conversion tool prior to commit. (We might be able to get around this using a gitattributes filter, but I haven’t any experience using that.)
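As a sketch of what such a script could look like (based on the notebook JSON structure; this is illustrative, not one of the existing tools):

```python
import json

def strip_outputs(nb_path, out_path):
    """Clear output and execution_count from every code cell in a notebook."""
    with open(nb_path) as f:
        nb = json.load(f)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []          # drop rendered outputs
            cell["execution_count"] = None  # drop run counters
    with open(out_path, "w") as f:
        json.dump(nb, f, indent=1)
```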
Option 2 - Enable notebooks to be handled as binary
Utilize git attributes to flag all or select notebook files as binary. This would overwrite the entire file upon commit, and reduce conflict resolution to choosing which version to use (instead of manually resolving lines).
- Pros
- Easy to implement
- Allows for the output from the notebooks to be shared
- Cons
- Slight potential to overwrite code changes because of how merge conflicts are handled between binary files (i.e. must use either my file or their file)
- Less explicit tracking of code changes, but could still tease them out by comparing versions
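For reference, option 2 could be enabled with a single .gitattributes entry (a sketch; the exact file patterns are up to us):

```text
*.ipynb binary
```

The built-in `binary` attribute macro turns off text diffing and line-level merging for matching files.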
Option 3 – Use a merge management tool
Use a tool to ease the merge process. The most promising one I've come across is nbdime (https://github.com/jupyter/nbdime), which was developed by the Jupyter team.
- Pros
- Potential for a best-of-both-worlds approach: easier to manage conflicts while still retaining change tracking on a line-by-line basis
- Cons
- Appears to be console-only (at least nbdime does; there may be other tools out there)
- Doesn't solve unruly commits when looking at them on GitHub
- Still requires some level of manual conflict resolution (it’s just drastically reduced)
Hello !
I've just joined this website and found your organization.
I am a Ph.D. candidate working on hydrologic & water quality modeling under climate change scenarios.
For my studies I want to perform some uncertainty analysis on various variables.
However, from some conference material I found that your organization developed some modules to link HSPF input and output files to the DAKOTA program...
Do you support those programs or modules?
Sincerely yours,
Jiheon Lee.
Presently the HSP2 repo contains a wide variety of tutorials and demos as Jupyter Notebooks, which were developed at different times for different versions of HSP2, and which have overlapping content that does not clearly build from one tutorial to the next.
Our objective is to consolidate and update all Jupyter notebooks, to build a clear progression of hands-on tutorials that demonstrate increasing complexity of running and using HSP2.
This is not regarding an issue in code but a question while trying to understand code.
IVOL accounts for the sum of inflows to a RCHRES from pervious (PERLND) and impervious (IMPLND) surfaces. I suppose this is calculated from PERO and SURO as specified in the MASS-LINK block of the uci. I am trying to understand how this code calculates IVOL. In hrchhyd.py, the IVOL array is set to zeros when there is no inflow to the RCHRES. But when there is inflow from either PERLND or IMPLND, which part of the HSP2 code calculates IVOL?
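To make my question concrete, here is how I currently understand the MASS-LINK arithmetic (a rough sketch; the flows, areas, and MFACTOR below are made-up illustrations, not HSP2 code):

```python
# Hypothetical per-interval outflows from the contributing operations
pero = [0.1, 0.0, 0.2]    # PERLND total outflow, inches
suro = [0.05, 0.0, 0.1]   # IMPLND surface outflow, inches
area_perlnd = 100.0       # acres (AFACTR from the SCHEMATIC block)
area_implnd = 20.0        # acres
MFACTOR = 0.0833333       # unit conversion from the MASS-LINK block (in -> ft)

# IVOL as the area-weighted, unit-converted sum of the inflows
ivol = [MFACTOR * (area_perlnd * p + area_implnd * s)
        for p, s in zip(pero, suro)]
```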
I would be thankful for a hint.
@rheaphy emailed with his 2020-05-24 "HSP2 status update":
Most of the last 2 weeks was spent investigating the much slower run times of HSP2 compared with HSPF. Prior to the "new" HSP2, the old HSP2 was 1.6 times slower than HSPF. I had expected this difference to be much less with the new design. Instead it started out almost 4 times slower! Since Python, Pandas, Numpy and Numba had all changed significantly, it is very hard to pin down where the slowdown occurred. With yesterday's update, I had cut this to a bit above 2 times slower (depending on whether the HDF5 file was already created). Using faster write methods in HDF5 seemed to really speed things up, but caused warning messages. I never found any problem in either the HDF5 file or the run results when the warnings were visible. Since warning messages would bother our users, I rejected using the faster write methods to improve the run time. (I still keep the option of the faster write methods, with the warning messages disabled, as a last resort.) I believe the only difference between the fast and slow writing methods is whether they flush() after every write.
Basically, I started using BLOSC compression on all timeseries and computed results when storing into the HDF5 file. This cut the HDF5 file size almost in half as well. Since the newer HDF5 library keeps the HDF5 file from growing significantly with additional runs, this is great. (The old HDF5 library would really let the HDF5 file grow to huge sizes!) And no warnings. I did not compress the UCI like tables so that other tools like HDFView would display properly. While I could compress everything and still use HDFVIew if I register the compression library to the HDF library, I don't want to make our users do this themselves. So this is a compromise for now.
I suspect that the changes to Numba are primarily responsible since I now need to copy information from the regular Python dictionary containing UCI information to the Numba typed dictionary in order to transfer the data into the hydrology codes. I spent time reviewing the Numba team meeting notes and issues and found a related issue concerning the new Numba typed lists. The contributors to the discussion indicated this could impact the typed dictionary as well. The Numba team is investigating the issue, so I will wait for more information before I address this improvement direction. I will do other profiling tests to look for other possible places for the slow execution.
I still think I can make HSP2 nearly as fast as HSPF, but it will take more time. At least, it is still fast enough to use - again. I remember the early days when calleg took over 40 minutes to run instead of a little over a minute. (HSPF takes 32.2 seconds on my machine; worst-case HSP2 runs now take 1 minute 23 seconds for a clean start after restarting the Python kernel and creating a new HDF5 file, or 1 minute 19 seconds if the kernel had previously run HSP2.) Without Numba, HSP2 takes 13 minutes 25 seconds - so Numba does help a lot!
I see a lot of profiling in my future.
Some of these recent commits are cca2b0c, d154e55, and e92c035.
This high-level issue pulls together several past, current, and near-future efforts (and more granular issues).
The tight coupling of model input/output (I/O) with the Hierarchical Data Format v5 (HDF5) during the HSP2 runtime limits both performance (see #36) and interoperability with other data storage formats, such as the cloud-optimized Parquet and Zarr storage formats (see Pangeo's Data in the Cloud article), that are tightly coupled with high-performance data structures from the foundational PyData libraries Pandas, Dask DataFrames, and Xarray.
Abstracting I/O using a class-based approach would also unlock capabilities for within-timestep coupling of HSP2 with other models. Specifically, HSP2 could provide upstream, time-varying boundary conditions for higher-resolution models of reaches, reservoirs, and the groundwater-surface water interface.
Our overall plan was first outlined and discussed in LimnoTech#27 (Refactor I/O to rely on DataFrames & provide storage options). In brief, we would refactor to:
During our workshop we noticed that instances of the HDF5 class can leave connections open to the HDF5 file that persist even when the del method is called on the instance. This behavior was only observed in an IPython environment (e.g. jupyter lab). With some additional testing, I determined that this occurs when the instance of that object is called as the last line in a cell. This typically triggers some display feature in IPython, but for this particular class it also appears to cause the instance to somehow become referenced by the cell. The result is that the open _store attribute keeps the connection to the HDF5 file open even after the instance has been deleted.
When converting timeseries data from WDM files to HDF5 files, the constituent attribute for the observed data is not converted. For example, the constituent for TIMESERIES/TS122 should be 'ATMP' based on the attribute in WDM DSN 122.
If a file is not found, show the path that was used in addition to the filename.
Current code from main.py (lines 11-13):
if not os.path.exists(hdfname):
print (hdfname + ' HDF5 File Not Found, QUITTING')
return
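A possible revision (just a sketch; the surrounding main() structure is assumed) that reports the resolved absolute path:

```python
import os

def check_hdf(hdfname):
    """Return True if the HDF5 file exists; otherwise report the full path."""
    if not os.path.exists(hdfname):
        # os.path.abspath shows exactly where the file was looked for
        print(os.path.abspath(hdfname) + ' HDF5 File Not Found, QUITTING')
        return False
    return True
```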
With PR #65 & #75, @timcera introduced some nice Command Line Interface (CLI) features to HSP2. To do so, he used the Mando library.
Unfortunately, mando hasn't been updated since 2017 or tested on Python 3.9, and we'll be wanting to migrate to Python 3.9 in the next round of work. We'll likely want to select and implement an alternative Python CLI library.
Here are a few posts that I found on the topic:
I would like to submit pull requests for simple stuff like fixing deprecation warnings in the tutorials. To do that I need to keep my fork synced with the main repo, and to do THAT I need to be able to read from the repo (see https://help.github.com/articles/syncing-a-fork/). Looks like clone/read rights are locked down at the moment.
Are you guys accepting updates/changes/fixes right now?
The file 'logfile.txt' does not have line separators. I suggest adding '\n' to separate records.
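A minimal sketch of the suggested fix (the record contents are illustrative):

```python
records = ['run started', 'run finished']   # illustrative log records

with open('logfile.txt', 'a') as log:
    for record in records:
        log.write(record + '\n')            # '\n' keeps one record per line
```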
As we discussed for our RESPEC-LimnoTech Collaborative Work Plan during our workshop (March 24-25, 2020), expanding and automating the testing of HSP2 vs. HSPF is an immediate priority.
Our objective for testing is to ensure that HSP2 provides the same results as HSPF for:
We decided that:
RESPEC has two test models to contribute:
LimnoTech will add additional models:
Let's use this issue to track progress on all the smaller tasks required to complete this.
We have already added some reference models and testing code with 49c71f3, 60378a7, and LimnoTech@130bef2.
cc: @rheaphy, @PaulDudaRESPEC, @steveskrip, @ptomasula,
The GENER module is used extensively to compute (within the model) constituent loads based on input flow and concentration, with one or both being a time series.
These capabilities are used extensively in several of our HSP2 use cases.
@ptomasula is getting started on this.
When running Test10 (slightly modified to export output data to an HBN file), I can successfully create an .h5 file using readUCI and readWDM. Then, when running main, I get the following error message when processing the PSTEMP module:
--------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-4-ff67a3b4c34f> in <module>
----> 1 main(hdfname, saveall=True)
c:\Users\sskripnik\Documents\GitHub\HSPsquared\HSP2\main.py in main(hdfname, saveall)
71
72 ############ calls activity function like snow() ##############
---> 73 errors, errmessages = function(store, siminfo, ui, ts)
74 ###############################################################
75
c:\Users\sskripnik\Documents\GitHub\HSPsquared\HSP2\PSTEMP.py in pstemp(store, siminfo, uci, ts)
30
31 ui = make_numba_dict(uci)
---> 32 TSOPFG = ui['TSOPFG']
33 AIRTFG = int(ui['AIRTFG'])
34
~\Anaconda3\envs\hsp2_py37\lib\site-packages\numba\typed\typeddict.py in __getitem__(self, key)
146 raise KeyError(key)
147 else:
--> 148 return _getitem(self, key)
149
150 def __setitem__(self, key, value):
~\Anaconda3\envs\hsp2_py37\lib\site-packages\numba\dictobject.py in impl()
736 ix, val = _dict_lookup(d, castedkey, hash(castedkey))
737 if ix == DKIX.EMPTY:
--> 738 raise KeyError()
739 elif ix < DKIX.EMPTY:
740 raise AssertionError("internal dict error during lookup")
KeyError:
Files and jupyter notebook are located here: https://github.com/LimnoTech/HSPsquared/tree/develop-WaterQuality/tests/test10b/HSP2results
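One way to make the module tolerant of a missing flag would be to fall back to a default before indexing. A sketch with a plain dict (the real code uses a Numba typed dict, which supports the same membership test; I am assuming 0 is the appropriate default for TSOPFG):

```python
ui = {'AIRTFG': 1}   # 'TSOPFG' absent, as in the traceback above

# Fall back to a default instead of raising KeyError
TSOPFG = ui['TSOPFG'] if 'TSOPFG' in ui else 0
AIRTFG = int(ui['AIRTFG'])
```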
The Test10 uci and WDM files are in the DataSources folder. They should be in the TutorialData folder.
Once I move these files to the right places, I get the following warnings and errors.
C:\Dev\HSPsquared\HSP2tools\uciReader.py:447: FutureWarning: convert_objects is deprecated. To re-infer data dtypes for object columns, use DataFrame.infer_objects()
For all other conversions use the data-type specific converters pd.to_datetime, pd.to_timedelta and pd.to_numeric.
df = df.convert_objects(convert_numeric=True)
C:\Users\Anurag.Mishra\AppData\Local\Continuum\Anaconda3\envs\Python2\lib\site-packages\IPython\core\interactiveshell.py:2717: FutureWarning: get_store is deprecated and be removed in a future version
HDFStore(path, **kwargs) is the replacement
interactivity=interactivity, compiler=compiler, result=result)
uciReader is Done
Processing WDM file TutorialData/TEST.WDM
AttributeError Traceback (most recent call last)
<ipython-input-4-75fb1fef8ffe> in <module>()
1 HSP2tools.makeH5()
2 HSP2tools.readUCI(uciname, unpackedhdfname)
----> 3 HSP2tools.ReadWDM(wdmname, unpackedhdfname)
4 get_ipython().system(u'ptrepack {unpackedhdfname} TutorialData\\tutorial.h5')
C:\Dev\HSPsquared\HSP2tools\wdmReader.pyc in ReadWDM(wdmfile, hdffile, **options)
43 m = re.search(pat, row.SVOLNO)
44 key = int(m.group(2))
---> 45 if not WDM.exists_dsn(wdmname, key):
46 continue
47
AttributeError: 'WDM' object has no attribute 'exists_dsn'
I am attempting to import a UCI from an HSPF model and run it using HSP2 as described in the Preview notebook. The UCI and WDM files import without error. However, when I try to run the model, I get the following error:
KeyError: 'No object named /RESULTS/RCHRES_R500/ROFLOW in the file'
Looking back at the UCI, the only place where ROFLOW is referenced is in the Mass-Link Block:
RCHRES ROFLOW ROVOL 1 RCHRES INFLOW IVOL 1
My understanding is that this line instructs HSPF how to handle the passing of time series from one RCHRES to another, not to specify any output. Since they are the same module, they use the same units, and no conversions are necessary. The actual linkages are set up in the SCHEMATIC Block.
I wanted to document my testing of the new pip-based install (an alternative to Anaconda) developed by @timcera. The install seems fine, but testing is now a challenge. I would love any pointers, specifically on managing my expectations about what I should be seeing.
git clone -b develop https://github.com/timcera/HSPsquared.git
My system already had the apt packages for hdf5-tools, and there were existing executables named h5cc on the system prior to unpacking and installing this package; however, this did not seem sufficient to run test10, so I added these:
wget …
tar -xvf hdf5-1.12.1.tar.gz
cd hdf5-1.12.1
make all
sudo make install
sudo apt install hdf5
sudo apt-get install pip
pip install pandas
pip install tables==3.6.1
pip install numba
pip install .
This produced the hsp2 executable (see Testing below). I copied the data from /opt/model/HSPsquared/HSP2notebooks/Data, though the run did not appear to run to completion.
mkdir test; cd test
cp /HSPsquared/HSP2notebooks/Data/* ./
hsp2 import_uci test10.uci test10.h5
hsp2 run test10.h5
In ViewRCHRES the statement
with pd.get_store(hdfname, mode='r') as store:
results in the following message:
sys:1: FutureWarning: get_store is deprecated and will be removed in a future version
HDFStore(path, **kwargs) is the replacement
Also, remove mode='r' from the argument list, as the HDF5 file may already be open.
What is the Working environment for HSPsquared?
Is it only workable in Python 2.7 condition?
Actually, I have installed Python 3.7.3, and when applying the HSPsquared code as the YouTube material presented, it continuously shows errors.
Could you help me out what the problem is?
Attached is a pic of screenshot :)
Thanks.
HSP2 is presently functionally identical to HSPF, with identical names for modules, sections, routines, and variables, and with identical water simulation algorithms. This provides both credibility and also access to the excellent HSPF documentation.
A goal for HSP2 is coupling with other models, which would be facilitated by adding detailed documentation strings to the code base, including adding long name aliases for the current <8 character Fortran names. In addition, HSP2 can be run from a variety of new interfaces not previously available to HSPF users.
The long term goal is to develop a complete suite of HSP2 manuals and tutorials through automated document generation using Sphinx or a similar document generation library and through tutorials and example code implemented in interactive Jupyter Notebooks.
This task is a first step toward those documentation goals, by copying long names and descriptions from the HSPF manual into HSP2 doc strings.
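As an illustration of the kind of doc string this task would produce (long names taken from the HSPF manual; the function signature is hypothetical):

```python
def pwater(ui, ts):
    """Simulate the PERLND water budget (HSPF section PWATER).

    Selected parameters (HSPF name -- long-name alias, units):
        LZSN   -- lower zone nominal storage, inches
        UZSN   -- upper zone nominal storage, inches
        INFILT -- infiltration capacity index, inches/hour
    """
```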
The Preview Jupyter notebook throws an error on Cell 15:
df = HSP2.readPLTGEN(pltname)
df.head()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-15-1f9059be498a> in <module>()
----> 1 df = HSP2.readPLTGEN(pltname)
2 df.head()
AttributeError: 'module' object has no attribute 'readPLTGEN'
Hello !
Currently, I am trying to apply your code with my dataset.
So for now, I am trying to convert a wdm file to the HDF5 format.
After applying my own dataset with your ReadWDM module (as a test I used the BASINS4 sample sediment dataset, which at 56 timeseries is larger than your test01 dataset with 19 timeseries), I got the error shown in the pictures below.
Point-1
Point-2
If you know why it is coming up, please let me know.
Thank you for developing this package and reading this question.
Jiheon Lee.
Maybe this is only relevant to the branch that was created by @timcera to allow Linux command-line execution, but it also appears related to #53.
Summary: WDM files described in the FILES block and EXT SOURCES/TARGETS with a numerical suffix are not added to the hdf5 file. Example:
FILES
<FILE> <UN#>***<----FILE NAME------------------------------------------------->
WDM1 21 ../../../input/scenario/climate/met/nldas1121/met_A51800.wdm
WDM2 22 ../../../input/scenario/climate/prad/p20211221/prad_A51800.wdm
WDM4 24 forA51800.wdm
MESSU 25 forA51800.ech
26 forA51800.out
END FILES
My sense is that this may simply be a convention, not an actual part of the definition that HSPF expects, but nevertheless it appears to be somewhat common (I have seen it in the CBP files, as well as TMDL uci's from 2 different modeling groups).
The error arises in the code in HSP2_CLI.py, which skips these elements because the string test expects an exact match on "WDM", not "WDMn":
if nline[:10].strip() == "WDM":
This ends up skipping the WDM in question; then, later when running the model, it fails because it cannot find the DSN (error: ).
By isolating only the first 3 characters of the stripped string, the test matches both forms. Changing the conditional to the below allows things to proceed (and my CBP UCI executes and appears to produce reasonable output; more on that later):
if (nline[:10].strip())[:3] == "WDM":
Pull request submitted in #81
Presently, HSP2 can not handle irregular time series as inputs.
Although irregular time series inputs are not common for HSPF, @bcous has found a historical set of WDM files where the input time series started at 1 hour intervals and then switched to 15 minute intervals. HSPF resamples all inputs to the model time step immediately prior to a run, so everything works fine in HSPF.
With the recent successful Rewrite readWDM.py to read by data group & block #21, we can properly read these and all other tested WDM files. However, as @ptomasula commented (LimnoTech#21 (comment)), running HSP2 on those inputs will throw an error if tsfreq == None:
~/Documents/Python/limno.HSPsquared/HSP2/utilities.py in transform(ts, name, how, siminfo)
78 pass
79 elif tsfreq == None: # Sparse time base, frequency not defined
---> 80 ts = ts.reindex(siminfo['tbase']).ffill().bfill()
81 elif how == 'SAME':
82 ts = ts.resample(freq).ffill() # tsfreq >= freq assumed, or bad user choice
KeyError: 'tbase'
We have several options to fix this:
- add code to HSP2.utilities.py to handle it
We'll do option 1 in the short term (probably "manually"), but option 4 is probably the best long-term fix.
This issue will track progress on option 4.
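The core of option 4 is re-indexing the irregular series onto the regular simulation time base with a forward fill, as in the reindex(...).ffill() call in the traceback above. A stdlib-only sketch of the idea (dates and values are hypothetical):

```python
from datetime import datetime, timedelta

# Hypothetical irregular input: hourly values switching to 15-minute
irregular = [
    (datetime(2001, 1, 1, 0, 0), 1.0),
    (datetime(2001, 1, 1, 1, 0), 2.0),
    (datetime(2001, 1, 1, 1, 15), 3.0),
]

# Regular 15-minute time base covering the simulation span
start = datetime(2001, 1, 1)
tbase = [start + i * timedelta(minutes=15) for i in range(6)]

# Forward-fill: each regular step takes the most recent irregular value
filled = {}
idx, last = 0, None
for t in tbase:
    while idx < len(irregular) and irregular[idx][0] <= t:
        last = irregular[idx][1]
        idx += 1
    filled[t] = last
```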
Thanks for accepting the PR!
The next idea I have is big enough to warrant discussion before I implement something. What I would like to do is remove all of the dependent libraries, '/hspfbintoolbox*', '/mando*', '/tstoolbox*', and '/wdmtoolbox*' and list them in the 'install_requires' variable in the 'setup.py'.
This isn't a have-to; I just think it is a bit cleaner approach. You do have to trust that I don't break something, but you can still pin to an earlier tested version if needed.
Kindest regards,
Tim
Hello !~
Several days ago, I raised a question related to converting a WDM file to an HDF5 file.
And PaulDuda helped me out to tackle the issue.
But I have one more question related to that issue, especially about the readWDM code.
In the readWDM file, you are using the number 512 at code line 37. (I have attached the related pic with a red line marker below.)
It seems that you are using 512 when identifying the timeseries starting index. For instance, the first timeseries starts at index 512, the second at 512*n, and so on. But here is the question:
*** Where does the number 512 come from?
It seems that the number comes from the WDM DB/table specification, which defines the specifications (ID, DSN, etc.) for WDM DB variables. If that is the case, could you send me one of these or let me know its source?
Sorry for bothering you again. Even though I have resolved the issue this time, I would like to understand exactly what the code lines say for further application.
I appreciate your help.
Thank you for reading the comments again !~
Best regards,
Jiheon Lee
During the RESPEC-LimnoTech workshop to kickoff HSP2 collaboration (March 24-25, 2020), we decided to use a Feature Branch workflow similar to:
Our related decisions can be summarized as:
- The master branch will be considered the official public-facing release branch
- The develop branch is the key development branch for all new features
We set up some of this workflow with #29 and associated branch creation.
We also decided to create GitHub Issues in RESPEC's repo for tracking shared objectives and tasks, to maintain an archive of progress & solutions. LimnoTech will use their issue tracker only for smaller, granular tasks that don't add value to cumulative documentation.
As far as I know, the Fortran version of HSPF has a limited number of operations.
Therefore, there was a limitation on the application of LULC or reaches.
According to your program specification, there are no more fixed limits on operations:
"Restructure for maintainability, to remove fixed limits (e.g., operations, land use,
parameters), and to maintain or improve execution time."
Then we don't need to construct the model separately
due to that functional limitation when applying HSP2.
Right?
Best regards.
Jiheon Lee.
I am running a for loop on the HSP2.run method; the problem is, it shows warning messages:
2018-04-03 18:44:14.34 Message count 2 Message HYDR: extrapolation of rchtab will take place
2018-04-03 18:44:14.34 Message count 1 Message HYDR: Solve did not converge
But when I rerun the same file, these messages vanish. Worse, sometimes the warning messages vanish only on the third run, not the second. I am using the following code to run the HDF5 file:
import numpy as np
import pandas as pd
from math import isinf, isnan
import HSP2

def ROR8D(hdfname):  # get calculated RO from the HDF file
    bf_p1 = pd.read_hdf(hdfname, 'RESULTS/RCHRES_R009/HYDR')['RO']
    outflow = bf_p1  # *0.028316847 to convert cfs to cms
    OutflowR8D = outflow.resample('D').mean()
    #print('now total outflow is {}'.format(sum(OutflowR8D)))
    if isinf(sum(OutflowR8D)):  # check if any calculated RO value is infinity
        print(sum(OutflowR8D[1:]))
        print('infinity encountered')
        #sys.exit()
    OutflowR8D = OutflowR8D.replace([np.inf, -np.inf], 0.0)  # replace infinity values with 0.0
    print(sum(OutflowR8D))
    return OutflowR8D

def ObjectiveFunction(xx):  # xx is a list of input parameters; hdfname is assumed global
    # changing parameters
    df2 = pd.read_hdf(hdfname, '/PERLND/PWATER/PARAMETERS')
    print('sum of all input parameters is {0:10.7f}'.format(sum(xx)))
    df2.LZSN = xx[0]
    df2.INFILT = xx[1]
    df2.KVARY = xx[2]
    df2.AGWRC = xx[3]
    df2.DEEPFR = xx[4]
    df2.BASETP = xx[5]
    df2.AGWETP = xx[6]
    df2.CEPSC = xx[7]
    df2.UZSN = xx[8]
    df2.INTFW = xx[9]
    df2.IRC = xx[10]
    df2.LZETP = xx[11]
    df2.NSUR = xx[12]
    df2.to_hdf(hdfname, '/PERLND/PWATER/PARAMETERS')
    HSP2.run(hdfname, saveall=True)
    OutflowR8D = ROR8D(hdfname)
    TotalFlow = sum(OutflowR8D)
    #print('\n total flow is {} '.format(TotalFlow))
    while isnan(TotalFlow):  # rerun if the first run gave errors
        print(' ...........Rerunning............. ')
        HSP2.run(hdfname, saveall=True)
        OutflowR8D = ROR8D(hdfname)
        TotalFlow = sum(OutflowR8D)
@sjordan29's testing in PR #73 revealed that requesting a timeseries that does not exist from the IOManager class results in a KeyError, causing the application to stop executing. I think the behavior should be to return an empty pandas.DataFrame and possibly raise a warning, but not stop execution.
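One possible shape for that behavior (a sketch; the read_ts name and signature are assumptions, not the actual IOManager API):

```python
import warnings
import pandas as pd

def safe_read_ts(io_manager, *args, **kwargs):
    """Return the requested timeseries, or an empty DataFrame if absent."""
    try:
        # read_ts is a hypothetical method name standing in for the real API
        return io_manager.read_ts(*args, **kwargs)
    except KeyError:
        warnings.warn('requested timeseries not found; returning empty DataFrame')
        return pd.DataFrame()
```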
Fully porting and testing these detailed WQ sections for the surface water module (RCHRES) would complete the port of surface water quality capabilities from HSPF.
These highly integrated modules for dissolved oxygen, nutrients, plankton, and the carbonate buffering system are challenging because of many interdependencies within a timestep.
To facilitate the passing of attributes and values among these modules, @PaulDudaRESPEC, @tredder75, @ptomasula, and I propose to migrate these code sections from functional-programming coding structures inherited from HSPF to object-oriented class structures used in modern programming, to enable inheritance of attributes and methods by objects and the passing of attributes among objects.
@tredder75 has begun this process with LimnoTech#38 and LimnoTech#39.
@tredder75 is also numba-fying these classes as he develops them, adding Numba just-in-time compiling to low level functions that operate many, many times during an HSP2 model run. This also helps toward our overall performance goals (#36).
We hope that the new class-based coding approach and specific WQ classes that we are implementing here will effectively serve as templates for converting the other modules to class-based code.
@jlkittle, this will get us moving toward the ability to call attributes and methods using the Python "dot" syntax!
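A minimal sketch of the general pattern (module and variable names are hypothetical, not the actual HSP2 classes): a Numba-jitted low-level kernel called from a plain Python class, with a fallback so the sketch also runs without Numba installed:

```python
try:
    from numba import njit
except ImportError:
    # No-op fallback decorator so the sketch runs without Numba
    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func

@njit
def benthic_flux(conc, rate):
    # hypothetical low-level kernel, called many times per model run
    return conc * rate

class OxygenModule:
    """Sketch of a class-based WQ module holding state and parameters."""
    def __init__(self, rate):
        self.rate = rate

    def advance(self, conc):
        # per-timestep update delegates to the jitted kernel
        return benthic_flux(conc, self.rate)
```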
I am installing hsp2 on a Linux machine (CentOS 7) using the conda installation procedure with Python 3.8. I created a custom conda environment and ran the conda-develop command to add the path to the directory with the HSPsquared download. However, when I try to run hsp2 within the custom environment, it's not finding the command. I checked the site-packages folder, and the conda.pth file was created with the path. Any ideas on how to resolve this?
I know everybody and their uncle wants me to use conda, but on Linux conda will overlay libraries that it installs in place of system libraries. That isn't the way I want things handled. I use package management for the system libraries (not conda) and pip for Python, and they don't overlap.
Pip and conda can coexist, because all that pip needs is a setup.py. I had contributed a setup.py to HSPsquared some time ago and it was deleted. Would a resurrected setup.py be accepted as a pull request?
Another advantage to having a setup.py is that it is also used to upload packages to the Python Package Index (https://pypi.org/) where HSPsquared could be available to anyone in the world by using "pip install HSPsquared".
Kind regards,
Tim
I am using the hperwat function to simulate processes in a simple PERLND and calculate a water balance using the following lines of code:
import pandas as pd
import numpy as np
np.random.seed(seed=77)
from perwat import pwater
tindex = pd.date_range('20110101', '20111210', freq='H')
general = {
'sim_len': len(tindex),
'tindex': tindex,
'sim_delt':60 # number of minutes in time step
}
ui = {
    'CSNOFG': 0,
    'RTOPFG': 1,
    'UZFG': 0,      # the algorithm used with `0` causes a lot of surface runoff (thus higher pero) at the expense of uzet
    'VCSFG': 0,
    'VLEFG': 0,
    'IFFCFG': 0,
    'IFRDFG': 0,
    'FOREST': 0.0,
    'LZSN': 0.25,
    'INFILT': 1.0,  # index to the infiltration capacity of the soil
    'LSUR': 1.0,
    'SLSUR': 1.5,
    'KVARY': 1.0,   # affects groundwater recession flow, enabling it to be non-exponential in its decay with time
    'AGWRC': 0.4,   # basic groundwater recession rate if KVARY is zero and there is no inflow to groundwater; defined as the rate of flow today divided by the rate of flow yesterday
    'PETMAX': 4.0,
    'PETMIN': 1.7,
    'INFEXP': 2.0,
    'INFILD': 2.0,
    'DEEPFR': 0.1,  # fraction of groundwater inflow which enters deep (inactive) groundwater and is thus lost from the system, as defined in HSPF
    'BASETP': 0.7,  # fraction of remaining potential E-T which can be satisfied from baseflow (groundwater outflow), if enough is available
    'AGWETP': 0.5,
    'FZG': 0.0394,
    'FZGL': 0.1,
    'LTINFw': 1.0,
    'LTINFb': 0.0,
    'CEPS': 0.0,    # initial interception storage
    'CEPSC': 0.0,   # interception storage capacity; only used if VCSFG = 0
    'SURS': 0.0,    # initial surface (overland flow) storage
    'UZS': 0.0,
    'UZSN': 0.025,
    'IFWS': 0.0,    # initial interflow storage
    'LZS': 0.025,   # initial lower zone storage
    'AGWS': 0.0,
    'GWVS': 0.0,
    'NSUR': 0.1,
    'INTFW': 0.0,
    'IRC': 0.25,
    'LZETP': 0.0,
    'VIFWFG': 0,
    'VIRCFG': 0.1,
    'VNNFG': 0,
    'VUZFG': 0,
    'HWTFG': False
}
prec = np.random.random(len(tindex))
ts = {
    'PREC': prec,
    'PETINP': np.add(prec, 0.5)  # input potential E-T
}
errorV, errorM = pwater(general, ui, ts)
ifwi = np.sum(ts['IFWI'])
deepfr = np.sum(ts['DEEPFR'])
pet = np.sum(ts['PET'])
# Input
_in = np.sum(ts['SUPY'])
# evapotranspiration
cepe = np.sum(ts['CEPE'])
uzet = np.sum(ts['UZET'])
lzet = np.sum(ts['LZET'])
agwet = np.sum(ts['AGWET'])
baset = np.sum(ts['BASET'])
_taet = cepe + uzet + lzet + agwet + baset
taet = np.sum(ts['TAET'])
d_et = taet-_taet
if d_et < -1e-4 or d_et > 1e-4:
    print('Problem in ET balance')
print('{:<10} {:<10} {:<10} {:<10} {:<10} {:<10}'.format('cepe', 'uzet', 'lzet', 'agwet', 'baset', 'taet'))
print('{:<10.3f} {:<10.3f} {:<10.3f} {:<10.3f} {:<10.3f} {:<10.3f}'.format(cepe, uzet, lzet, agwet, baset, taet))
# Outflow
suro = np.sum(ts['SURO'])
ifwo = np.sum(ts['IFWO'])
agwo = np.sum(ts['AGWO'])
igwi = np.sum(ts['IGWI'])
pero = np.sum(ts['PERO'])
_pero = ifwo + agwo + igwi + suro
d_pero = pero-_pero
print('')
if d_pero < -1e-4 or d_pero > 1e-4:
    print('Problem in balance of outflow from PERLND')
print('{:<10.6} {:<10.6} {:<10.6} {:<10.6} {:<10.6}'.format('ifwo', 'agwo', 'igwi', 'suro', 'pero'))
print('{:<10.3f} {:<10.3f} {:<10.3f} {:<10.3f} {:<10.3f}'.format(ifwo, agwo, igwi, suro, pero))
# Total storage
ceps = np.sum(ts['CEPS']) # Interception storage (for each
surs = np.sum(ts['SURS']) # Surface (overland flow) storage
uzs = np.sum(ts['UZS'])
ifws = np.sum(ts['IFWS'])
lzs = np.sum(ts['LZS'])
agws = np.sum(ts['AGWS']) # Active groundwater storage
tgws = np.sum(ts['TGWS']) # Total groundwater storage
pers = np.sum(ts['PERS']) # Total water stored in the PLS
_pers = ceps + surs + uzs + ifws + lzs + agws + tgws
d_pers = pers-_pers
print('')
if d_pers < -5 or d_pers > 5:
    print('Problem in storage balance')
print('{:<10.6} {:<10.6} {:<10.6} {:<10.6} {:<10.6} {:<10.6} {:<10.6} {:<10.6}'
      .format('ceps', 'surs', 'uzs', 'ifws', 'lzs', 'agws', 'tgws', 'pers'))
print('{:<10.3f} {:<10.3f} {:<10.3f} {:<10.3f} {:<10.3f} {:<10.3f} {:<10.3f} {:<10.3f}'
      .format(ceps, surs, uzs, ifws, lzs, agws, tgws, pers))
print('\nTOTAL WATER BALANCE')
print('{:<18} {:<18} {:<18}'.format('Preciptation', 'Evapotranspiration', 'Total Outflow'))
print('{:<18.3f} {:<18.3f} {:<18.3f} '.format(_in, taet, pero))
I am getting the following output from this code:
cepe uzet lzet agwet baset taet
0.000 465.502 0.000 1018.299 0.000 1483.801
Problem in balance of outflow from PERLND
ifwo agwo igwi suro pero
0.000 0.000 53.595 3073.925 3073.925
ceps surs uzs ifws lzs agws tgws pers
0.000 0.000 0.044 0.000 7885.728 0.000 0.000 7885.772
TOTAL WATER BALANCE
Preciptation Evapotranspiration Total Outflow
4076.324 1483.801 3073.925
which is not correct. Can you tell me where I am making a mistake, or is this an error in the code?
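One general mass-balance point worth checking (not specific to HSP2's internals): storages like CEPS, LZS, and AGWS are state variables, so summing them over every time step inflates the storage term. A water balance normally closes precipitation against evapotranspiration, outflow, and the *change* in storage (final minus initial), along these lines (made-up numbers, assumed variable names):

```python
import numpy as np

# Illustrative sketch: inflow should close against ET, outflow, and the
# change in storage over the run, not the sum of storage over all steps.
prec_total = 10.0        # e.g. sum of SUPY over the run
et_total = 3.0           # e.g. sum of TAET
outflow_total = 5.0      # e.g. sum of PERO (plus IGWI if not included)
storage = np.array([1.0, 1.5, 2.5, 3.0])  # a state series such as LZS
delta_storage = storage[-1] - storage[0]  # change over the run
residual = prec_total - (et_total + outflow_total + delta_storage)
print(residual)  # 0.0 for a closed balance
```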
I get the following error when I run the cell.
AttributeError: 'module' object has no attribute 'enable'
I am trying to explore the possibility of running hsp2 from the console in Linux -- has anyone done this? I am thinking of something similar to how the old HSPF version would be run, where one supplies a UCI name and execution flows from there.
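A console entry point could be as small as the sketch below. This is a hypothetical wrapper, not an existing HSP2 command; the HDF5-path argument mirrors how HSPF takes a UCI name:

```python
# Hypothetical console wrapper (not an existing HSP2 entry point): parse
# an HDF5 path and a save flag, then hand off to the main() runner.
import argparse

def build_parser():
    p = argparse.ArgumentParser(description='Run an HSP2 simulation')
    p.add_argument('h5file', help='HDF5 file produced by readUCI/readWDM')
    p.add_argument('--saveall', action='store_true',
                   help='save all computed timeseries')
    return p

args = build_parser().parse_args(['test10.h5', '--saveall'])
print(args.h5file, args.saveall)  # test10.h5 True
```

The parsed arguments would then be passed to something like `main(args.h5file, saveall=args.saveall)`.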
This has been reported by somebody previously on stackoverflow.
https://stackoverflow.com/questions/43959585/pytables-ptrepack-cmd-not-accepting-absolute-path
Thanks
~A
@PaulDudaRESPEC @TongZhai @aufdenkampe @bcous
In trying to get Brendan's more complex WQ model to run, we identified an issue with how the UCI and WDM readers handle multiple files that have timeseries with overlapping DSN values. This model contains 4 separate WDM files, some of which have conflicting DSNs between their timeseries. Presently the WDM reader overwrites any timeseries sharing a DSN with the most recently read timeseries.
HSPF appears to get around this issue by using the UCI file and its FILES specification. Later, in the EXT SOURCES specification, those file names (i.e. WDM1, WDM2) are used to distinguish between timeseries with the same DSN in different files. The UCI reader and WDM reader need to be expanded to capture and support this file naming.
It actually looks like this might have originally been supported. Notably, the parseD function in ReadUCI reads the file name and returns it as part of the dictionary under the SVOL key. However, it looks like line 416 then overrides that filename specification with an asterisk. I think removing that override should restore support in ReadUCI. Also, the main.get_timeseries function already looks to have logic to read timeseries when the SVOL parameter is populated.
I propose the following as a path to restore support for multiple WDMs:
1. Remove the override of the SVOL key in ReadUCI.
2. Store each timeseries under an SVOL/TS### key, to be consistent with the get_timeseries function.
This approach will still allow users to read WDM files independently of the UCI file, so we would retain the ability to read and view WDM files without running the model. However, for model execution we'd now need to read UCI files first and then the WDM file(s). I'm curious if you all have any thoughts or alternative suggestions on how to best address these conflicting DSN values. If we are comfortable with the proposed solution, I can start working on it in a feature branch.
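As a sketch of the proposed keying (assumed in-memory structures, not the current HSP2 reader API), indexing timeseries by both the SVOL file name and the DSN keeps same-numbered series from colliding:

```python
# Hypothetical store: keying by (SVOL, DSN) instead of DSN alone prevents
# a later WDM file from silently overwriting an earlier one.
timeseries = {}

def store_ts(svol, dsn, data):
    timeseries[(svol, dsn)] = data

store_ts('WDM1', 39, [1.0, 2.0])
store_ts('WDM2', 39, [9.0, 8.0])  # same DSN in a second file, no overwrite
print(len(timeseries))  # 2 - both series retained
```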
For a particular test case, in the benthic algae component of PLANK, the computed benthic algae density increases with each time step until it crosses the threshold set by input parameter MBAL in table BENAL-PARM, as expected. Crossing this threshold occurs one time step later in HSP2 than it did in HSPF. After closer examination, it is apparent the differences in benthic algae density are within the expected numeric precision, but the small difference is enough to affect the occurrence of overcrowding by one time step.
This issue focuses on a possible shortcoming of the testing code: comparing each value at each time step can give misleading indications of non-matching results.
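A tolerance-aware comparison along these lines (a generic numpy sketch, not the project's actual test harness) would treat precision-level differences such as the one above as matches:

```python
import numpy as np

# Values that differ only at numeric-precision level fail an exact
# per-step comparison but pass under a small relative tolerance.
hspf = np.array([0.9999990, 2.0000010, 3.0])
hsp2 = np.array([1.0000000, 2.0000000, 3.0])
exact_match = bool(np.array_equal(hspf, hsp2))
close_match = bool(np.allclose(hspf, hsp2, rtol=1e-5))
print(exact_match, close_match)  # False True
```

Event-like outputs (threshold crossings such as the MBAL overcrowding switch) may additionally need a comparison that tolerates a one-step shift rather than a value tolerance.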
This spring @steveskrip noticed that many UCI files successfully used by LimnoTech with HSPF (and created by LimnoTech's WinModel package) would not import with readUCI.
@rheaphy also noted that there might be time issues in UCI files, because HSPF doesn't correctly manage time, while HSP2 uses ISO time standards that track leap seconds and time zones.
Let's use this issue thread to track @rheaphy's work to improve readUCI, and our results from testing it.
I am trying to run through Intro_toHSP2.ipynb. I am able to link all directories/paths, and I am able to process the HSPF inputs successfully:
readUCI(input_uci_path, output_hdf5_path) ...runs successfully
readWDM(input_wdm_path, output_hdf5_path) ...runs successfully
However, when I try to run the HSP2 simulation for Test10:
main(output_hdf5_path, saveall=True)
I get: AttributeError: 'PosixPath' object has no attribute 'read_uci'
Any idea how to resolve this error?
I am testing with a UCI that lacks both SCHEMATIC & NETWORK (it is from a version of the Chesapeake Bay model land simulation, which simply routes output via an EXT TARGETS block to be later run in a separate river-only UCI). This fails on UCI import when calling the pandas function concat, since both net and sc remain None. I can get the UCI import to complete by testing for if not ((net is None) and (sc is None)): before writing the linkage output to the hdf, but I'm wondering if anyone sees a problem with this before I go forward with forking and adding a pull request to the new code.
Basically, I swap out the original (near line 160 in readUCI.py):
linkage = concat((net, sc), ignore_index=True, sort=True)
for cname in colnames:
    if cname not in linkage.columns:
        linkage[cname] = ''
linkage = linkage.sort_values(by=['TVOLNO']).replace('na','')
linkage.to_hdf(store, '/CONTROL/LINKS', data_columns=True)
For this:
if not ((net is None) and (sc is None)):
    linkage = concat((net, sc), ignore_index=True, sort=True)
    for cname in colnames:
        if cname not in linkage.columns:
            linkage[cname] = ''
    linkage = linkage.sort_values(by=['TVOLNO']).replace('na','')
    linkage.to_hdf(store, '/CONTROL/LINKS', data_columns=True)
And so, in sum, when both blocks are missing, no /CONTROL/LINKS table gets written to the HDF5 file.
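The failure mode is easy to reproduce in isolation (a minimal pandas sketch): concat drops None inputs, and raises once nothing is left to combine, which is exactly what happens when both net and sc are None:

```python
import pandas as pd

# pandas' concat treats None entries as missing; with both inputs None
# there is nothing left to concatenate and it raises.
try:
    pd.concat((None, None), ignore_index=True, sort=True)
    failed = False
except (ValueError, TypeError):
    failed = True
print(failed)  # True
```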
In GQUAL, the flags PHFLAG and PHYTFG can be used to indicate that input data to the module should come from timeseries computed in RQUAL modules PHYTO and PHCARB. In HSPF, the values of these state variables from the previous time step are available and used. However, because of differences in the time looping in HSP2, those computed time series are not yet available when needed.
A possible solution to this issue would be to change the sequence of simulated sections, so that RQUAL sections are computed before GQUAL.
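The reordering could be as simple as the sketch below. The section names and flat execution list are assumptions for illustration, not HSP2's actual activity loop:

```python
# Hypothetical: move RQUAL ahead of GQUAL in the per-timestep execution
# order so GQUAL sees PHYTO/PHCARB state from the same pass.
def reorder(sections):
    s = list(sections)
    if 'RQUAL' in s and 'GQUAL' in s and s.index('RQUAL') > s.index('GQUAL'):
        s.remove('RQUAL')
        s.insert(s.index('GQUAL'), 'RQUAL')
    return s

print(reorder(['HYDR', 'GQUAL', 'RQUAL']))  # ['HYDR', 'RQUAL', 'GQUAL']
```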
There is a possible bug in the 'demand' function in hrchhyd.py. Try running the 'demand' function without jit with the following input:
vol = 0.0
rowFT = np.array([0., 0.01, 0., 0., ])
funct = (1,)
nexits = 1
delts = 3600.0
convf = 1.0
colind = [4.2]
outdgt = [2.1]
ODGTF = (0,)
and it will throw the error
File "E:/debug/test.py", line 51, in demand2
od[i] = _od1 + diff * (_od1 - rowFT[icol]) * convf #$2356
IndexError: index 4 is out of bounds for axis 0 with size 4
The problem with numba is that when we try to access a non-existent element of an array, it just returns a junk value like 1e-313 instead of raising an error.
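A defensive bounds check (a generic sketch, not a patch to hrchhyd.py) would surface the bad index as a clear error even where compiled code would otherwise read junk:

```python
import numpy as np

def checked_lookup(rowFT, icol):
    # Raise explicitly rather than relying on indexing checks, which
    # numba-compiled code may skip, silently returning garbage.
    if icol < 0 or icol >= rowFT.size:
        raise IndexError(f'icol={icol} out of bounds for size {rowFT.size}')
    return rowFT[icol]

rowFT = np.array([0., 0.01, 0., 0.])
print(checked_lookup(rowFT, 1))  # 0.01
```

(Numba can also enforce this globally via its bounds-checking option, at some runtime cost.)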
In the Colby sand load method down in SEDTRN, the log10 function returns NaN in this code, when using the Anaconda python 3.9.7. I've verified that it returns a good value in python 3.7, and it even returns a good number when I comment out the ‘njit’ declaration. It’s just the numba compiled version that returns the NaN. To make it even more perplexing, when I add the print statement below (which I did as I was trying to debug), it works just fine in all versions, even with numba.