dbekaert / raider

Raytracing Atmospheric Delay Estimation for RADAR

License: Apache License 2.0

Python 93.66% C++ 5.22% Cython 0.85% Dockerfile 0.26% Shell 0.01%

raider's People

Contributors

asjohnston-asf, askaholic, bbuzz31, cmarshak, dbekaert, dependabot[bot], ejfielding, forrestfwilliams, garlic-os, hfattahi, jhkennedy, jlmaurer, kam3545, leiyangleon, mgovorcin, mlicari1, piyushrpt, royagrace, scottstanie, sssangha

raider's Issues

Formatting for GNSS option within RAiDER

This issue ticket concerns the best way of storing and formatting the RAiDER output when leveraging the station list option over a period of time with a specific time sampling. (The NISAR working group will be leveraging GNSS for validation of the troposphere zenith delays.)

How it fits together with other issue tickets:

  • RAiDER is to be called using GPS location information (lon, lat, hgt) in a CSV file (TBD). This information is to be provided by the RAiDER stats class. See #44.
  • RAiDER is to be called for the zenith delay calculation (see #37); example of the call below:
raiderDelay.py --date 20200103 --time 23:00:00 --station_file stations.csv --model ERA5 --zref 15000 -v

Example for NISAR:

Inputs to RAiDER:

  • Global GPS station list is provided
  • Time-period e.g. 2009-2019
  • delays to be calculated every 12 days
  • at a fixed UTC time: 00 UTC

Outcome: a time-series for each GPS location with the delay estimated at 00 UTC every 12 days over a 10-year period.

Formatting Considerations

We have not settled on the formatting for this time-series, but a few items for consideration are:

  1. Ease of parallelization
  2. Ease of appending in time
  3. File size
  4. Capturing of metadata and handling of no-data
  5. CF compliance (ease of loading in other programs)? (see the sketch below)
  6. Compatibility with the RAiDER Stats class (see interaction under #44)
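As a concrete illustration of items 4 and 5, one hypothetical layout would store the per-station time-series as a CF-style netCDF file via xarray; the variable name ZTD and the (time, station) dimension layout below are assumptions for the sketch, not a decided format:

import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical layout: one file with dimensions (time, station) and CF-style metadata.
times = pd.date_range('2009-01-01', '2018-12-27', freq='12D')
stations = ['P123', 'P456']                          # station IDs from the input CSV
ztd = np.full((len(times), len(stations)), np.nan)   # zenith total delay in meters

ds = xr.Dataset(
    {'ZTD': (('time', 'station'), ztd, {'units': 'm', 'long_name': 'zenith total delay'})},
    coords={'time': times, 'station': stations},
)
ds['ZTD'].encoding['_FillValue'] = np.nan            # explicit no-data handling
ds.to_netcdf('gnss_ztd_timeseries.nc')               # appending in time: write per-epoch files and concatenate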

Code coverage through pytest and coveralls

Per discussion with @Askaholic:

  • It would be good to include a coverage report as part of pytest.
    This allows us to see how much of the code the unit tests cover and would highlight gaps (an example command follows below).

  • Add coveralls as an application, which would then check the coverage of new code added in each PR. It would take the coverage report from pytest and visualize it in the PR, similar to CircleCI. https://coveralls.io/
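For reference, a coverage report can be generated locally with the pytest-cov plugin (assuming it is added to the environment), e.g.:

pytest --cov=RAiDER --cov-report=term-missing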

working zenith, model nodes, fixed height level example

Scenario:

Example: Running ERA5 at weather model nodes at fixed height over BBoX

Code used:

raiderDelay.py --date 20190415 --time 1200 --heightlvs 0 --model ERA5 --modelBBOX 36 -121 34 -119

Errors to be fixed:

  1. Re-run with hgt versus DEM enabled
    Not an error: the code runs through and uses the DEM if it pre-exists, and does not use the fixed height level.
    Solution: removing the "geom" folder resolved the issue.
    See also #27. It might be good to track what is in the geom folder and compare against a new call to decide whether it can be reused or not.

  2. Chunks do not have the same rank as the dataset shape (a fix sketch follows the traceback below)
    Error: ValueError: "chunks" must have same rank as dataset shape

HDF5 must be used with height levels
Weather model already exists, please remove it ("weather_files/ERA5_2019-04-15 12:00:00.h5") if you want to create a new one.
Traceback (most recent call last):
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/bin/raiderDelay.py", line 13, in <module>
    parseCMD()
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/runProgram.py", line 146, in parseCMD
    (_,_) = tropo_delay(los, lats, lons, heights, flag, weather_model, wmLoc, zref,
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delay.py", line 182, in tropo_delay
    writePnts2HDF5(lats, lons, hgts, los, pnts_file, in_shape)
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/utilFcns.py", line 766, in writePnts2HDF5
    start_positions = f.create_dataset('Rays_SP', (len(x),3), chunks = los.chunks)
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/h5py/_hl/group.py", line 136, in create_dataset
    dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/h5py/_hl/dataset.py", line 137, in make_new_dset
    dcpl = filters.fill_dcpl(
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/h5py/_hl/filters.py", line 132, in fill_dcpl
    rq_tuple(chunks, 'chunks')
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/h5py/_hl/filters.py", line 130, in rq_tuple
    raise ValueError('"%s" must have same rank as dataset shape' % name)
ValueError: "chunks" must have same rank as dataset shape

working Zenith, model node, DEM example

Scenario:

Example: Running ERA5 at weather model nodes at DEM height over BBoX

Code used:

raiderDelay.py --date 20190415 --time 1200  --model ERA5 --modelBBOX 36 -121 34 -119 

Errors to be fixed:

  1. Re-run with hgt versus DEM enabled
    Error: it is not very clear to the user what the issue is, and it is caused by a user driver error:
    Solution: removing the "geom" folder resolved the issue.
Weather model already exists, please remove it ("weather_files/ERA5_2019-04-15 12:00:00.h5") if you want to create a new one.
/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py:267: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  lengths = np.linalg.norm(f['LOS'].value, axis=-1)
Traceback (most recent call last):
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/bin/raiderDelay.py", line 13, in <module>
    parseCMD()
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/runProgram.py", line 146, in parseCMD
    (_,_) = tropo_delay(los, lats, lons, heights, flag, weather_model, wmLoc, zref,
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delay.py", line 186, in tropo_delay
    computeDelay(weather_model_file, pnts_file, zref, out,
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delay.py", line 111, in computeDelay
    wet, hydro = interpolateDelay(weather_model_file_name, pnts_file_name, zref = zref,
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delay.py", line 69, in interpolateDelay
    RAiDER.delayFcns.calculate_rays(pnts_file_name, stepSize, verbose = verbose)
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py", line 288, in calculate_rays
    getUnitLVs(pnts_file) 
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py", line 250, in getUnitLVs
    get_lengths(pnts_file)
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py", line 273, in get_lengths
    f['Rays_len'][:] = lengths
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/h5py/_hl/group.py", line 264, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'Rays_len' doesn't exist)"
  2. Undefined getIntFcn2 function
    Error: name 'getIntFcn2' is not defined
Weather model already exists, please remove it ("weather_files/ERA5_2019-04-15 12:00:00.h5") if you want to create a new one.
Getting the DEM
Beginning DEM download and warping
DEM download finished
Beginning interpolation
Interpolation finished
Saving DEM to disk
/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py:267: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  lengths = np.linalg.norm(f['LOS'].value, axis=-1)
/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py:252: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  f['Rays_SLV'][:,:] = f['LOS'].value / f['Rays_len'].value[:,np.newaxis]
/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py:243: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  f['Rays_SP'][:,:] = np.array(t.transform(f['lon'].value, f['lat'].value, f['hgt'].value)).T
/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py:201: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  wm_proj = CRS.from_json(f['Projection'].value)
/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py:78: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  xs_wm = f['x'].value.copy()
/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py:79: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  ys_wm = f['y'].value.copy()
/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py:80: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  zs_wm = f['z'].value.copy()
/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py:81: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  wet=f['wet'].value.copy()
/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py:82: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  hydro=f['hydro'].value.copy()
Traceback (most recent call last):
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/bin/raiderDelay.py", line 13, in <module>
    parseCMD()
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/runProgram.py", line 146, in parseCMD
    (_,_) = tropo_delay(los, lats, lons, heights, flag, weather_model, wmLoc, zref,
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delay.py", line 186, in tropo_delay
    computeDelay(weather_model_file, pnts_file, zref, out,
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delay.py", line 111, in computeDelay
    wet, hydro = interpolateDelay(weather_model_file_name, pnts_file_name, zref = zref,
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delay.py", line 70, in interpolateDelay
    delays = RAiDER.delayFcns.get_delays(stepSize, pnts_file_name, 
  File "/Users/dbekaert/Software/python/miniconda37/envs/RAiDER/lib/python3.8/site-packages/RAiDER/delayFcns.py", line 84, in get_delays
    ifWet = getIntFcn2(xs_wm, ys_wm, zs_wm, wet)
NameError: name 'getIntFcn2' is not defined

Replace cython with pybind11

One of the things that came up in the meeting with Michael Aivazis was the idea of writing python bindings using pybind11 instead of cython. I think this has some merit for a few reasons.

  1. This project does a lot of tasks which are highly parallelizable over multiple threads. Implementing true multithreading is probably easier in pure C++ (although cython does have functionality for it).
  2. We are barely using cython in the first place and it would require minimal effort to switch over right now.
  3. The Geo2rdr code is already written in a "binding style". By that I mean, it is implemented in pure C++ and only uses cython to generate the python bindings. This is the problem which pybind11 is designed to fix.
  4. If we do eventually write some GPU code, really our only option is to write that in CUDA, and then wrap it with some python bindings. Again, while this may be possible with cython, pybind11 is designed for exactly this use case.

Cons:

  1. Setup might be a little harder (see the sketch after this list). There is an example here: https://github.com/pybind/python_example
  2. Calling back to python code from the C++ code is more difficult
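For reference, the Python-side setup for the first con is relatively contained; a sketch using pybind11's setuptools helpers (a recent pybind11 is assumed, and the module/source names are hypothetical):

from setuptools import setup
from pybind11.setup_helpers import Pybind11Extension, build_ext

# hypothetical extension wrapping the existing C++ geometry code
ext_modules = [
    Pybind11Extension('raider_geometry', ['src/geometry_bindings.cpp']),
]

setup(name='raider_geometry', ext_modules=ext_modules, cmdclass={'build_ext': build_ext})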

Change main delay calculation from simple for-loop to parallel for-loop

The code is in tools/RAiDER/delayFcns.py:

with h5py.File(pnts_file, 'r') as f:

The code is chunked but it processes each chunk serially. A potentially major speed-up would be to add a parallel implementation here. Only reads are required; no data is written, but I'm not sure what happens if you try to open an HDF5 file in read-only mode from multiple processes (see perhaps here for a start). My hope is that we can just pass the file name to the different processes and, if the file is chunked properly so that we never try to read the same chunk, we can just use multiprocessing and do it that way, i.e. something like:

zipped_args = zip([filename]*len(other_args), other_arg_1, other_arg_2, ...)
for arg_set in zipped_args:
    # open the file in read-only mode and read the appropriate chunk
    # do the delay calculation

But I don't know if HDF5 will actually allow that to work.
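A minimal sketch of that idea, assuming each worker re-opens the points file read-only and handles one slice; the dataset name and the per-chunk delay routine below are placeholders, not the actual RAiDER API:

import numpy as np
import h5py
from multiprocessing import Pool

def process_chunk(args):
    pnts_file, start, stop = args
    # each worker opens the file independently in read-only mode
    with h5py.File(pnts_file, 'r') as f:
        rays = f['Rays_SP'][start:stop]        # placeholder dataset name
    return compute_delay_for_rays(rays)        # placeholder per-chunk delay routine

def parallel_delays(pnts_file, n_points, chunk=10000, nproc=4):
    bounds = [(pnts_file, s, min(s + chunk, n_points)) for s in range(0, n_points, chunk)]
    with Pool(nproc) as pool:
        results = pool.map(process_chunk, bounds)
    return np.concatenate(results)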

DEM download takes a long time

The DEM download can be the longest-running part of the whole code.

Starting line in RAiDER/tools/RAiDER/demdownload.py:

memRaster = '/vsimem/warpedDEM'

Description

The DEM download takes a long time. I am not sure why this is happening, but it could be that passing boundaries as a warp option is not as fast as cropping a VRT (see the sketch after the code below).

Relevant code

_world_dem = ('https://cloud.sdsc.edu/v1/AUTH_opentopography/Raster/'
              'SRTM_GL1_Ellip/SRTM_GL1_Ellip_srtm.vrt')
inRaster = '/vsicurl/{}'.format(_world_dem)
memRaster = '/vsimem/warpedDEM'
wrpOpt = gdal.WarpOptions(outputBounds=(minlon, minlat, maxlon, maxlat))
gdal.Warp(memRaster, inRaster, options=wrpOpt)
out = RAiDER.utilFcns.gdal_open(memRaster)
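As a possible comparison point, the remote VRT could instead be cropped with gdal.Translate (projWin takes ulx, uly, lrx, lry); this is a sketch using the same bounds variables, not a tested fix:

# Sketch only: crop the /vsicurl VRT instead of warping it.
transOpt = gdal.TranslateOptions(projWin=[minlon, maxlat, maxlon, minlat])
gdal.Translate('/vsimem/croppedDEM', inRaster, options=transOpt)
out = RAiDER.utilFcns.gdal_open('/vsimem/croppedDEM')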

[BUG] STATS: Shapely required for raiderStats but not in environment.yml

Describe the bug
shapely needs to be added to conda environment file requirements

To Reproduce
Steps to reproduce the behavior:

  1. create a clean conda environment using the environment.yml file
  2. raiderStats.py -h
  3. Observe the traceback:

Traceback (most recent call last):
  File "/Users/jlmd9g/software/miniconda3/envs/RAiDER/bin/raiderStats.py", line 10, in <module>
    from RAiDER.statsPlot import parseCMD
  File "/Users/jlmd9g/software/miniconda3/envs/RAiDER/lib/python3.8/site-packages/RAiDER/statsPlot.py", line 10, in <module>
    from shapely.geometry import Polygon, Point
ModuleNotFoundError: No module named 'shapely'

Expected behavior
Print out the argparse help message

Desktop (please complete the following information):

  • RAiDER git tag: 559f80a
  • Jupyter Notebook on Mac

Geo2rdr returns nan line-of-sight when provided a state vector

Example RAiDER command:
raiderDelay.py --date 20200103 --time 23:00:00 -s state.txt -modelbb 40 -79 36 75 --model ERA5 --zref 15000 -v

where state.txt is a text file containing a single line:

20200103 311674.487622 2469192.551542 -6630254.862875 2388.91490 -6765.188049 -2408.216845

(i.e., date x y z vx vy vz)

The code of interest is in losreader.py, in the function state_to_los. Note: #15 only ensures that the inputs to state_to_los are correct, but it does not fix this issue.

The line "los_x, los_y, los_z = geo2rdr_obj.get_los()" is where the NaNs first appear.

def state_to_los(t, x, y, z, vx, vy, vz, lats, lons, heights):
    from RAiDER import Geo2rdr

    real_shape = lats.shape
    lats = lats.flatten()
    lons = lons.flatten()
    heights = heights.flatten()

    geo2rdr_obj = Geo2rdr.PyGeo2rdr()
    geo2rdr_obj.set_orbit(t, x, y, z, vx, vy, vz)

    loss = np.zeros((3, len(lats)))
    slant_ranges = np.zeros_like(lats)

    for i, (lat, lon, height) in enumerate(zip(lats, lons, heights)):
        height_array = np.array(((height,),))

        # Geo2rdr is picky about the type of height
        height_array = height_array.astype(np.double)

        import pdb; pdb.set_trace()
        geo2rdr_obj.set_geo_coordinate(np.radians(360 - lon), np.radians(lat), 1, 1, height_array)

        # compute the radar coordinate for each geo coordinate
        geo2rdr_obj.geo2rdr()

        # get back the line of sight unit vector
        los_x, los_y, los_z = geo2rdr_obj.get_los()
        loss[:, i] = los_x, los_y, los_z

        # get back the slant ranges
        slant_range = geo2rdr_obj.get_slant_range()
        slant_ranges[i] = slant_range

    los = loss * slant_ranges

    # Have to think about traversal order here. It's easy, though, since
    # in both orders xs come first, followed by all ys, followed by all
    # zs.
    return los.reshape(real_shape + (3,))

STATS: Simplify code structure and optimize as standalone function [enhancement]

Purpose

Have STATS as a standalone, independent suite of code that can be called separately from the main RAiDER code.

Proposed plan of action

Will consider ways to segregate certain functions of the code into separate scripts to simplify structure and avoid having everything in a single script. Open to suggestions on how to segregate functionality (e.g. plotting in one script, variogram generation in another, preprocessor/setup in main).

raiderStats help menu needs to be cleaned up

The help message is currently a bit cluttered and should be grouped into "plot types," "input options," etc.

Current functionality:

usage: raiderStats.py [-h] -f FNAME [-c COL_NAME] [-fmt PLOT_FMT]
                      [-cb CBOUNDS CBOUNDS] [-w WORKDIR] [-b BBOX] [-sp SPACING]
                      [-dt DENSITYTHRESHOLD] [-sg] [-dg]
                      [-cp COLORPERCENTILE COLORPERCENTILE] [-ti TIMEINTERVAL]
                      [-si SEASONALINTERVAL] [-station_distribution]
                      [-station_delay_mean] [-station_delay_stdev]
                      [-grid_heatmap] [-grid_delay_mean] [-grid_delay_stdev]
                      [-variogramplot] [-binnedvariogram] [-plotall] [-verbose]

Function to generate various quality control and baseline figures of the spatial-
temporal network of products.

optional arguments:
  -h, --help            show this help message and exit
  -f FNAME, --file FNAME
                        csv file
  -c COL_NAME, --column_name COL_NAME
                        Name of the input column to plot. Input assumed to be in
                        units of meters
  -fmt PLOT_FMT, --plot_format PLOT_FMT
                        Plot format to use for saving figures
  -cb CBOUNDS CBOUNDS, --color_bounds CBOUNDS CBOUNDS
                        List of two floats to use as color axis bounds
  -w WORKDIR, --workdir WORKDIR
                        Specify directory to deposit all outputs. Default is
                        local directory where script is launched.
  -b BBOX, --bbox BBOX  Provide either valid shapefile or Lat/Lon Bounding SNWE.
                        -- Example : '19 20 -99.5 -98.5'
  -sp SPACING, --spacing SPACING
                        Specify spacing of grid-cells for statistical analyses.
                        By default 1 deg.
  -dt DENSITYTHRESHOLD, --densitythreshold DENSITYTHRESHOLD
                        A given grid-cell is only valid if it contains this
                        specified threshold of stations. By default 10 stations.
  -sg, --stationsongrids
                        In gridded plots, superimpose your gridded array with a
                        scatterplot of station locations.
  -dg, --drawgridlines  Draw gridlines on gridded plots.
  -cp COLORPERCENTILE COLORPERCENTILE, --colorpercentile COLORPERCENTILE COLORPERCENTILE
                        Set low and upper percentile for plot colorbars. By
                        default 25% and 95%, respectively.
  -ti TIMEINTERVAL, --timeinterval TIMEINTERVAL
                        Subset in time by specifying earliest YYYY-MM-DD date
                        followed by latest date YYYY-MM-DD. -- Example :
                        '2016-01-01 2019-01-01'.
  -si SEASONALINTERVAL, --seasonalinterval SEASONALINTERVAL
                        Subset in by an specific interval for each year by
                        specifying earliest MM-DD time followed by latest MM-DD
                        time. -- Example : '03-21 06-21'.
  -station_distribution, --station_distribution
                        Plot station distribution.
  -station_delay_mean, --station_delay_mean
                        Plot station mean delay.
  -station_delay_stdev, --station_delay_stdev
                        Plot station delay stdev.
  -grid_heatmap, --grid_heatmap
                        Plot gridded station heatmap.
  -grid_delay_mean, --grid_delay_mean
                        Plot gridded station mean delay.
  -grid_delay_stdev, --grid_delay_stdev
                        Plot gridded station delay stdev.
  -variogramplot, --variogramplot
                        Plot gridded station variogram.
  -binnedvariogram, --binnedvariogram
                        Apply experimental variogram fit to total binned
                        empirical variograms for each time slice. Default is to
                        total unbinned empiricial variogram.
  -plotall, --plotall   Generate all above plots.
  -verbose, --verbose   Toggle verbose mode on. Must be specified to generate
                        variogram plots per gridded station AND time-slice.

Wanted functionality:

Adopt a similar grouped approach to the one used for raiderDelay.py (a sketch follows the excerpt below):

Height data. Default is ground surface for specified lat/lons, height levels otherwise:
  --dem DEM, -d DEM     Specify a DEM to use with lat/lon inputs.
  --heightlvs HEIGHTLVS [HEIGHTLVS ...]
                        A space-deliminited list of heights

Weather model. See documentation for details:
  --model MODEL         Weather model option to use: ERA5/HRRR/MERRA2/NARR/WRF/HDF5.
  --files FILES [FILES ...]
                        OUT/PLEV or HDF5 file(s)
  --weatherModelFileLocation WMLOC, -w WMLOC
                        Directory location of/to write weather model files
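A minimal sketch of how this grouping could be done with argparse argument groups (the group titles and the subset of options shown are illustrative, not the final layout):

import argparse

parser = argparse.ArgumentParser(
    description='Generate quality control and baseline figures of the station network.')

inputs = parser.add_argument_group('Input options')
inputs.add_argument('-f', '--file', dest='fname', required=True, help='csv file')
inputs.add_argument('-b', '--bbox', help="Lat/Lon Bounding SNWE, e.g. '19 20 -99.5 -98.5'")

plots = parser.add_argument_group('Plot types')
plots.add_argument('-station_distribution', action='store_true', help='Plot station distribution.')
plots.add_argument('-grid_heatmap', action='store_true', help='Plot gridded station heatmap.')

args = parser.parse_args()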

Refactor _uniform_z in the weather model base class for other new weather model readers

This ticket is to refactor the _uniform_z function in the weather model base class WeatherModel, upon which other user-defined new weather models are built.

Looking at the newly-added GMAO model, it also has a few lines that perform interpolation along the vertical axis to convert the 3-D data cubes from an irregular vertical grid to a regular vertical grid.

So our goal is to refactor the _uniform_z function and use it in other weather model readers wherever this kind of interpolation is necessary.

However, the current _uniform_z function has some specific uses in it, which are not suitable for straightforward refactoring. These are:

  1. self._e is used, which is not usually (almost never) defined in the weather model readers. Those readers define self._p (pressure), self._q (humidity), and self._t (temperature).

  2. Some outputs of the _uniform_z function, i.e. self._xs, self._ys, and self._zs, are 1-D vectors. That is fine by itself, but if we refactor the function and use it in the weather model readers, the raiderDelay program will fail, because it currently assumes all weather model readers return 3-D cubes rather than 1-D vectors for these variables; it is inside the weather model base class that the 3-D cubes get converted to 1-D vectors. So I think there is an input/output mismatch (incompatibility) between each weather model reader and the weather model base class with regard to this refactoring.
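A rough sketch of what a standalone helper could look like, assuming it only touches cubes that every reader defines (e.g. self._p, self._q, self._t) and interpolates them from irregular heights to a shared uniform vertical grid; all names here are illustrative, not the existing base-class API:

import numpy as np
from scipy.interpolate import interp1d

def interpolate_to_uniform_z(zs, cubes, z_uniform):
    """Interpolate each 3-D cube from irregular heights zs (same shape as the cubes) to z_uniform."""
    out = []
    for cube in cubes:
        n0, n1, _ = cube.shape
        new_cube = np.full((n0, n1, len(z_uniform)), np.nan)
        for i in range(n0):
            for j in range(n1):
                f = interp1d(zs[i, j, :], cube[i, j, :], bounds_error=False, fill_value=np.nan)
                new_cube[i, j, :] = f(z_uniform)
        out.append(new_cube)
    return out

A reader could then call this on its pressure, humidity, and temperature cubes without touching self._e, and return the cubes in whichever shape the base class ends up expecting.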

Add ERA5T model to RAiDER

Adding this here to track it as part of the milestone.
It has been added already, but should be verified.

Fix for LOS file and state vector delay calculations

Fix a bug in raiderDelay when using the LOS option, for both:

  • state vector
  • LOS file
Traceback (most recent call last):
  File "tools/bin/raiderDelay.py", line 20, in <module>
    parseCMD()
  File "/lib/python3.7/site-packages/RAiDER/runProgram.py", line 151, in parseCMD
    outformat, t, out, download_only, verbose, wfn, hfn)
  File "/lib/python3.7/site-packages/RAiDER/delay.py", line 164, in tropo_delay
    lons=lons, los=los, zref=zref, time=time, verbose=verbose, download_only=download_only)
  File "/lib/python3.7/site-packages/RAiDER/processWM.py", line 85, in prepareWeatherModel
    weather_model.load(f, outLats=lats, outLons=lons, los=los, zref=zref)
  File "/lib/python3.7/site-packages/RAiDER/models/weatherModel.py", line 148, in load
    self._runLOS(los, zref, los_flag)
  File "/lib/python3.7/site-packages/RAiDER/models/weatherModel.py", line 200, in _runLOS
    ray_x, ray_y, ray_z = t.transform(ray[..., 0], ray[..., 1], ray[..., 2], always_xy=True)
IndexError: index 1 is out of bounds for axis 4 with size 1

This would need to be fixed in order for @Askaholic to continue with refactoring.

Updates to raiderDelay.py argparse

Updates for user-friendliness needed in argparse:

  1. Provide an example for each input. E.g. the ISO 8601 format is not clear to all users (multiple variants for parsing day and time exist online; clarify with e.g. YYYYMMDD).
  2. Specify which reference the DEM and the height levels are relative to, i.e. ellipsoid, geoid, or geoid corrected to the ellipsoid, etc.
  3. The options used to be a bit more nested, i.e. LOS or state vector or zenith; there is no mention of the zenith option anymore. Correct this in the argparse visualization.
  4. The options used to be a bit more nested, i.e. DEM or levels. Correct this in the argparse visualization.
  5. It is unclear what the pickle file option is; it lacks a description and might only be relevant for testing.
  6. If the parallel option does not work, comment it out.
  7. Also add which models are supported and how to specify them, e.g. the name is case sensitive.
    For now, only list the ones that actually work.
  8. I'm not sure --modelBBOX is the right name. I would rather see it as a region bbox, where internally the model bbox is extended such that the interpolation works.
  9. Add multiple run examples:
  • Zenith for ERA5 at model grid nodes at topo
  • Zenith for ERA5 at model grid nodes at a fixed height level
  • LOS for ERA5 using a Sentinel-1 orbit at model grid nodes at topo
  • LOS for ERA5 using an ISCE LOS file at model grid nodes at topo

Save scenario output in a pre-defined folder and exclude folder in gitignore

The raider scripts create quite a number of files in various directories which clutter up the git status output and make it hard to find the files that you've edited. This is even true for some of the unit tests, and the test commands in the CI config. It would be a lot better if any commands that were checked into the repo did not create any new untracked files, so that those commands can be used for testing/development without cluttering up the git workspace.

I would suggest adding some sort of generic output directory which the checked in commands write to so we can cover all generated files with a single line in the gitignore.

STATS: Jupyter examples

Looking for the following notebooks (under RAiDER-docs github repo):

Example 1 (Priority)

Get GPS information and visualize the station distribution for a fixed period of time.
This is in line with #11.

Example 2

  • Get GPS information and visualize delays for a fixed period in time, and calculate basic statistics
  • Use the GPS station information to trigger RAiDER and compute delays
  • Use the RAiDER and GPS delays for basic statistical analysis

Speed-up of downloadgnssdelays.py and statsPlot.py

Proposed Enhancement
Accessing delays from UNR and storing them in a CSV file through downloadgnssdelays.py, and thereafter loading this file and prepping it for plotting through statsPlot.py, is still rather slow and inefficient, especially at a global scale. The latter in my experience can take over a day, while the former can take ~20 mins.

Specifically, I wish to explore ways to parallelize certain functions where appropriate (e.g. appending delays to a CSV file and splitting data into grid-cells of a given spacing) or perhaps developing more efficient alternatives to existing routines (e.g. plotting maps of delays directly from a pandas dataframe instead of passing a gridded array meant to serve as a pseudo-raster).
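One possible direction for the parallelization, sketched with concurrent.futures; fetch_station_delays here is a placeholder for whatever per-station download/parse routine downloadgnssdelays.py uses, not an existing function:

import pandas as pd
from concurrent.futures import ProcessPoolExecutor

def fetch_station_delays(station_id):
    # placeholder: download and parse the UNR delay file for one station,
    # returning a DataFrame of (Date, ID, ZTD, ...) rows
    return pd.DataFrame()

def build_combined_csv(station_ids, out_csv, nproc=8):
    # fetch stations in parallel, then write a single combined CSV
    with ProcessPoolExecutor(max_workers=nproc) as pool:
        frames = list(pool.map(fetch_station_delays, station_ids))
    pd.concat(frames, ignore_index=True).to_csv(out_csv, index=False)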

Suggested baseline tests
Instead of running global/decadal tests to start out, I suggest running jobs over a more moderate scale spanning California and 2 years.
Specifically:
Generate a CSV of delays across all stations spanning California between 2016 and 2018 at 06:00 UTC:
raiderDownloadGNSS.py --out CA_test -y '2016,2018' --returntime '00:06:00' -b '31 42 -125 -113'

Generate figures of the station distribution and delay heatmaps:
raiderStats.py -f CA_test/CombinedGPS_ztd.csv -w CA_test/maps -ti '2016-01-01 2018-12-31' -grid_heatmap -station_distribution -grid_delay_mean -grid_delay_stdev -b '31 42 -125 -113'

PS, sorry for posting a blank ticket earlier. My network went down for a while and it was somehow submitted prematurely.

Modify pyproj calls for consistency

pyproj has started respecting the order of axes for different CRS. EPSG:4326 is lat/lon now and not lon/lat like it was before.

To keep traditional behavior, add

always_xy=True

to all calls of pyproj.transform. This might also be causing issues that you recently encountered.
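For example, with the Transformer interface the flag is passed when the transformer is built (the CRS pair below is only for illustration):

from pyproj import Transformer

# always_xy=True keeps the traditional x/y (lon/lat) argument order regardless of the CRS axis definition
t = Transformer.from_crs('EPSG:4326', 'EPSG:4978', always_xy=True)
lons, lats, heights = [-119.0], [35.0], [100.0]
x, y, z = t.transform(lons, lats, heights)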

Circle-CI set-up

  1. Getting started: https://circleci.com/docs/2.0/getting-started/
  2. Starting from basics:
    a. https://circleci.com/docs/2.0/first-steps/
    b. https://circleci.com/docs/2.0/hello-world/
  3. Getting started with real projects
    a. Pick a good docker image – ubuntu:bionic (18.04)
    b. Bake anaconda into it and install the requirements (https://stackoverflow.com/questions/58243255/circleci-testing-with-specific-miniconda-python-and-numpy-versions)
    c. Add steps to then build your software and run tests
    d. For reference, see isce config.yml - https://github.com/isce-framework/isce2/blob/master/.circleci/config.yml
    i. This uses an ARIA base image so ignore that and use ubuntu
    ii. The job named "test" is the only one of interest. The other ones are for automatically building docker images for ARIA on merge of PRs and on new releases – not needed yet
    iii. The isce2 config.yml should hopefully be self-explanatory

Logging

Add logging support to prepare the code for production-like environments.
We could adopt a similar strategy to the one used in ISCE2 or ISCE3, the latter being the production code for NISAR.

Internal Python logging (ISCE2) option

ISCE2 uses python’s built-in logging module, set up using the ISCE2-specific configuration file at defaults/logging/logging.conf. https://github.com/isce-framework/isce2
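A minimal sketch of what this option could look like in RAiDER, assuming one module-level logger per submodule (the configuration call and logger name below are illustrative):

import logging

# configure once at program start (e.g. in raiderDelay.py)
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s %(name)s %(levelname)s: %(message)s')

# then, in each module:
log = logging.getLogger('RAiDER.delay')
log.info('Computing zenith delays for %d points', 1000)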

Pyre Journal logging (ISCE3) option - Preferred

This would leverage the journal module of Pyre. Notes by @aivazis: see https://github.com/pyre/pyre under tests/journal.lib/_example.cc for examples of how to use the journal in C++, and tests/journal.api/_example.py for the corresponding Python.

In general, messages are logged into channels. Each channel has a severity and a name. The severity is a compile time attribute, determined by the constructor you invoke. The name is a runtime choice.

For example:
In C++:

    #include <pyre/journal.h>
    ...
    void foo() {
        ...
        // make a channel
        pyre::journal::info_t channel("isce3.geocode");
        // say something
        channel
            << "Hello world!"
            << pyre::journal::endl(__HERE__); // the macro is provided by journal and logs source file, line, function name
    }

or for Python:

    import journal
    ...
    def foo():
        ...
        # make a channel
        channel = journal.info(name="isce3.geocode")
        # say something
        channel.log("Hello world!")  # source, line, function name are automatically extracted from the python stack trace

There are five severities. Three (error, warning, info) are end-user facing, the other two (firewall, debug) are meant for developers and are optimized away for release build. Firewall and error are fatal: they raise exceptions when anything is written to them. You can make them non-fatal on a per channel basis or in bulk, or catch the exception and do something with it. Debug is the developer version of info, firewall is reserved for reporting serious constraint violations, so that a firewall that fires is an indication that the code has self-caught a bug.

Channels of the same severity+name all share a common state, across both the python and c++ codebase. This means that you can instantiate a channel in your main python script, configure it to send output to a file, and become fatal, and then access a channel by the same severity+name far away in the C++ code and have the output go to the same file without having to pass the channel around as application context. The names work best when you do a bit of namespace design for them ahead of time. They are hierarchical, with scopes separated by dots. Something like “isce3.geocode.projection” would be a good name for a channel.

Refactor input parsing to be modular and assert correct inputs

@Askaholic proposed in other issue tickets to remove the duplication of shared input arguments and their syntax between the different programs.

In addition, we discussed capturing driver errors early on. So, should we also implement a verification that input arguments are supplied correctly, directly after parsing?

  • e.g. assumption on date order for seasonality?
  • e.g. verifying dates and time are in correct and recognizable format

Let's discuss below. As a starting point, input validation could be attached directly to the argument definitions, as in the sketch below.
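A minimal sketch of such a check using an argparse type function (this is an illustration, not the existing RAiDER parser):

import argparse
from datetime import datetime

def valid_date(s):
    """Reject malformed dates at parse time with a clear message."""
    try:
        return datetime.strptime(s, '%Y%m%d').date()
    except ValueError:
        raise argparse.ArgumentTypeError("not a valid YYYYMMDD date: '{}'".format(s))

parser = argparse.ArgumentParser()
parser.add_argument('--date', type=valid_date, required=True)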

Document Adding New Weather Model

We need to create documentation for how to add new weather models to RAiDER. The following is a start that should be included into a notebook and/or a template file.

Data model

  • Data are encapsulated as a WeatherModel object with a uniform x/y/z grid with lat, lon, temperature, pressure, and humidity specified as 3D cubes at the weather model nodes
  • The data cubes are structured such that x is the coordinate for axis 0, y for axis 1, etc. and z increases with index

Requirements for adding a new module

  • The only job of the user-defined module is to get the required data variables from the API and into the correct format (3D cubes with uniform x/y coordinates at the specified levels), everything else is handled within the base class
  • Data are expected to be a fixed, uniform grid in x/y with grid nodes specified in the native weather model projection
  • Either pressure (fixed pressure) or model (fixed height) levels may be specified in the z-direction.
  • Required data variables for the WeatherModel base class are temperature, relative or specific humidity, and pressure associated with each point
  • Pressure comes directly from the pressure levels if using those or, if using model levels, must be calculated using surface pressure + geopotential heights

Model specification

  • The base WeatherModel class names all required parameters that can be specified by individual models. In addition, certain base class methods (e.g., "load_weather", "_fetch") are overloaded by each model type.
  • Adding a new model type requires creating a new model class definition with the appropriate parameter values for the calculations, method definitions for accessing any requisite API or file type, and data pre-processing to match the required data model given above
  • Documentation is required to define which parameters are required/optional, which have default or binary values, and describing the data model. Ideally a template model should exist that could be filled in. Information on the projections and transformations used would also be helpful.
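A skeletal illustration of what such a template could look like; the class layout, import path, and attribute names below are assumptions based on the description above, not the actual base-class API:

from RAiDER.models.weatherModel import WeatherModel  # assumed import path

class MyNewModel(WeatherModel):
    """Hypothetical reader for a new weather model."""

    def __init__(self):
        super().__init__()
        self._Name = 'MYMODEL'   # identifier used on the command line (assumed attribute)
        # projection, valid bounds, level type (pressure vs. model levels), etc. would be set here

    def _fetch(self, lats, lons, time, out):
        # query the model's API for the bounding box and time, and write the raw file to 'out'
        raise NotImplementedError

    def load_weather(self, filename):
        # read the raw file and populate the required 3-D cubes on a uniform x/y grid
        # in the model's native projection: pressure, temperature, and humidity
        raise NotImplementedError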

Plotting weather model debug plots

Add a capability to plot the weather model at low and higher elevations. This allows us to verify P, T, and e to see if they look as expected.
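A minimal sketch of such a debug plot, assuming the weather model object exposes pressure, temperature, and partial-pressure cubes with height as the last axis (the attribute names are illustrative):

import matplotlib.pyplot as plt

def plot_wm_debug(wm, levels=(0, -1), out='wm_debug.png'):
    """Plot P, T, e at a low and a high level of the weather model cubes."""
    fig, axes = plt.subplots(len(levels), 3, figsize=(12, 6))
    for row, lvl in enumerate(levels):
        for col, (cube, label) in enumerate([(wm._p, 'P [Pa]'), (wm._t, 'T [K]'), (wm._e, 'e [Pa]')]):
            im = axes[row, col].imshow(cube[..., lvl], origin='lower')
            axes[row, col].set_title('{}, level {}'.format(label, lvl))
            fig.colorbar(im, ax=axes[row, col])
    fig.savefig(out, dpi=200)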

Refactoring notes for cleaner code

Some simple things which I'm noticing while reading the code. I think these things will go a long way in cleaning up the code.

  1. Place "high level" functions at the TOP of the file and "low level" functions towards the bottom. This should generally coincide with the order in which the functions are called, so for example:
def some_high_level_function():
    """Reads stuff from files and calls other functions"""

    data = read_from_files()
    some_process(data)


def read_from_files():
    ...


def some_process(data):
    ...

This allows someone who is reading the code to follow the flow a lot more clearly. Right now it's a bit of a "find in project" situation for every function.

  2. Never commit commented-out code. Ever. Just delete it; it is saved in the git history if we ever need to go back to it (chances are we won't). For experimental features, use a new branch and merge changes back to dev once they have solidified.

  3. Keep "read from file" operations separate from data processing. This will also help with the "production readiness" of the project. As I illustrated in point 1, if some third-party user had data stored in a weird format, they could write their own read_from_files function which loads the data into the correct data structure, and call some_process with that correctly formatted data. This will also help us write code to support multiple data formats out-of-the-box more easily.

STATS: GNSS archive support

The stats class can be seen as a pre- and post-processor for RAiDER.
Below are a few items that we plan to incorporate in the RAiDER stats class (not within the RAiDER code-base itself) to interact with GNSS archives.

Archives:

  1. UNR Geodetic Laboratory

  2. JPL Measures

Functionality:

Ability to talk to the above GNSS providers to get station information, for stations that have zenith delays, over a given period in time.

  • Option 1) retrieve only the lon/lat/hgt information over the specified period in time
  • Option 2) retrieve, in addition, the zenith delay information over the specified period in time and at the identified time sampling

Formatting:

We should think through what the format for this would be. Both options should be compatible with RAiDER as a station list. We should probably also think about the format of what RAiDER will output so that the stats class can ingest it. This needs alignment with @leiyangleon (see ticket #48).

When calling geo2rdr, line-of-sight vectors returned have larger y-values than z-values, possible reference frame issue?

See

los_x, los_y, los_z = geo2rdr_obj.get_los()

--> example lat/lon/z: 38.6/-79.4/-100 on UTC20200103T23:00:00

results in: (los_x, los_y, los_z):
(array([[0.24365994]]), array([[0.96936308]]), array([[0.03106531]]))

@piyushrpt, could this be an issue with the reference frame? The orbits are Earth-fixed, but the line-of-sight reference frame is not specified anywhere that I could find.
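One way to sanity-check the frame, assuming get_los returns an ECEF unit vector: rotate it into the local East/North/Up frame at the query point and inspect the components there (a sketch, using the numbers from the example above):

import numpy as np

def ecef_to_enu(vec, lat_deg, lon_deg):
    """Rotate an ECEF vector into the local East/North/Up frame at (lat, lon)."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    R = np.array([
        [-np.sin(lon),                np.cos(lon),               0.0],
        [-np.sin(lat) * np.cos(lon), -np.sin(lat) * np.sin(lon), np.cos(lat)],
        [ np.cos(lat) * np.cos(lon),  np.cos(lat) * np.sin(lon), np.sin(lat)],
    ])
    return R @ np.asarray(vec)

enu = ecef_to_enu([0.24365994, 0.96936308, 0.03106531], 38.6, -79.4)
print(enu)  # east, north, up components of the reported LOS vector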
