glaciohack / geoutils

Analysis of georeferenced rasters and vectors

Home Page: https://geoutils.readthedocs.io/

License: BSD 3-Clause "New" or "Revised" License

Python 99.94% Shell 0.06%
geopandas geospatial-processing gis python raster-data rasterio vector-data masking polygonization rasterization

geoutils's People

Contributors

adehecq, adrienwehrle, atedstone, dependabot[bot], elischwat, erikmannerfelt, friedrichknuth, iamdonovan, jlandmann, rhugonnet

geoutils's Issues

Branch naming and require forks

This is advance warning that once Hack number 2 is over there will be some administrative changes:

  • dev renamed to master
  • main renamed to stable
  • no more pushing to branches of the main repository
  • all commits must be made in a fork of the main repository, and then a PR opened against master.
  • PR merges into master will probably mandate "squash and merge" commits (I may need to discuss further with @fmaussion on this)

What does this mean for you?

  • Delete your local machine copy of geoutils
  • On GitHub, fork the GlacioHack geoutils repo to your personal account
  • git clone your fork
  • always work in feature branches
  • git push back to your own fork
  • then open a PR for the branch on your own fork to be merged into master.

Amongst other things, this will simplify the main repository's history, and bring us more into line with "best practices" adopted elsewhere.

How to expose demo data sets

It would be good to streamline opening (and locating) the demo datasets, e.g. for use in notebooks. These can be the same files as those used in testing, but we shouldn't need to rely on the test suite to tell us where the demo data is.
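
One possible shape for this, sketched under assumptions: a small lookup function maps short dataset names to file paths, so notebooks never have to reference the test suite. The function name get_demo_path, the registry key and the default data_dir are all hypothetical; only the Landsat file name is taken from the test data mentioned in another issue on this page.

```python
from pathlib import Path

# Hypothetical registry: short names -> file names inside the data directory.
# Only the Landsat file name appears in the issues; the key is an assumption.
_DEMO_FILES = {
    "landsat_b4_crop": "LE71400412000304SGS00_B4_crop.TIF",
}


def get_demo_path(name: str, data_dir: Path = Path("tests") / "data") -> Path:
    """Return the path of a named demo dataset, or raise a helpful error."""
    try:
        filename = _DEMO_FILES[name]
    except KeyError:
        available = ", ".join(sorted(_DEMO_FILES))
        raise ValueError(
            f"Unknown demo dataset {name!r}; available: {available}"
        ) from None
    return data_dir / filename


# Usage: the caller only needs the short name, not the directory layout.
demo_path = get_demo_path("landsat_b4_crop")
```

Keeping the registry in one place means the data directory can later move (or be downloaded on demand) without touching notebooks.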

Suggestion: Raster.reproject() should only reproject if needed

If the user wants one DEM to fit another, they would presumably use the .reproject() method directly. If the DEMs already fit, however, the reprojection is still done. This adds unnecessary processing time (or even causes a failure in the case of gigantic DEMs), and could be avoided with a sanity check within the .reproject() method.

To exemplify what I mean:

import time

import geoutils as gu
import rasterio as rio

# Load a KH-9 DEM sent by Amaury which is quite big
dem = gu.georaster.Raster("DZB1208-500032L004006/DZB1208-500032L004006-DEM.tif")

# I reproject the DEM to its own exact bounds using cubic spline
# This is a proxy for if a DEM with the exact same bounds as a reference DEM should (not) be reprojected.
time_before = time.time()
dem2 = dem.reproject(dem, resampling=rio.warp.Resampling.cubic_spline)
time_after = time.time()
print(f"{time_after - time_before:.2f} s")

Which returned:

4.98 s

I would argue that this is a shortcoming or even a bug in either georaster or rasterio. Why would resampling be done if the source and destination bounds, CRS and resolution are the same?

Something along these lines could be added:

if dst_ref.bounds == self.bounds and dst_ref.res == self.res and dst_ref.crs == self.crs and dst_ref.nodata == self.nodata:
    # Nothing to do: return the source raster itself, not dst_ref, whose pixel values may differ
    return self

or the same with shorter lines:

if all([
    dst_ref.bounds == self.bounds,
    dst_ref.res == self.res,
    dst_ref.crs == self.crs,
    dst_ref.nodata == self.nodata,
]):
    # Nothing to do: return the source raster itself, not dst_ref
    return self

This is not a huge issue, but I think the check is a valuable addition!
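
As a self-contained illustration of such a check, here is a minimal sketch that does not depend on rasterio. The Grid class and the same_grid function are hypothetical stand-ins for the georeferencing attributes of a Raster; the exact equality comparisons and the NaN handling for nodata are assumptions, and a real implementation may need a tolerance for floating-point bounds.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Grid:
    """Hypothetical stand-in for the georeferencing attributes of a Raster."""
    bounds: Tuple[float, float, float, float]  # (left, bottom, right, top)
    res: Tuple[float, float]                   # (x resolution, y resolution)
    crs: str                                   # e.g. "EPSG:32633"
    nodata: Optional[float] = None


def _same_nodata(a: Optional[float], b: Optional[float]) -> bool:
    """Compare nodata values, treating two NaNs as equal."""
    if a is None or b is None:
        return a is b
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b


def same_grid(src: Grid, dst: Grid) -> bool:
    """Return True when reprojecting src onto dst would change nothing."""
    return (
        src.bounds == dst.bounds
        and src.res == dst.res
        and src.crs == dst.crs
        and _same_nodata(src.nodata, dst.nodata)
    )


# Usage: identical grids can skip the warp; a different resolution cannot.
reference = Grid((0.0, 0.0, 300.0, 300.0), (30.0, 30.0), "EPSG:32633", float("nan"))
identical = Grid((0.0, 0.0, 300.0, 300.0), (30.0, 30.0), "EPSG:32633", float("nan"))
coarser = Grid((0.0, 0.0, 300.0, 300.0), (60.0, 60.0), "EPSG:32633", float("nan"))
skip_identical = same_grid(identical, reference)
skip_coarser = same_grid(coarser, reference)
```

Treating NaN nodata values as equal matters here because a plain == comparison between two NaNs is always False, which would silently disable the shortcut.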

Deprecate old geoutils organisation (github.com/GeoUtils)

To do:

  • archive geoutils organisation and do necessary cleanup, flag existence of this new project.
  • consider what to do with PyPI and conda versions of old geoutils org stuff
  • shift geoutils.readthedocs.io registration to GlacioHack/GeoUtils
  • GitHub actions for automatic documentation building/hook to readthedocs

Method to close links to underlying dataset reader

By default, there are at least two links to the underlying rasterio reader saved within a Raster object: .ds and .dataset_mask. The .memfile attribute may also count.

There are some edge cases where this is a problem. The one I've identified is that Rasters with these readers still open cannot be pickled when they are being distributed to compute nodes in multiprocessing situations.

Manually calling a method to close these links once the Raster has been created is a possible solution to this problem.
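
The idea can be sketched with the standard library alone. In this minimal example, PicklableRaster is a hypothetical stand-in for Raster, and an ordinary open file handle plays the role of the rasterio reader (both are assumptions for illustration); like the reader, an open file handle cannot be pickled.

```python
import os
import pickle


class PicklableRaster:
    """Hypothetical stand-in for Raster holding an open reader in `ds`."""

    def __init__(self, data):
        self.data = data
        # An open file handle plays the role of the rasterio dataset reader.
        self.ds = open(os.devnull, "rb")

    def close(self):
        """Close and drop the reader so the object becomes picklable."""
        if self.ds is not None:
            self.ds.close()
            self.ds = None


raster = PicklableRaster(data=[[1, 2], [3, 4]])

try:
    pickle.dumps(raster)
    picklable_while_open = True
except TypeError:
    # Open file handles cannot be pickled.
    picklable_while_open = False

# After closing the link, the object round-trips through pickle.
raster.close()
restored = pickle.loads(pickle.dumps(raster))
```

An alternative design would be to drop the handles automatically in __getstate__, at the cost of hiding the fact that the restored object has no reader.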

Return value at given coordinates

A convenience wrapper that returns the raster value at any given coordinates, with an optional CRS argument for coordinates that are not in the Raster's CRS. It would use the existing rasterio functionality for the lookup, like georaster.value_at_coords().
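
To make the lookup concrete, here is a minimal pure-Python sketch for a north-up raster stored as nested lists. The function names rowcol and value_at_coords and their signatures are hypothetical; the sketch skips the CRS transformation step, and in practice the index arithmetic would be delegated to rasterio rather than written by hand.

```python
import math
from typing import Sequence, Tuple


def rowcol(x: float, y: float, xmin: float, ymax: float,
           dx: float, dy: float) -> Tuple[int, int]:
    """Map coordinates to (row, col) for a north-up grid.

    (xmin, ymax) is the outer corner of the upper-left pixel;
    dx and dy are positive pixel sizes.
    """
    col = math.floor((x - xmin) / dx)
    row = math.floor((ymax - y) / dy)
    return row, col


def value_at_coords(band: Sequence[Sequence[float]], x: float, y: float,
                    xmin: float, ymax: float, dx: float, dy: float) -> float:
    """Return the pixel value containing (x, y), or raise if outside."""
    row, col = rowcol(x, y, xmin, ymax, dx, dy)
    if not (0 <= row < len(band) and 0 <= col < len(band[0])):
        raise IndexError(f"Coordinates ({x}, {y}) fall outside the raster")
    return band[row][col]


# Usage: a 2 x 3 band whose upper-left corner is at (100, 220), 10 m pixels.
band = [[1, 2, 3],
        [4, 5, 6]]
centre_value = value_at_coords(band, 115.0, 205.0, 100.0, 220.0, 10.0, 10.0)
corner_value = value_at_coords(band, 100.0, 220.0, 100.0, 220.0, 10.0, 10.0)
```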

Raster class CRS object is invalid for some reason

I have a weird CRS issue with using Vectors and Rasters together:

I am trying to use the Vector exclusion_mask to create a mask for the Raster reference_raster:

mask_array = exclusion_mask.create_mask(reference_raster, crs=reference_raster.crs)

but this raises an error:

Traceback (most recent call last):                                                                                                                         
  File "/home/erik/Projects/ETH/GlacioHack/DemUtils/DemUtils/coreg.py", line 703, in <module>                                                              
    test_icp_coregistration()                                                                                                                              
  File "/home/erik/Projects/ETH/GlacioHack/DemUtils/DemUtils/coreg.py", line 683, in test_icp_coregistration                                               
    aligned_raster, error = coregistration.run(reference_raster, to_be_aligned_raster, glacier_mask)                                                       
  File "/home/erik/Projects/ETH/GlacioHack/DemUtils/DemUtils/coreg.py", line 654, in run                                                                   
    mask_array = exclusion_mask.create_mask(reference_raster, crs=reference_raster.crs)                                                                    
  File "/home/erik/.local/share/conda/demutils/lib/python3.9/site-packages/geoutils/geovector.py", line 151, in create_mask                                
    vect = self.ds.to_crs(crs)                                                                                                                             
  File "/home/erik/.local/share/conda/demutils/lib/python3.9/site-packages/geopandas/geodataframe.py", line 816, in to_crs                                 
    geom = df.geometry.to_crs(crs=crs, epsg=epsg)                                                                                                          
  File "/home/erik/.local/share/conda/demutils/lib/python3.9/site-packages/geopandas/geoseries.py", line 541, in to_crs                                    
    transformer = Transformer.from_crs(self.crs, crs, always_xy=True)                                                                                      
  File "/home/erik/.local/share/conda/demutils/lib/python3.9/site-packages/pyproj/transformer.py", line 368, in from_crs                                   
    _Transformer.from_crs(                                                                                                                                 
  File "pyproj/_transformer.pyx", line 349, in pyproj._transformer._Transformer.from_crs                                                                   
pyproj.exceptions.ProjError: Error creating Transformer from CRS

I have found a workaround which is very ugly but might aid in debugging:

import pyproj

# Extract the string representation of the CRS
crs_string = repr(reference_raster.crs)
# Find the last EPSG entry (assuming there is one)
crs_epsg = crs_string[crs_string.rindex("EPSG") + 7: crs_string.rindex("\"]]")]

# Rebuild the CRS from the EPSG code alone
reference_raster.crs = pyproj.crs.CRS.from_epsg(crs_epsg)

# Now this works
mask_array = exclusion_mask.create_mask(reference_raster)

The Vector CRS is ETRS89 UTM 33N and the Raster CRS is WGS84 UTM 33N
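
The index arithmetic in the workaround can be written a little more robustly with a regular expression. This is only a sketch: the function name last_epsg_code is hypothetical, the WKT string below is abbreviated for illustration, and the approach assumes a WKT1-style string in which the outermost AUTHORITY entry comes last.

```python
import re


def last_epsg_code(wkt: str) -> int:
    """Return the last EPSG authority code found in a WKT1-style string."""
    codes = re.findall(r'AUTHORITY\["EPSG",\s*"(\d+)"\]', wkt)
    if not codes:
        raise ValueError("No EPSG authority code found in the CRS string")
    return int(codes[-1])


# Abbreviated, illustrative WKT; a real string has many more nested entries.
wkt = (
    'PROJCS["WGS 84 / UTM zone 33N",'
    'GEOGCS["WGS 84",AUTHORITY["EPSG","4326"]],'
    'AUTHORITY["EPSG","32633"]]'
)
epsg = last_epsg_code(wkt)
```

The resulting integer could then be passed to pyproj.crs.CRS.from_epsg() as in the workaround, with a clear error when the string carries no EPSG code at all.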

Raster's shift method fails if data is not loaded

For example:

from geoutils import georaster as geor
img = geor.Raster('tests/data/LE71400412000304SGS00_B4_crop.TIF', load_data=False)
img.shift(20,10)

raises an AttributeError:

AttributeError Traceback (most recent call last)
in
----> 1 img.shift(20,10)

~/development/GlacioHack/GeoUtils/geoutils/georaster.py in shift(self, xoff, yoff)
575 meta.update({'transform': rio.transform.Affine(dx, b, xmin + xoff,
576 d, dy, ymax + yoff)})
--> 577 self._update(metadata=meta)
578
579 def set_ndv(self, ndv, update_array=False):

~/development/GlacioHack/GeoUtils/geoutils/georaster.py in _update(self, imgdata, metadata, vrt_to_driver)
251
252 with memfile.open(**metadata) as ds:
--> 253 ds.write(imgdata)
254
255 self.memfile = memfile

rasterio/_io.pyx in rasterio._io.DatasetWriterBase.write()

AttributeError: 'NoneType' object has no attribute 'shape'
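
The traceback suggests that shift() routes through _update(), which tries to write imgdata even though it is None when the data has not been loaded. Since a shift only changes the two offset terms of the affine transform, one option is to update the transform without touching pixel data. The sketch below illustrates that idea with a hypothetical Transform tuple standing in for the six affine coefficients; the names and the example values are assumptions, not the library's API.

```python
from typing import NamedTuple


class Transform(NamedTuple):
    """Hypothetical stand-in for the affine coefficients (a, b, c, d, e, f)."""
    a: float  # pixel width
    b: float  # row rotation
    c: float  # x coordinate of the upper-left corner
    d: float  # column rotation
    e: float  # pixel height (negative for north-up rasters)
    f: float  # y coordinate of the upper-left corner


def shift_transform(t: Transform, xoff: float, yoff: float) -> Transform:
    """Translate the grid origin; no pixel values are needed."""
    return t._replace(c=t.c + xoff, f=t.f + yoff)


# Usage: shift a 30 m north-up grid by 20 m in x and 10 m in y.
original = Transform(30.0, 0.0, 478000.0, 0.0, -30.0, 3108140.0)
shifted = shift_transform(original, 20.0, 10.0)
```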

set and get methods for raster data array

At the moment, the raster data (if loaded) are stored in Raster.data. This is a publicly available np.array.

The problem with this is that data can be arbitrarily overwritten without any sanity check of whether the dimensions match the Raster. It could also cause problems with nodata values.

I propose that the raw np.array is moved to Raster._data. We keep the Raster.data accessor, but overload it with set/get methods.
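
A minimal sketch of what this could look like, assuming the only check is that the new array's shape matches the raster's (count, height, width). CheckedRaster is a hypothetical stand-in for Raster, and a SimpleNamespace with a shape attribute stands in for a NumPy array so that the example runs with the standard library alone.

```python
from types import SimpleNamespace


class CheckedRaster:
    """Hypothetical stand-in for Raster with a validated `data` accessor."""

    def __init__(self, count: int, height: int, width: int):
        self.shape = (count, height, width)
        self._data = None  # raw array; None until loaded or assigned

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, new_data):
        # Works with any array-like exposing `.shape`, e.g. a NumPy array.
        new_shape = tuple(new_data.shape)
        if new_shape != self.shape:
            raise ValueError(
                f"New data has shape {new_shape}, expected {self.shape}"
            )
        self._data = new_data


raster = CheckedRaster(count=1, height=2, width=3)

# Stand-ins for arrays; in practice these would be NumPy arrays.
matching = SimpleNamespace(shape=(1, 2, 3))
mismatched = SimpleNamespace(shape=(1, 5, 5))

raster.data = matching  # accepted
try:
    raster.data = mismatched
    rejected = False
except ValueError:
    rejected = True
```

The same setter would be a natural place for nodata handling, since every assignment passes through it.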

Issue when loading a single band

When loading a single band with load (or self.ds.read), the output array is 2-dimensional, instead of 3-dimensional as in all other cases. This causes issues later on, for example in set_ndv.

Raster cannot open VRT files

Opening a VRT file with georaster.Raster fails with 'RasterioIOError: Read or write failed. No such file or directory'.
I am not sure exactly where the error is, because the traceback does not make sense.

Cannot import proj_tools

from geoutils import projtools

fails with:

ModuleNotFoundError: No module named 'shapely.ops.transform'; 'shapely.ops' is not a package
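
The wording of the error suggests that the module contains an import of the form "import shapely.ops.transform". In shapely, transform is a function inside the shapely.ops module, so it cannot be imported as a submodule; "from shapely.ops import transform" would be the usual form. This is an assumption, as the failing line is not shown in the report. The same error pattern can be reproduced with the standard library alone:

```python
import importlib

# `os.path.join` is a function inside the module `os.path`, just as
# `transform` is a function inside `shapely.ops`. Importing it as if it
# were a submodule fails because `os.path` is a module, not a package.
try:
    importlib.import_module("os.path.join")
except ModuleNotFoundError as err:
    message = str(err)
else:
    message = ""

print(message)
```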
