nmandery / h3ronpy
A data science toolkit for the H3 geospatial grid
Hi,
I need to find exact neighbor positions, and this can be done with the local IJ methods. In h3-py there are the methods experimental_h3_to_local_ij and experimental_local_ij_to_h3 (source code: https://github.com/uber/h3-py/blob/v3/src/h3/_cy/cells.pyx). It would be nice if we could have those methods in h3ronpy as well.
Hi,
I have a raster where a large area is represented by NaN values. However, the NaNs have a meaning (as in: it is not possible to get a result in that area), so I would like to index them. When using raster_to_dataframe, regardless of the value I give to nodata_value, the NaNs are not indexed. Is there anything I can do to be able to index the NaN areas?
Thanks
Jorge
Hi @nmandery, I have a large Polars DataFrame with a struct column of WGS84 latitude and longitude. I want to map the points to their corresponding H3 cells. Naturally, using the Uber h3 Python library with pl.struct(['lat', 'lng']).apply() is quite slow. I found your library, but it looks like it is only made to work with geo-type data. Is there a workaround outside of polars -> geopandas -> polars?
Good morning. First and foremost, congrats on this library, it is a joy to use! I've been playing around with the Polars functions and wondered if those could somehow be used with the expressions API in Polars. To do a group_by on the parent cell, one can do (maybe not the best approach at all):
(
    df.with_columns(
        pl.col("h3index").map_batches(lambda x: change_resolution(x, h3res))
    )
    .group_by("h3index")
    .agg(pl.col("value").sum())
)
But if we could somehow make change_resolution part of the expressions API, this could be done like:
(
    df.with_columns(
        pl.col("h3index").h3.change_resolution(h3res)
    )
    .group_by("h3index")
    .agg(pl.col("value").sum())
)
My first question is whether my assumption is correct that the Polars functions in this lib must be treated as user-defined functions in order to integrate them into Polars, or whether there is a better way to use h3ronpy's functions in Polars that I'm not seeing. Hmm, maybe this is more of a Polars question than an h3ronpy one, but anyway, here it is!
Thanks!
There are multiple functions for changing cell resolutions, but having one implementation which…
I was testing the library with different geometries, and I found out that points aren't properly matched to their H3 cells. I could filter out the points and convert them using the official H3 bindings, but maybe there is an option to handle them properly in this library, as a complete package. I haven't tested linestrings/multilinestrings and geometry collections yet.
from h3ronpy.pandas.vector import geometry_to_cells
from shapely.geometry import Point
import h3
# Manhattan Central Park
point = Point(-73.9575, 40.7938)
h3.int_to_str(geometry_to_cells(point, 8)[0])
# 8875588a83fffff - random cell near Null Island (0, 0)
h3.latlng_to_cell(point.y, point.x, 8)
# 882a1008d7fffff - proper cell
Currently blocked by readthedocs/readthedocs.org#10466
While messing with GitHub Actions is mostly no fun, it would be good to provide binaries for all OSes and architectures pyarrow supports: https://pypi.org/project/pyarrow/#files
The workflow of wonnx seems to be a good example: https://github.com/webonnx/wonnx/blob/master/.github/workflows/python-package.yml
@nmandery when executing
geodataframe_to_cells(gdf, H3_RESOLUTION, containment_mode=ContainmentMode.IntersectsBoundary)
shouldn't at least one H3 cell always be returned regardless of resolution? I'm trying to compute the set of H3 cells that completely covers the geometries in gdf. I have an example running in a notebook where an empty geodataframe is returned if the H3 resolution goes below a certain threshold. This is not what I expected when the IntersectsBoundary containment mode is selected.
I am a new Polars user, and I am curious: how do I use the coordinates_to_cells function in a lazy context? If I do what I think needs to be done, I get the error TypeError: 'Expr' object is not iterable. I can achieve my goal the eager way, but I'm hoping I can do this with the lazy API?
import polars as pl
from h3ronpy.polars.vector import coordinates_to_cells
# Sample Polars DataFrame with latitude and longitude
data = {
"x": [-74.0060, -118.2437, -87.6298], # 'x' for longitude
"y": [40.7128, 34.0522, 41.8781], # 'y' for latitude
}
res = 8
df = (
pl.DataFrame(data)
.lazy()
.with_columns(
coordinates_to_cells(pl.col("x"), pl.col("y"), resarray=res)
.h3.cells_to_string()
.alias(f"h3_{res}")
)
)
Importing h3ronpy in a Jupyter notebook (version 7) is causing a kernel failure on my M1 MacBook Pro. I'm not totally surprised now that I'm no longer running my Python stack in emulation mode. I dropped down to a terminal and did the import at an iPython prompt and got the following message:
zsh: illegal hardware instruction ipython
The import is working without issue on my Intel-based iMac so I'll proceed there, but wanted to share this.
Currently only manylinux_2_24 wheels and the source distribution are pushed to PyPI. As the installation of the GDAL dependencies is a bit hairy, this could maybe be implemented using miniconda.
I get this error while testing the uploaded example. It appears to happen in this line:
vegetation_h3_df.plot(column="value", linewidth=0.2, edgecolor="black", **vegetation_plot_args)
CALLBACK error:
OSError Traceback (most recent call last)
in
6
7 print("plotting ... this may take a bit")
----> 8 vegetation_h3_df.plot(column="value", linewidth=0.2, edgecolor="black", **vegetation_plot_args)
9 pyplot.show()
~\AppData\Roaming\Python\Python38\site-packages\geopandas\plotting.py in __call__(self, *args, **kwargs)
948 kind = kwargs.pop("kind", "geo")
949 if kind == "geo":
--> 950 return plot_dataframe(data, *args, **kwargs)
951 if kind in self._pandas_kinds:
952 # Access pandas plots
~\AppData\Roaming\Python\Python38\site-packages\geopandas\plotting.py in plot_dataframe(df, column, cmap, color, ax, cax, categorical, legend, scheme, k, vmin, vmax, markersize, figsize, legend_kwds, categories, classification_kwds, missing_kwds, aspect, **style_kwds)
663 if aspect == "auto":
664 if df.crs and df.crs.is_geographic:
--> 665 bounds = df.total_bounds
666 y_coord = np.mean([bounds[1], bounds[3]])
667 ax.set_aspect(1 / np.cos(y_coord * np.pi / 180))
~\AppData\Roaming\Python\Python38\site-packages\geopandas\base.py in total_bounds(self)
2582 array([ 0., -1., 3., 2.])
2583 """
-> 2584 return GeometryArray(self.geometry.values).total_bounds
2585
2586 @property
~\AppData\Roaming\Python\Python38\site-packages\geopandas\array.py in total_bounds(self)
913 # TODO with numpy >= 1.15, the 'initial' argument can be used
914 return np.array([np.nan, np.nan, np.nan, np.nan])
--> 915 b = self.bounds
916 return np.array(
917 (
~\AppData\Roaming\Python\Python38\site-packages\geopandas\array.py in bounds(self)
905 @property
906 def bounds(self):
--> 907 return vectorized.bounds(self.data)
908
909 @property
~\AppData\Roaming\Python\Python38\site-packages\geopandas\_vectorized.py in bounds(data)
935 # as those return an empty tuple, not resulting in a 2D array
936 bounds = np.array(
--> 937 [
938 geom.bounds
939 if not (geom is None or geom.is_empty)
~\AppData\Roaming\Python\Python38\site-packages\geopandas\_vectorized.py in <listcomp>(.0)
936 bounds = np.array(
937 [
--> 938 geom.bounds
939 if not (geom is None or geom.is_empty)
940 else (np.nan, np.nan, np.nan, np.nan)
~\AppData\Roaming\Python\Python38\site-packages\shapely\geometry\base.py in bounds(self)
473 return ()
474 else:
--> 475 return self.impl['bounds'](self)
476
477 @property
~\AppData\Roaming\Python\Python38\site-packages\shapely\coords.py in __call__(self, this)
185 def __call__(self, this):
186 self._validate(this)
--> 187 env = this.envelope
188 if env.geom_type == 'Point':
189 return env.bounds
~\AppData\Roaming\Python\Python38\site-packages\shapely\geometry\base.py in envelope(self)
498 def envelope(self):
499 """A figure that envelopes the geometry"""
--> 500 return geom_factory(self.impl['envelope'](self))
501
502 @property
~\AppData\Roaming\Python\Python38\site-packages\shapely\topology.py in __call__(self, this, *args)
78 def __call__(self, this, *args):
79 self._validate(this)
---> 80 return self.fn(this._geom, *args)
OSError: exception: access violation reading 0x00000000000000A0
Any ideas for this?? Thank you!!
Hi! Sometimes I run into the same issue with global rasters that trigger the Input array spans more than the bounds of WGS84 - input needs to be in WGS84 projection with lat/lon coordinates exception. In this example I'm using a global population raster from SEDAC that has this shape and transform:
>>> src.shape
(4320, 8640)
>>> src.transform
Affine(0.0416666666666667, 0.0, -180.0,
0.0, -0.0416666666666667, 89.99999999999994)
Looking at the source of check_wgs84_bounds, I can't see any issue with the checks, but when I do the computation manually the floating-point curse strikes:
>>> transform.a * shape[1]
360.0000000000003
>>> transform.a * shape[0]
180.00000000000014
Both values will fail the check in check_wgs84_bounds. I did not debug the Rust code, but I bet something like this is happening under the hood. Do you know any workaround to apply to the user inputs to avoid this issue (like casting the transform to ints or something)? What I normally do is clip a few border pixels, but that is not ideal.
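The overshoot is reproducible with plain Python floats, and exact rational arithmetic (a standard-library sketch of a possible user-side check, not anything h3ronpy provides) shows the raster really is meant to span exactly 360 degrees, so snapping the transform before calling h3ronpy, or a small epsilon in the bounds check, would both be viable fixes:

```python
from fractions import Fraction

width = 8640
a_reported = 0.0416666666666667  # transform.a as printed above

# The floating-point product overshoots the 360 degree extent slightly
print(a_reported * width)  # 360.0000000000003 -> fails the bounds check

# Exact rational arithmetic with the intended pixel size (1/24 degree)
# shows the raster spans exactly 360 degrees
a_exact = Fraction(360, width)
assert a_exact * width == 360
```

The stored pixel size is a rounded decimal approximation of 1/24, so the error is baked into the transform itself, not just into the multiplication.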
Thank you!!
>>> import polars as pl
>>> from h3ronpy.polars import cells_parse
>>> cells_parse(pl.Series(["801ffffffffffff"]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nico/.cache/pypoetry/virtualenvs/basp-ingest-bc7lia5F-py3.11/lib/python3.10/site-packages/h3ronpy/polars/__init__.py", line 20, in wrapper
result = func(*args, **kw)
File "/home/nico/.cache/pypoetry/virtualenvs/basp-ingest-bc7lia5F-py3.11/lib/python3.10/site-packages/h3ronpy/arrow/__init__.py", line 66, in cells_parse
return op.cells_parse(_to_arrow_array(arr, pa.utf8()), set_failing_to_invalid=set_failing_to_invalid)
ValueError: Expected arrow2::array::utf8::Utf8Array<i32>, found arrow array of type LargeUtf8
>>> cells_parse(pl.Series(["801ffffffffffff"]).to_arrow())
shape: (1,)
Series: '' [u64]
[
577023702256844799
]
... once nmandery/h3arrow#3 has been merged.
Hi. Firstly, this is a great library for working with H3 indexes. However, I ran into a weird error when trying to pip install h3ronpy on Python 3.7. The following are the last lines of the error:
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'error'
error: subprocess-exited-with-error
× Preparing metadata (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [19 lines of output]
Checking for Rust toolchain....
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in
main()
File "/home/airflow/.local/lib/python3.7/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/home/airflow/.local/lib/python3.7/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 149, in prepare_metadata_for_build_wheel
return hook(metadata_directory, config_settings)
File "/tmp/pip-build-env-4_vsukfz/overlay/lib/python3.7/site-packages/maturin/init.py", line 140, in prepare_metadata_for_build_wheel
output = subprocess.check_output(["cargo", "--version"]).decode(
File "/usr/local/lib/python3.7/subprocess.py", line 411, in check_output
**kwargs).stdout
File "/usr/local/lib/python3.7/subprocess.py", line 488, in run
with Popen(*popenargs, **kwargs) as process:
File "/usr/local/lib/python3.7/subprocess.py", line 800, in init
restore_signals, start_new_session)
File "/usr/local/lib/python3.7/subprocess.py", line 1551, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
PermissionError: [Errno 13] Permission denied: 'cargo'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
I found this weird because I could install the package on Python 3.9 without any issues. Even though the error says Permission denied: 'cargo', I think there is a Python dependency issue.
Thanks!
I am encountering an issue with inconsistent hexagon resolution in the H3ron Uber H3 Raster library. Hexagons generated within the same image exhibit varying sizes, hindering accurate data analysis and interpretation.
I request your assistance in resolving this matter and achieving a uniform hex resolution throughout the image. Any guidance or support you can provide would be greatly appreciated. I am also willing to share examples or code snippets to aid in troubleshooting.
Could this library do the reverse operation for raster conversion? I.e., given a set of H3 indices (at the same, or perhaps mixed, resolutions) and some property (e.g. elevation), produce a raster output.
Naïvely, this could be through conversion of the H3 set to a consistent resolution, conversion to geo boundaries, and then rasterisation. However I'm particularly thinking about a hypothetically efficient approach that avoids expanding the compacted (mixed-resolution) set of H3 indices to a common resolution before rasterisation (if this is actually possible).
I think one way to approach this would be to find the set of pixels covering the region (extent of the input H3 set), compute each pixel's H3 index at the highest appropriate resolution (the highest resolution cell in the input set), and then for each H3 cell in the input set, find the pixels that intersect using the H3 API. (There'd be undefined behaviour if the input H3 set included overlaps.) This avoids having to uncompact the input set, but whether or not that's actually less efficient than computing the H3 index for each pixel is unclear to me.
I'm not even convinced that this idea makes sense, given H3's hierarchical non-containment (children more than 2 resolutions below a parent may lie entirely outside the (grand)parent's boundary). But since the conversion raster → H3 set is possible and sensible, I assume the reverse is, too.
Hi Nico,
when attempting to execute the module, the core dumps due to illegal instruction. (Possibly related to #26)
>>> import geopandas as gpd
>>> from h3ronpy.pandas.vector import geodataframe_to_cells
>>> shape = gpd.read_file("admin_bavaria.gpkg")
>>> shape
GID_1 GID_0 COUNTRY NAME_1 VARNAME_1 NL_NAME_1 TYPE_1 ENGTYPE_1 CC_1 HASC_1 ISO_1 area_adm1 geometry
0 DEU.2_1 DEU Germany Bayern Bavaria NA Freistaat Free State 09 DE.BY DE-BY 70531.935 MULTIPOLYGON (((8.92500 50.10593, 8.92493 50.1...
>>> df = geodataframe_to_cells(shape,3)
Illegal instruction (core dumped)
I'm using h3ronpy version 0.19.1 and installed it with pip version 22.0.2. The OS is Ubuntu:
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
However, when installing the package from source, the error does not occur. Do you have an idea how to fix this issue when using pip install?
Best,
Johanna
Hi there!
Using raster.raster_to_dataframe throws an error in h3ronpy's raster.py, RuntimeError: operation failed, at
values, indexes = func(in_raster, _get_transform(transform), h3_resolution, axis_order, compacted, nodata_value)
The above line calls the raster_to_h3 function in raster.rs. The raster_to_h3 function instantiates the H3Converter class from h3ron_ndarray, and there the to_h3 function is called (link to GitHub source). The error thrown is defined in h3ron/h3ron/src/error.rs, and the to_h3 function creates a HashMap in let mut h3_map = HashMap::default();, imported from h3ron/h3ron/src/collections/mod.rs, see this line.
But I don't know how the error is thrown or where I went off path. I am new to Rust and would be glad for pointers on how to resolve this :)
It does not happen with every GeoTIFF, but it crashes on Sentinel satellite data. I set up a Jupyter notebook for others to reproduce my error: https://github.com/vanyabrucker/h3ronpy-issue-21
Thanks!
I executed the following command on a dataset I'm working with and saw the following error. Could anyone provide some guidance on what might be the issue here? Any pointers would be greatly appreciated!
ga_cells_df = geodataframe_to_cells(ga_gdf, 10)
---------------------------------------------------------------------------
ArrowIndexError Traceback (most recent call last)
File <timed exec>:1
File ~/opt/miniconda3/envs/viz-prototyping/lib/python3.11/site-packages/h3ronpy/pandas/vector.py:124, in geodataframe_to_cells(gdf, resolution, containment_mode, compact, cell_column_name, all_intersecting)
116 cells = _av.wkb_to_cells(
117 gdf.geometry.to_wkb(),
118 resolution,
(...)
121 all_intersecting=all_intersecting,
122 )
123 table = pa.Table.from_pandas(pd.DataFrame(gdf.drop(columns="geometry"))).append_column(cell_column_name, cells)
--> 124 return _arrow_util.explode_table_include_null(table, cell_column_name).to_pandas().reset_index(drop=True)
File ~/opt/miniconda3/envs/viz-prototyping/lib/python3.11/site-packages/h3ronpy/arrow/util.py:10, in explode_table_include_null(table, column)
8 other_columns.remove(column)
9 indices = pc.list_parent_indices(pc.fill_null(table[column], [None]))
---> 10 result = table.select(other_columns).take(indices)
11 result = result.append_column(
12 pa.field(column, table.schema.field(column).type.value_type),
13 pc.list_flatten(pc.fill_null(table[column], [None])),
14 )
15 return result
File ~/opt/miniconda3/envs/viz-prototyping/lib/python3.11/site-packages/pyarrow/table.pxi:2005, in pyarrow.lib._Tabular.take()
File ~/opt/miniconda3/envs/viz-prototyping/lib/python3.11/site-packages/pyarrow/compute.py:486, in take(data, indices, boundscheck, memory_pool)
446 """
447 Select values (or records) from array- or table-like data given integer
448 selection indices.
(...)
483 ]
484 """
485 options = TakeOptions(boundscheck=boundscheck)
--> 486 return call_function('take', [data, indices], options, memory_pool)
File ~/opt/miniconda3/envs/viz-prototyping/lib/python3.11/site-packages/pyarrow/_compute.pyx:590, in pyarrow._compute.call_function()
File ~/opt/miniconda3/envs/viz-prototyping/lib/python3.11/site-packages/pyarrow/_compute.pyx:385, in pyarrow._compute.Function.call()
File ~/opt/miniconda3/envs/viz-prototyping/lib/python3.11/site-packages/pyarrow/error.pxi:154, in pyarrow.lib.pyarrow_internal_check_status()
File ~/opt/miniconda3/envs/viz-prototyping/lib/python3.11/site-packages/pyarrow/error.pxi:91, in pyarrow.lib.check_status()
ArrowIndexError: Negative buffer slice length