
holoviz / spatialpandas

Pandas extension arrays for spatial/geometric operations

License: BSD 2-Clause "Simplified" License

Python 100.00%
holoviz spatialpandas pandas geopandas geographic-data

spatialpandas's Introduction



Pandas and Dask extensions for vectorized spatial and geometric operations.


Spatialpandas provides Pandas and Dask extensions for vectorized spatial and geometric operations, such as fast, spatially indexed rendering of large collections of polygons, lines, or points. For more information, see the overview notebook and the design document.

spatialpandas's People

Contributors

brl0, gsakkis, hoxbro, iameskild, ianthomas23, jbednar, jonmmease, jrbourbeau, maximlt, philipc2, philippjfr, weiji14

spatialpandas's Issues

Set up code coverage

Set up code coverage in a similar way to other HoloViz projects. Would like to know how much of the source code is touched by the tests.

SpatialPandas design and features

spatialpandas

Pandas and Dask extensions for vectorized spatial and geometric operations.

This proposal is a plan towards extracting the functionality of the spatial/geometric utilities developed in Datashader into this separate, general-purpose library. Some of the functionality here has now (as of mid-2020) been implemented as marked below, but much remains to be done!

Goals

This project has several goals:

  1. Provide a set of pandas ExtensionArrays for storing columns of discrete geometric objects (Points, Polygons, etc.). Unlike the GeoPandas GeoSeries, there will be a separate extension array for each geometry type (PointsArray, PolygonsArray, etc.), and the underlying representation for the entire array will be stored in a single contiguous ragged array that is suitable for fast processing using numba.
  2. (partially done; see below) Provide a collection of vectorized geometric operations on these extension arrays, including most of the same operations that are included in shapely/geopandas, but implemented in numba rather than relying on the GEOS and libspatialindex C++ libraries. These vectorized Numba implementations will support CPU thread-based parallelization, and will also provide the foundation for future GPU support.
  3. Provide Dask DataFrame extensions for spatially partitioning DataFrames containing geometry columns. Also provide Dask DataFrame/Series implementations of the same geometric operations, but that also take advantage of spatial partitioning. This would effectively replace the Datashader SpatialPointsFrame and would support all geometry types, not only points.
  4. Provide round-trip serialization of pandas/dask data structures containing geometry columns to/from parquet. Writing a Dask DataFrame to a partitioned parquet file would optionally include Hilbert R-tree packing to optimize the partitions for later spatial access on read.
  5. Fast import/export to/from shapely and geopandas. This may rely on the pygeos library to interface directly with the GEOS objects, rather than calling shapely methods.

These features will make it very efficient for Datashader to process large geometry collections. They will also make it more efficient for HoloViews to perform linked selection on large spatially distributed datasets.
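The "single contiguous ragged array" in goal 1 can be pictured with a small NumPy sketch. This is only an illustration of the general idea (one flat coordinate buffer plus an offsets array), not the library's actual internal layout; the names `rings`, `flat`, `offsets`, and `ring_coords` are made up for this example.

```python
import numpy as np

# Three rings, each stored as a flat list x0, y0, x1, y1, ...
rings = [
    [0, 0, 1, 0, 1, 1, 0, 1, 0, 0],   # unit square, 5 vertices
    [2, 2, 5, 2, 5, 5, 2, 2],         # triangle, 4 vertices
    [0, 0, 2, 0, 2, 1, 0, 1, 0, 0],   # 2x1 rectangle, 5 vertices
]

# One contiguous coordinate buffer for the whole column ...
flat = np.concatenate([np.asarray(r, dtype=np.float64) for r in rings])
# ... plus offsets marking where each ring starts and ends.
offsets = np.cumsum([0] + [len(r) for r in rings])


def ring_coords(i):
    """Return ring i as an (n_vertices, 2) view into the shared buffer."""
    return flat[offsets[i]:offsets[i + 1]].reshape(-1, 2)
```

Because every element is a slice of the same buffer, a compiled loop can walk all geometries without touching Python objects, which is the property the goal relies on.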

Non-goals

spatialpandas will be focused on geometry only, not geography. As such:

  • No built-in support for loading data from geography-specific file formats
  • No dependency on GDAL/fiona
  • No coordinate reference frame logic

Features

Extension Arrays

The spatialpandas.geometry package will contain geometry classes and pandas extension arrays for each geometry type

  • Point/PointArray: single point / array of points, one point per element. Maps to shapely Point class.
  • MultiPoint/MultiPointArray: multiple points / array of points, multiple points per element. Maps to shapely MultiPoint class.
  • MultiLines/MultiLinesArray: One or more lines / array of lines, multiple lines per element. Maps to shapely LineString and MultiLineString classes.
  • Ring/RingArray: Single closed ring / array of rings, one per element. Maps to shapely LinearRing class.
  • Polygon/PolygonArray: One or more polygons, each with zero or more holes / array of polygons with holes, multiple per element. Maps to shapely Polygon/MultiPolygon classes.

Spatial Index

  • The spatialpandas.spatialindex module will contain a vectorized and parallel numba implementation of a Hilbert-RTree.

  • Each extension array has an sindex property that holds a lazily generated spatial index.
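To make the Hilbert ordering idea concrete, here is a minimal pure-Python sketch that maps grid cells to their distance along a Hilbert curve and sorts bounding boxes by the distance of their centers. It uses the standard textbook conversion and is not the numba implementation in spatialpandas.spatialindex; `hilbert_distance`, `hilbert_order`, and the grid size `n` are hypothetical names chosen for this example.

```python
def hilbert_distance(n, x, y):
    """Distance of integer cell (x, y) along a Hilbert curve on an n x n grid.

    n must be a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_order(bounds, n=16):
    """Indices that sort boxes (x0, y0, x1, y1) by the Hilbert distance of their centers."""
    xmin = min(b[0] for b in bounds)
    ymin = min(b[1] for b in bounds)
    xmax = max(b[2] for b in bounds)
    ymax = max(b[3] for b in bounds)

    def cell(value, lo, hi):
        # Map a coordinate to an integer grid cell in [0, n - 1].
        if hi == lo:
            return 0
        return min(n - 1, int((value - lo) / (hi - lo) * n))

    keys = []
    for b in bounds:
        cx = cell((b[0] + b[2]) / 2, xmin, xmax)
        cy = cell((b[1] + b[3]) / 2, ymin, ymax)
        keys.append(hilbert_distance(n, cx, cy))
    return sorted(range(len(bounds)), key=keys.__getitem__)


order = hilbert_order([(0, 0, 1, 1), (9, 9, 10, 10), (0, 9, 1, 10)], n=4)
```

Boxes that end up adjacent in this ordering tend to be close in space, which is why packing an R-tree (or parquet partitions) in Hilbert order keeps node bounding boxes small.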

Extension Array Geometric Operations

The extension arrays above will have methods/properties for shapely-compatible geometric operations. These are implemented as parallel vectorized numba functions. Supportable operations include:

  • area
  • length
  • bounds
  • boundary
  • buffer
  • centroid
  • convex_hull
  • covers
  • contains
  • crosses
  • difference
  • disjoint
  • distance
  • envelope
  • exterior
  • hausdorff_distance
  • interpolate
  • intersection
  • intersects_bounds
  • minimum_rotated_rectangle
  • overlaps
  • project
  • simplify
  • union
  • unary_union
  • affine_transform
  • rotate
  • scale
  • skew
  • translate
  • sjoin (spatial join) (see tools/sjoin) (partial, see #21)

Only a minimal subset of these will be implemented by the core developers, but others can be added relatively easily by users and other contributors by starting with one of the implemented methods as an example, then adding code from other published libraries (but Numba-ized and Dask-ified if possible!).
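As a rough illustration of what one such operation might look like, the sketch below computes the unsigned area of each closed ring stored in a flat coordinate buffer with offsets, using the shoelace formula. It is written with plain NumPy so that it runs anywhere; the loop style is the kind that is usually amenable to numba compilation, but that is an assumption and is not exercised here. The function name `ragged_ring_areas` and the buffer layout are hypothetical.

```python
import numpy as np


def ragged_ring_areas(flat, offsets):
    """Unsigned shoelace area of each closed ring in a flat x0, y0, x1, y1, ... buffer."""
    n_rings = len(offsets) - 1
    areas = np.empty(n_rings, dtype=np.float64)
    for i in range(n_rings):
        xy = flat[offsets[i]:offsets[i + 1]].reshape(-1, 2)
        x, y = xy[:, 0], xy[:, 1]
        # Rings are assumed closed: the first vertex is repeated as the last.
        areas[i] = 0.5 * abs(np.sum(x[:-1] * y[1:] - x[1:] * y[:-1]))
    return areas


flat = np.array(
    [0, 0, 1, 0, 1, 1, 0, 1, 0, 0,    # square from (0, 0) to (1, 1)
     2, 2, 5, 2, 5, 5, 2, 5, 2, 2],   # square from (2, 2) to (5, 5)
    dtype=np.float64,
)
offsets = np.array([0, 10, 20])
areas = ragged_ring_areas(flat, offsets)
```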

Pandas accessors

A custom pandas Series accessor is included to expose these geometric operations at the Series level. E.g.

  • df.column.spatial.area returns a pandas Series with the same index as df, containing area values.
  • df.column.spatial.cx is a geopandas-style spatial indexer for filtering a series spatially using a spatial index.

A custom pandas DataFrame accessor is included to track the current "active" geometry column for a DataFrame and to provide DataFrame-level operations. E.g.

  • df.spatial.cx will filter a dataframe spatially based on the current active geometry column.
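The accessor mechanism itself is standard pandas. The sketch below registers a toy DataFrame accessor that filters rows by a bounding box; it only shows how an accessor namespace is attached, and is not the spatialpandas `spatial` accessor. The accessor name `toyspatial`, the method `within`, and the plain `x`/`y` columns are assumptions made for this example.

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("toyspatial")
class ToySpatialAccessor:
    """Toy accessor: exposes spatial helpers under df.toyspatial."""

    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    def within(self, x0, y0, x1, y1):
        """Return the rows whose (x, y) point lies inside the given box (inclusive)."""
        df = self._obj
        mask = df["x"].between(x0, x1) & df["y"].between(y0, y1)
        return df[mask]


df = pd.DataFrame({"x": [0, 2, 5], "y": [0, 2, 5]})
inside = df.toyspatial.within(1, 1, 3, 3)
```

Keeping the operations under an accessor rather than on a DataFrame subclass means ordinary pandas operations keep returning ordinary DataFrames, at the cost of the extra `.toyspatial` (or, in the library, `.spatial`) prefix.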

Dask accessors

A custom Dask Series accessor is included to expose geometric operations on a Dask geometry Series. E.g.

  • ddf.column.spatial.area returns a dask Series with the same index as ddf, containing area values.

  • The accessor also holds a spatial index of the bounding boxes of the shapes in each partition. This allows spatial operations (e.g. cx, spatial join) to skip entire partitions that cannot contain relevant geometries.

  • A custom Dask DataFrame accessor is included that is exactly analogous to the pandas version.
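The partition-skipping idea can be sketched in a few lines of plain Python: given one bounding box per partition, only partitions whose box intersects the query box need to be read. This is a simplified linear scan for illustration, not the partition-level spatial index the accessor holds; `overlaps` and `partitions_to_read` are hypothetical names.

```python
def overlaps(a, b):
    """True if bounding boxes a and b, each (x0, y0, x1, y1), intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partitions_to_read(partition_bounds, query):
    """Indices of the partitions whose bounding box intersects the query box."""
    return [i for i, box in enumerate(partition_bounds) if overlaps(box, query)]


# One bounding box per partition.
partition_bounds = [(0, 0, 10, 10), (10, 0, 20, 10), (0, 10, 10, 20)]
selected = partitions_to_read(partition_bounds, (12, 2, 15, 5))
```

The fewer partitions a query box touches, the more work is skipped, which is why spatially coherent partitioning matters for this scheme.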

Conversions

  • Fast conversions to and from geopandas will be provided, potentially relying on pygeos.

Parquet support

  • read/to parquet for Pandas DataFrames will be able to rely on the standard pandas parquet functions with extension array support.

  • Special read/to parquet functions will be provided for Dask DataFrames.

  • to_parquet will add extra metadata to the parquet file to specify the geometric bounds of each partition. There will also be the option for to_parquet to use Hilbert R-tree packing to optimize the partitions for later spatial access on read.

  • read_parquet will read this partition bounding-box metadata and use it to pre-populate the spatial accessor for each geometry column with a partition-level spatial index.

Compatibility

  • We would aim to use consistent naming between geopandas and spatialpandas whenever possible. Since spatialpandas will rely on a DataFrame accessor rather than a DataFrame subclass, spatial properties and methods are under the accessor rather than on the DataFrame.

For example, geodf.cx becomes spdf.spatial.cx.

  • Eventually, a subclass compatibility layer could likely be added that would simply dispatch certain class methods/properties to the spatial accessor.

Testing

  • The test suite will rely heavily on property testing (potentially with hypothesis) to compare the spatialpandas results to geopandas/shapely results.

error running Spatialpandas hvplot example

I had some issues running the example that uses spatialpandas and hvplot to plot a million geometries. This is the code I am running:

spd_world = spd.GeoDataFrame(world)
spd_world.hvplot(datashade=True, project=True, aggregator=ds.count_cat('continent'), color_key='Category10')

and the error I got is:

WARNING:param.dynamic_operation: Callable raised "TypeError("Cannot interpret 'MultiPolygonDtype(float64)' as a data type")".
TypeError: Cannot interpret 'MultiPolygonDtype(float64)' as a data type

My packages are:
geopandas 0.8.1
datashader 0.12.0
holoviews 1.14.1
spatialpandas 0.3.6
hvplot 0.7.0

I am not sure how to resolve this issue. Any suggestions, please?

Support datashader.spatial.points.to_parquet with pyarrow

PR holoviz/datashader#702 introduced support for spatially indexing Dask dataframes and writing them out as parquet files with custom spatial metadata using the datashader.spatial.points.to_parquet function.

To accomplish this, the parquet file is initially written out using dask's dask.dataframe.io.to_parquet function. Then the parquet file is opened with fastparquet directly. The parquet metadata is retrieved using fastparquet, the spatial metadata is added, and then the updated metadata is written back to the file.

In order to support the creation of spatially partitioned parquet files using pyarrow (rather than fastparquet), we would need to work out a similar approach to adding properties to the parquet metadata using the pyarrow parquet API.

Warning: Creating an ndarray from ragged nested sequences is deprecated

This warning occurs numerous times during testing.

tests/test_fixedextensionarray.py: 191 warnings
tests/test_geodataframe.py: 1 warning
tests/geometry/test_to_geopandas.py: 2 warnings
/home/travis/miniconda/envs/test-environment/lib/python3.7/site-packages/numpy/core/_asarray.py:83: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
return array(a, dtype, copy=False, order=order)

This occurs with the latest version of spatialpandas, and numpy 1.19.1.
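For reference, the pattern NumPy asks for looks like the sketch below: request an object array explicitly, or fill a preallocated object array element by element. Whether this is the appropriate fix inside the spatialpandas test suite is not established here; the variable names are illustrative.

```python
import numpy as np

ragged = [[0.0, 1.0, 2.0], [3.0, 4.0]]

# Explicit dtype=object: no deprecation warning, one Python list per element.
arr = np.array(ragged, dtype=object)

# Filling a preallocated object array also works when the inner sequences
# happen to have equal lengths, where np.array(..., dtype=object) would
# instead build a 2-D array.
filled = np.empty(len(ragged), dtype=object)
for i, item in enumerate(ragged):
    filled[i] = item
```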

Faster construction of `PolygonArray` and `MultiPolygonArray`

I am working with a relatively large unstructured grid (57,000,000 polygons) that I am aiming to visualize. I am storing my geometries in a spatialpandas.GeoDataFrame with a single MultiPolygonArray column. However, constructing it is about 6-7x slower than constructing a geopandas.GeoDataFrame that stores the same data.

Code / Algorithm

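import uxarray as ux
import numpy as np
import spatialpandas
from spatialpandas.geometry import MultiPolygonArray  # used below but missing from the original imports
import geopandas
import antimeridian
from shapely import polygons as Polygons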

def to_gdf(uxgrid):

    # construct an array of shapely polygons
    polygons = Polygons(uxgrid.polygon_shells)
    
    # indices of polygons that need to be corrected for the antimeridian case
    antimeridian_indices = np.argwhere(np.any(np.abs(np.diff(uxgrid.polygon_shells[:, :, 0])) >= 180, axis=1)).squeeze()
    
    # obtain antimeridian polygons
    antimeridian_polygons = polygons[antimeridian_indices]
    
    # correct antimeridian polygons, stored as MultiPolygons
    corrected_polygons = [antimeridian.fix_polygon(P) for P in antimeridian_polygons]
    
    # replace original polygon with the corrected version
    for i in reversed(antimeridian_indices):
        polygons[i] = corrected_polygons.pop()
    
    ## ---- spatialpandas
    geometry = MultiPolygonArray(polygons)
    gdf = spatialpandas.GeoDataFrame({"geometry": geometry})  

    ## ---- geopandas
    # gdf = geopandas.GeoDataFrame({"geometry": polygons})  
    
    return gdf

Timings

Number of polygons   1,791    7,153    28,57    236,853   56,623,106
SpatialPandas        0.117s   0.401s   1.61s    12.4s     2851s
GeoPandas            0.043s   0.111s   0.553s   2.53s     434s

Timings were taken on a single NCAR Casper node with 256 GB of memory (the dataset fit into memory each time; no Dask was used).

Question

My key question is whether there's a suggested or faster way of constructing PolygonArray and MultiPolygonArray or better ways of handling such large datasets. Thanks!

error when importing DaskGeoDataFrame

The Dask release v2021.05.1 (PR dask/dask#7503) broke importing DaskGeoDataFrame.

Screenshot from 2021-06-01 10-15-57

Only the make_meta and meta_nonempty functions were moved to the dispatch module; make_array_nonempty was not being imported directly from the source module.

Previous spatialpandas releases will probably have the same problem.
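A common way to stay compatible when a dependency moves a function between modules is to try several import locations in order. The helper below is a generic sketch of that pattern, demonstrated with standard-library modules only; `import_first` is a hypothetical name, and the actual Dask module paths to try are deliberately not filled in because the issue does not spell them out.

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that provides it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            # Module does not exist in this version; try the next candidate.
            continue
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{name!r} not found in any of {module_paths}")


# Demonstration with the standard library: the first candidate does not exist,
# so the helper falls back to the second one.
OrderedDict = import_first("OrderedDict", ["no_such_module_for_demo", "collections"])
```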

dask `2021.8.0` breaks parquet io

Opening issue here for visibility and to track any potential updates.

ALL software version info

(this library, plus any other relevant software, e.g. bokeh, python, notebook, OS, browser, etc)

python 
3.8.10

spatialpandas
0.4.3

dask
2021.8.0

Description of expected behavior and the observed behavior

Reading and writing to parquet is producing errors when used with the latest version of dask 2021.8.0. I have tested reverting back to dask 2021.7.2 and do not experience any of the issues outlined below.

The dask team may already be aware of this issue:

Complete, minimal, self-contained example code that reproduces the issue

import dask
import spatialpandas as spd
from spatialpandas.io import read_parquet_dask

path = "s3://bucketname/data.parquet"
storage_options = {"key": "<key>", "secret": "<secret>"}  # replace the placeholders with real credentials

ddf = read_parquet_dask(path, storage_options=storage_options)

The code block above and ddf.pack_partitions_to_parquet both produce the following error:

Traceback (most recent call last):
  File "/app/datum/readers/quadrant.py", line 106, in sort_quadrant_files
    ddf_packed = ddf.pack_partitions_to_parquet(
  File "/opt/conda/envs/datum/lib/python3.8/site-packages/spatialpandas/dask.py", line 530, in pack_partitions_to_parquet
    return read_parquet_dask(
  File "/opt/conda/envs/datum/lib/python3.8/site-packages/spatialpandas/io/parquet.py", line 321, in read_parquet_dask
    result = _perform_read_parquet_dask(
  File "/opt/conda/envs/datum/lib/python3.8/site-packages/spatialpandas/io/parquet.py", line 421, in _perform_read_parquet_dask
    meta = dd_read_parquet(
  File "/opt/conda/envs/datum/lib/python3.8/site-packages/dask/dataframe/io/parquet/core.py", line 313, in read_parquet
    read_metadata_result = engine.read_metadata(
  File "/opt/conda/envs/datum/lib/python3.8/site-packages/dask/dataframe/io/parquet/arrow.py", line 537, in read_metadata
    ) = cls._gather_metadata(
  File "/opt/conda/envs/datum/lib/python3.8/site-packages/dask/dataframe/io/parquet/arrow.py", line 1035, in _gather_metadata
    ds = pa_ds.parquet_dataset(
  File "/opt/conda/envs/datum/lib/python3.8/site-packages/pyarrow/dataset.py", line 457, in parquet_dataset
    factory = ParquetDatasetFactory(
  File "pyarrow/_dataset.pyx", line 2043, in pyarrow._dataset.ParquetDatasetFactory.__init__
  File "pyarrow/error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 84, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Extracting file path from RowGroup failed. The column chunks' file paths should be set, but got an empty file path.

`RecursionError` After Pandas 2.1.0 Release

Description of expected behavior and the observed behavior

After the Pandas 2.1.0 release, construction of GeoDataFrames runs into a RecursionError.

(spatialpandas-pandas-210) bash-4.2$ conda list
# packages in environment at /glade/work/philipc/conda-envs/spatialpandas-pandas-210:
#
# Name Version Build Channel

_libgcc_mutex 0.1 conda_forge conda-forge

_openmp_mutex 4.5 2_gnu conda-forge

asttokens 2.2.1 pyhd8ed1ab_0 conda-forge

aws-c-auth 0.7.3 he2921ad_3 conda-forge

aws-c-cal 0.6.2 hc309b26_0 conda-forge

aws-c-common 0.9.0 hd590300_0 conda-forge

aws-c-compression 0.2.17 h4d4d85c_2 conda-forge

aws-c-event-stream 0.3.2 h2e3709c_0 conda-forge

aws-c-http 0.7.12 hc865f51_1 conda-forge

aws-c-io 0.13.32 h019f825_2 conda-forge

aws-c-mqtt 0.9.5 h3a0376c_1 conda-forge

aws-c-s3 0.3.14 h1678ad6_3 conda-forge

aws-c-sdkutils 0.1.12 h4d4d85c_1 conda-forge

aws-checksums 0.1.17 h4d4d85c_1 conda-forge

aws-crt-cpp 0.23.0 h40cdbb9_5 conda-forge

aws-sdk-cpp 1.10.57 h6f6b8fa_21 conda-forge

backcall 0.2.0 pyh9f0ad1d_0 conda-forge

backports 1.0 pyhd8ed1ab_3 conda-forge

backports.functools_lru_cache 1.6.5 pyhd8ed1ab_0 conda-forge

bokeh 3.2.2 pyhd8ed1ab_0 conda-forge

brotli-python 1.0.9 py311ha362b79_9 conda-forge

bzip2 1.0.8 h7f98852_4 conda-forge

c-ares 1.19.1 hd590300_0 conda-forge

ca-certificates 2023.7.22 hbcca054_0 conda-forge

click 8.1.7 unix_pyh707e725_0 conda-forge

cloudpickle 2.2.1 pyhd8ed1ab_0 conda-forge

comm 0.1.4 pyhd8ed1ab_0 conda-forge

contourpy 1.1.0 py311h9547e67_0 conda-forge

cytoolz 0.12.2 py311h459d7ec_0 conda-forge

dask 2023.8.1 pyhd8ed1ab_0 conda-forge

dask-core 2023.8.1 pyhd8ed1ab_0 conda-forge

debugpy 1.6.8 py311hb755f60_0 conda-forge

decorator 5.1.1 pyhd8ed1ab_0 conda-forge

distributed 2023.8.1 pyhd8ed1ab_0 conda-forge

executing 1.2.0 pyhd8ed1ab_0 conda-forge

freetype 2.12.1 hca18f0e_1 conda-forge

fsspec 2023.6.0 pyh1a96a4e_0 conda-forge

gflags 2.2.2 he1b5a44_1004 conda-forge

glog 0.6.0 h6f12383_0 conda-forge

importlib-metadata 6.8.0 pyha770c72_0 conda-forge

importlib_metadata 6.8.0 hd8ed1ab_0 conda-forge

ipykernel 6.25.1 pyh71e2992_0 conda-forge

ipython 8.14.0 pyh41d4057_0 conda-forge

jedi 0.19.0 pyhd8ed1ab_0 conda-forge

jinja2 3.1.2 pyhd8ed1ab_1 conda-forge

jupyter_client 8.3.1 pyhd8ed1ab_0 conda-forge

jupyter_core 5.3.1 py311h38be061_0 conda-forge

keyutils 1.6.1 h166bdaf_0 conda-forge

krb5 1.21.2 h659d440_0 conda-forge

lcms2 2.15 haa2dc70_1 conda-forge

ld_impl_linux-64 2.40 h41732ed_0 conda-forge

lerc 4.0.0 h27087fc_0 conda-forge

libabseil 20230125.3 cxx17_h59595ed_0 conda-forge

libarrow 13.0.0 hb9dc469_0_cpu conda-forge

libblas 3.9.0 17_linux64_openblas conda-forge

libbrotlicommon 1.0.9 h166bdaf_9 conda-forge

libbrotlidec 1.0.9 h166bdaf_9 conda-forge

libbrotlienc 1.0.9 h166bdaf_9 conda-forge

libcblas 3.9.0 17_linux64_openblas conda-forge

libcrc32c 1.1.2 h9c3ff4c_0 conda-forge

libcurl 8.2.1 hca28451_0 conda-forge

libdeflate 1.18 h0b41bf4_0 conda-forge

libedit 3.1.20191231 he28a2e2_2 conda-forge

libev 4.33 h516909a_1 conda-forge

libevent 2.1.12 hf998b51_1 conda-forge

libexpat 2.5.0 hcb278e6_1 conda-forge

libffi 3.4.2 h7f98852_5 conda-forge

libgcc-ng 13.1.0 he5830b7_0 conda-forge

libgfortran-ng 13.1.0 h69a702a_0 conda-forge

libgfortran5 13.1.0 h15d22d2_0 conda-forge

libgomp 13.1.0 he5830b7_0 conda-forge

libgoogle-cloud 2.12.0 h840a212_1 conda-forge

libgrpc 1.56.2 h3905398_1 conda-forge

libjpeg-turbo 2.1.5.1 h0b41bf4_0 conda-forge

liblapack 3.9.0 17_linux64_openblas conda-forge

libllvm14 14.0.6 hcd5def8_4 conda-forge

libnghttp2 1.52.0 h61bc06f_0 conda-forge

libnsl 2.0.0 h7f98852_0 conda-forge

libnuma 2.0.16 h0b41bf4_1 conda-forge

libopenblas 0.3.23 pthreads_h80387f5_0 conda-forge

libpng 1.6.39 h753d276_0 conda-forge

libprotobuf 4.23.3 hd1fb520_0 conda-forge

libsodium 1.0.18 h36c2ea0_1 conda-forge

libsqlite 3.43.0 h2797004_0 conda-forge

libssh2 1.11.0 h0841786_0 conda-forge

libstdcxx-ng 13.1.0 hfd8a6a1_0 conda-forge

libthrift 0.18.1 h8fd135c_2 conda-forge

libtiff 4.5.1 h8b53f26_1 conda-forge

libutf8proc 2.8.0 h166bdaf_0 conda-forge

libuuid 2.38.1 h0b41bf4_0 conda-forge

libwebp-base 1.3.1 hd590300_0 conda-forge

libxcb 1.15 h0b41bf4_0 conda-forge

libzlib 1.2.13 hd590300_5 conda-forge

llvmlite 0.40.1 py311ha6695c7_0 conda-forge

locket 1.0.0 pyhd8ed1ab_0 conda-forge

lz4 4.3.2 py311h9f220a4_0 conda-forge

lz4-c 1.9.4 hcb278e6_0 conda-forge

markupsafe 2.1.3 py311h459d7ec_0 conda-forge

matplotlib-inline 0.1.6 pyhd8ed1ab_0 conda-forge

msgpack-python 1.0.5 py311ha3edf6b_0 conda-forge

nb_conda_kernels 2.3.1 py311h38be061_2 conda-forge

ncurses 6.4 hcb278e6_0 conda-forge

nest-asyncio 1.5.6 pyhd8ed1ab_0 conda-forge

numba 0.57.1 py311h96b013e_0 conda-forge

numpy 1.24.4 py311h64a7726_0 conda-forge

openjpeg 2.5.0 hfec8fc6_2 conda-forge

openssl 3.1.2 hd590300_0 conda-forge

orc 1.9.0 h385abfd_1 conda-forge

packaging 23.1 pyhd8ed1ab_0 conda-forge

pandas 2.1.0 py311h320fe9a_0 conda-forge

param 1.13.0 py_0 pyviz
parso 0.8.3 pyhd8ed1ab_0 conda-forge

partd 1.4.0 pyhd8ed1ab_0 conda-forge

pexpect 4.8.0 pyh1a96a4e_2 conda-forge

pickleshare 0.7.5 py_1003 conda-forge

pillow 10.0.0 py311h0b84326_0 conda-forge

pip 23.2.1 pyhd8ed1ab_0 conda-forge

platformdirs 3.10.0 pyhd8ed1ab_0 conda-forge

prompt-toolkit 3.0.39 pyha770c72_0 conda-forge

prompt_toolkit 3.0.39 hd8ed1ab_0 conda-forge

psutil 5.9.5 py311h2582759_0 conda-forge

pthread-stubs 0.4 h36c2ea0_1001 conda-forge

ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge

pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge

pyarrow 13.0.0 py311h39c9aba_0_cpu conda-forge

pygments 2.16.1 pyhd8ed1ab_0 conda-forge

pysocks 1.7.1 pyha2e5f31_6 conda-forge

python 3.11.5 hab00c5b_0_cpython conda-forge

python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge

python-tzdata 2023.3 pyhd8ed1ab_0 conda-forge

python_abi 3.11 3_cp311 conda-forge

pytz 2023.3 pyhd8ed1ab_0 conda-forge

pyyaml 6.0.1 py311h459d7ec_0 conda-forge

pyzmq 25.1.1 py311h75c88c4_0 conda-forge

rdma-core 28.9 h59595ed_1 conda-forge

re2 2023.03.02 h8c504da_0 conda-forge

readline 8.2 h8228510_1 conda-forge

retrying 1.3.3 py_2 conda-forge

s2n 1.3.49 h06160fa_0 conda-forge

setuptools 68.1.2 pyhd8ed1ab_0 conda-forge

six 1.16.0 pyh6c4a22f_0 conda-forge

snappy 1.1.10 h9fff704_0 conda-forge

sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge

spatialpandas 0.4.8 py_0 pyviz

stack_data 0.6.2 pyhd8ed1ab_0 conda-forge

tblib 1.7.0 pyhd8ed1ab_0 conda-forge

tk 8.6.12 h27826a3_0 conda-forge

toolz 0.12.0 pyhd8ed1ab_0 conda-forge

tornado 6.3.3 py311h459d7ec_0 conda-forge

traitlets 5.9.0 pyhd8ed1ab_0 conda-forge

typing-extensions 4.7.1 hd8ed1ab_0 conda-forge

typing_extensions 4.7.1 pyha770c72_0 conda-forge

tzdata 2023c h71feb2d_0 conda-forge

ucx 1.14.1 h4a2ce2d_3 conda-forge

urllib3 2.0.4 pyhd8ed1ab_0 conda-forge

wcwidth 0.2.6 pyhd8ed1ab_0 conda-forge

wheel 0.41.2 pyhd8ed1ab_0 conda-forge

xorg-libxau 1.0.11 hd590300_0 conda-forge

xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge

xyzservices 2023.7.0 pyhd8ed1ab_0 conda-forge

xz 5.2.6 h166bdaf_0 conda-forge

yaml 0.2.5 h7f98852_2 conda-forge

zeromq 4.3.4 h9c3ff4c_1 conda-forge

zict 3.0.0 pyhd8ed1ab_0 conda-forge

zipp 3.16.2 pyhd8ed1ab_0 conda-forge

zstd 1.5.5 hfc55251_0 conda-forge

Complete, minimal, self-contained example code that reproduces the issue

from spatialpandas import GeoDataFrame
from spatialpandas.geometry import PolygonArray

# Square from (0, 0) to (1, 1) in CCW order
outline0 = [0, 0, 1, 0, 1, 1, 0, 1, 0, 0]

# Square from (2, 2) to (5, 5) in CCW order
outline1 = [2, 2, 5, 2, 5, 5, 2, 5, 2, 2]

# Triangle hole in CW order
hole1 = [3, 3, 4, 3, 3, 4, 3, 3]

polygon_array = PolygonArray([
    [outline0],
    [outline1, hole1]
])

GeoDataFrame({"geometry": polygon_array})

Stack traceback and/or browser JavaScript console output

---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
Cell In[8], line 1
----> 1 GeoDataFrame({"geometry": polygon_array})

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geodataframe.py:31, in GeoDataFrame.__init__(self, data, index, geometry, **kwargs)
28 for col in self.columns:
29 if (isinstance(self[col].dtype, GeometryDtype) or
30 gp and isinstance(self[col].dtype, gp.array.GeometryDtype)):
---> 31 self[col] = GeoSeries(self[col])
32 first_geometry_col = first_geometry_col or col
34 if first_geometry_col is None:

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:35, in GeoSeries.__init__(self, data, index, name, dtype, **kwargs)
32 dtype = pd.array([], dtype=dtype).dtype
34 data = to_geometry_array(data, dtype)
---> 35 super().__init__(data, index=index, name=name, **kwargs)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:471, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
469 data = data._mgr.copy(deep=False)
470 else:
--> 471 data = data.reindex(index, copy=copy)
472 copy = False
473 data = data._mgr

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:4982, in Series.reindex(self, index, axis, method, copy, level, fill_value, limit, tolerance)
4965 @doc(
4966 NDFrame.reindex, # type: ignore[has-type]
4967 klass=_shared_doc_kwargs["klass"],
(...)
4980 tolerance=None,
4981 ) -> Series:
-> 4982 return super().reindex(
4983 index=index,
4984 method=method,
4985 copy=copy,
4986 level=level,
4987 fill_value=fill_value,
4988 limit=limit,
4989 tolerance=tolerance,
4990 )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:5514, in NDFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
5508 copy = False
5509 if all(
5510 self._get_axis(axis_name).identical(ax)
5511 for axis_name, ax in axes.items()
5512 if ax is not None
5513 ):
-> 5514 return self.copy(deep=copy)
5516 # check if we are a multi reindex
5517 if self._needs_reindex_multi(axes, method, level):

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:6685, in NDFrame.copy(self, deep)
6683 data = self._mgr.copy(deep=deep)
6684 self._clear_item_cache()
-> 6685 return self._constructor_from_mgr(data, axes=data.axes).__finalize__(
6686 self, method="copy"
6687 )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:589, in Series._constructor_from_mgr(self, mgr, axes)
587 else:
588 assert axes is mgr.axes
--> 589 return self._constructor(ser, copy=False)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:12, in _MaybeGeoSeries.__new__(cls, data, *args, **kwargs)
10 else:
11 series_cls = pd.Series
---> 12 return series_cls(data, *args, **kwargs)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:35, in GeoSeries.__init__(self, data, index, name, dtype, **kwargs)
32 dtype = pd.array([], dtype=dtype).dtype
34 data = to_geometry_array(data, dtype)
---> 35 super().__init__(data, index=index, name=name, **kwargs)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:471, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
469 data = data._mgr.copy(deep=False)
470 else:
--> 471 data = data.reindex(index, copy=copy)
472 copy = False
473 data = data._mgr

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:4982, in Series.reindex(self, index, axis, method, copy, level, fill_value, limit, tolerance)
4965 @doc(
4966 NDFrame.reindex, # type: ignore[has-type]
4967 klass=_shared_doc_kwargs["klass"],
(...)
4980 tolerance=None,
4981 ) -> Series:
-> 4982 return super().reindex(
4983 index=index,
4984 method=method,
4985 copy=copy,
4986 level=level,
4987 fill_value=fill_value,
4988 limit=limit,
4989 tolerance=tolerance,
4990 )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:5514, in NDFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
5508 copy = False
5509 if all(
5510 self._get_axis(axis_name).identical(ax)
5511 for axis_name, ax in axes.items()
5512 if ax is not None
5513 ):
-> 5514 return self.copy(deep=copy)
5516 # check if we are a multi reindex
5517 if self._needs_reindex_multi(axes, method, level):

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:6685, in NDFrame.copy(self, deep)
6683 data = self._mgr.copy(deep=deep)
6684 self._clear_item_cache()
-> 6685 return self._constructor_from_mgr(data, axes=data.axes).__finalize__(
6686 self, method="copy"
6687 )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:589, in Series._constructor_from_mgr(self, mgr, axes)
587 else:
588 assert axes is mgr.axes
--> 589 return self._constructor(ser, copy=False)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:12, in _MaybeGeoSeries.__new__(cls, data, *args, **kwargs)
10 else:
11 series_cls = pd.Series
---> 12 return series_cls(data, *args, **kwargs)

[... skipping similar frames: GeoSeries.__init__ at line 35 (327 times), Series.__init__ at line 471 (327 times), Series.reindex at line 4982 (327 times), NDFrame.reindex at line 5514 (327 times), _MaybeGeoSeries.__new__ at line 12 (326 times), Series._constructor_from_mgr at line 589 (326 times), NDFrame.copy at line 6685 (326 times)]

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:6685, in NDFrame.copy(self, deep)
6683 data = self._mgr.copy(deep=deep)
6684 self._clear_item_cache()
-> 6685 return self._constructor_from_mgr(data, axes=data.axes).__finalize__(
6686 self, method="copy"
6687 )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:589, in Series._constructor_from_mgr(self, mgr, axes)
587 else:
588 assert axes is mgr.axes
--> 589 return self._constructor(ser, copy=False)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:12, in _MaybeGeoSeries.__new__(cls, data, *args, **kwargs)
10 else:
11 series_cls = pd.Series
---> 12 return series_cls(data, *args, **kwargs)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:35, in GeoSeries.__init__(self, data, index, name, dtype, **kwargs)
32 dtype = pd.array([], dtype=dtype).dtype
34 data = to_geometry_array(data, dtype)
---> 35 super().__init__(data, index=index, name=name, **kwargs)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:471, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
469 data = data._mgr.copy(deep=False)
470 else:
--> 471 data = data.reindex(index, copy=copy)
472 copy = False
473 data = data._mgr

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:4982, in Series.reindex(self, index, axis, method, copy, level, fill_value, limit, tolerance)
4965 @doc(
4966 NDFrame.reindex, # type: ignore[has-type]
4967 klass=_shared_doc_kwargs["klass"],
(...)
4980 tolerance=None,
4981 ) -> Series:
-> 4982 return super().reindex(
4983 index=index,
4984 method=method,
4985 copy=copy,
4986 level=level,
4987 fill_value=fill_value,
4988 limit=limit,
4989 tolerance=tolerance,
4990 )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:5514, in NDFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
5508 copy = False
5509 if all(
5510 self._get_axis(axis_name).identical(ax)
5511 for axis_name, ax in axes.items()
5512 if ax is not None
5513 ):
-> 5514 return self.copy(deep=copy)
5516 # check if we are a multi reindex
5517 if self._needs_reindex_multi(axes, method, level):

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:6683, in NDFrame.copy(self, deep)
6551 @Final
6552 def copy(self, deep: bool_t | None = True) -> Self:
6553 """
6554 Make a copy of this object's indices and data.
6555
(...)
6681 dtype: int64
6682 """
-> 6683 data = self._mgr.copy(deep=deep)
6684 self._clear_item_cache()
6685 return self._constructor_from_mgr(data, axes=data.axes).__finalize__(
6686 self, method="copy"
6687 )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/internals/managers.py:576, in BaseBlockManager.copy(self, deep)
573 else:
574 new_axes = list(self.axes)
--> 576 res = self.apply("copy", deep=deep)
577 res.axes = new_axes
579 if self.ndim > 1:
580 # Avoid needing to re-compute these

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/internals/managers.py:354, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
352 applied = b.apply(f, **kwargs)
353 else:
--> 354 applied = getattr(b, f)(**kwargs)
355 result_blocks = extend_blocks(applied, result_blocks)
357 out = type(self).from_blocks(result_blocks, self.axes)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/internals/blocks.py:649, in Block.copy(self, deep)
647 else:
648 refs = self.refs
--> 649 return type(self)(values, placement=self._mgr_locs, ndim=self.ndim, refs=refs)

File internals.pyx:680, in pandas._libs.internals.SharedBlock.__cinit__()

File internals.pyx:962, in pandas._libs.internals.BlockValuesRefs.add_reference()

RecursionError: maximum recursion depth exceeded while calling a Python object

current version number

@jbednar,

Thanks for merging PR #37.

@jonmmease already released v0.3.5 about 5 months ago, shortly after v0.3.4 (released on 2/21). It does not appear to have been tagged, but it is available on the pyviz conda channel and on conda-forge. The latest alpha release should therefore be v0.3.6a.

I am not sure what needs to happen to get this latest release on the pyviz conda channel, but if there is anything I need to do, please let me know.

Support numpy 1.24

Prior to numpy 1.24, creating an array from ragged nested sequences produced a VisibleDeprecationWarning. With 1.24 this is now a ValueError. This is not a problem yet, because numba doesn't support numpy 1.24, but it needs to be fixed here before numba adds that support, so it is quite urgent.
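
As an illustration of one way to avoid the ragged-sequence conversion, the sketch below builds a 1-D object array element by element instead of passing nested lists to `np.array`. The helper name `to_object_array` is hypothetical and is not part of spatialpandas; the actual fix in the codebase may look different.

```python
import numpy as np


def to_object_array(sequences):
    """Pack sequences of possibly different lengths into a 1-D object array.

    Hypothetical helper for illustration only. Filling a preallocated object
    array avoids asking numpy to infer a shape from ragged nested sequences.
    """
    out = np.empty(len(sequences), dtype=object)
    for i, seq in enumerate(sequences):
        # Each element is stored as its own float64 array.
        out[i] = np.asarray(seq, dtype=np.float64)
    return out


ragged = [[0.0, 1.0], [2.0, 3.0, 4.0]]
arr = to_object_array(ragged)
print(arr.shape, arr.dtype)
```

Filling element by element also keeps the result 1-D when all sub-sequences happen to have the same length, which would otherwise be turned into a 2-D array.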

Thanks to @hoxbro for identifying this (holoviz/geoviews#608).

Inlining hilbertcurve dependency or getting it on conda

The spatial indexing in spatialpandas uses the hilbertcurve package. It is a tiny package of about 150 lines of code, currently MIT licensed. Since it is a well-established implementation of a fairly standard algorithm, I don't think we gain much from depending on the package directly: bug fixes to the actual code are unlikely. At the same time, once the dependency is on conda we wouldn't need to worry much about waiting for a release. On balance, my vote is to simply inline the code and potentially optimize it with numba later if that's worthwhile. In the short term this will also spare me some pain in setting up the build infrastructure. Overall, I also think the conda ecosystem suffers from tiny packages, so I'm not enthusiastic about adding it to conda-forge and potentially (eventually) to defaults.
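
To give a sense of how small such an inlined routine can be, here is a sketch of the classic 2-D bit-manipulation formulation of the Hilbert curve mapping. It is not the code of the hilbertcurve package (which is not reproduced in this issue), and the function name `hilbert_distance` and its `order` parameter are chosen here for illustration.

```python
def hilbert_distance(order, x, y):
    """Distance of integer cell (x, y) along a 2-D Hilbert curve.

    The grid has 2**order cells per side. Illustrative sketch only.
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        # Which quadrant of the current sub-square the cell falls in.
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip so the next level sees a canonically oriented curve.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


# Cells sorted by their distance along the curve on a 4x4 grid.
cells = sorted(
    ((x, y) for x in range(4) for y in range(4)),
    key=lambda c: hilbert_distance(2, *c),
)
print(cells[:4])
```

Sorting geometries by a distance like this keeps spatially close items close together in the sorted order, which is the property a spatial index relies on.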

Cc: @jonmmease @jbednar

Set python-snappy as optional dependency to work with Python 3.11 on pip install

Is your feature request related to a problem? Please describe.

Trying to install spatialpandas in a Python 3.11 environment currently fails due to a hard dependency on python-snappy which doesn't have wheels for Python 3.11 (see intake/python-snappy#124).

mamba create --name temp python=3.11
mamba activate temp
python -m pip install spatialpandas==0.4.7

This produces the following build error:

Collecting spatialpandas==0.4.7
  Using cached spatialpandas-0.4.7-py2.py3-none-any.whl (120 kB)
Collecting dask (from spatialpandas==0.4.7)
  Using cached dask-2023.5.0-py3-none-any.whl (1.2 MB)
Collecting fsspec (from spatialpandas==0.4.7)
  Using cached fsspec-2023.5.0-py3-none-any.whl (160 kB)
Collecting numba (from spatialpandas==0.4.7)
  Downloading numba-0.57.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.6 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 3.6/3.6 MB 22.8 MB/s eta 0:00:00
Collecting pandas (from spatialpandas==0.4.7)
  Downloading pandas-2.0.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 12.2/12.2 MB 19.9 MB/s eta 0:00:00
Collecting param (from spatialpandas==0.4.7)
  Using cached param-1.13.0-py2.py3-none-any.whl (87 kB)
Collecting pyarrow>=1.0 (from spatialpandas==0.4.7)
  Downloading pyarrow-12.0.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (38.9 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 38.9/38.9 MB 3.2 MB/s eta 0:00:00
Collecting python-snappy (from spatialpandas==0.4.7)
  Downloading python-snappy-0.6.1.tar.gz (24 kB)
  Preparing metadata (setup.py) ... done
Collecting retrying (from spatialpandas==0.4.7)
  Using cached retrying-1.3.4-py3-none-any.whl (11 kB)
Collecting numpy>=1.16.6 (from pyarrow>=1.0->spatialpandas==0.4.7)
  Downloading numpy-1.24.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 17.3/17.3 MB 7.6 MB/s eta 0:00:00
Collecting click>=8.0 (from dask->spatialpandas==0.4.7)
  Using cached click-8.1.3-py3-none-any.whl (96 kB)
Collecting cloudpickle>=1.5.0 (from dask->spatialpandas==0.4.7)
  Using cached cloudpickle-2.2.1-py3-none-any.whl (25 kB)
Collecting packaging>=20.0 (from dask->spatialpandas==0.4.7)
  Using cached packaging-23.1-py3-none-any.whl (48 kB)
Collecting partd>=1.2.0 (from dask->spatialpandas==0.4.7)
  Using cached partd-1.4.0-py3-none-any.whl (18 kB)
Collecting pyyaml>=5.3.1 (from dask->spatialpandas==0.4.7)
  Downloading PyYAML-6.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (757 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 757.9/757.9 kB 3.3 MB/s eta 0:00:00
Collecting toolz>=0.10.0 (from dask->spatialpandas==0.4.7)
  Using cached toolz-0.12.0-py3-none-any.whl (55 kB)
Collecting importlib-metadata>=4.13.0 (from dask->spatialpandas==0.4.7)
  Using cached importlib_metadata-6.6.0-py3-none-any.whl (22 kB)
Collecting llvmlite<0.41,>=0.40.0dev0 (from numba->spatialpandas==0.4.7)
  Downloading llvmlite-0.40.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (42.1 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 42.1/42.1 MB 9.5 MB/s eta 0:00:00
Collecting python-dateutil>=2.8.2 (from pandas->spatialpandas==0.4.7)
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting pytz>=2020.1 (from pandas->spatialpandas==0.4.7)
  Using cached pytz-2023.3-py2.py3-none-any.whl (502 kB)
Collecting tzdata>=2022.1 (from pandas->spatialpandas==0.4.7)
  Using cached tzdata-2023.3-py2.py3-none-any.whl (341 kB)
Collecting six>=1.7.0 (from retrying->spatialpandas==0.4.7)
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting zipp>=0.5 (from importlib-metadata>=4.13.0->dask->spatialpandas==0.4.7)
  Using cached zipp-3.15.0-py3-none-any.whl (6.8 kB)
Collecting locket (from partd>=1.2.0->dask->spatialpandas==0.4.7)
  Using cached locket-1.0.0-py2.py3-none-any.whl (4.4 kB)
Building wheels for collected packages: python-snappy
  Building wheel for python-snappy (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [27 lines of output]
      /home/user/mambaforge/envs/temp/lib/python3.11/site-packages/setuptools/_distutils/dist.py:265: UserWarning: Unknown distribution option: 'cffi_modules'
        warnings.warn(msg)
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-cpython-311
      creating build/lib.linux-x86_64-cpython-311/snappy
      copying src/snappy/__main__.py -> build/lib.linux-x86_64-cpython-311/snappy
      copying src/snappy/snappy.py -> build/lib.linux-x86_64-cpython-311/snappy
      copying src/snappy/hadoop_snappy.py -> build/lib.linux-x86_64-cpython-311/snappy
      copying src/snappy/snappy_cffi_builder.py -> build/lib.linux-x86_64-cpython-311/snappy
      copying src/snappy/snappy_cffi.py -> build/lib.linux-x86_64-cpython-311/snappy
      copying src/snappy/snappy_formats.py -> build/lib.linux-x86_64-cpython-311/snappy
      copying src/snappy/__init__.py -> build/lib.linux-x86_64-cpython-311/snappy
      running build_ext
      building 'snappy._snappy' extension
      creating build/temp.linux-x86_64-cpython-311
      creating build/temp.linux-x86_64-cpython-311/src
      creating build/temp.linux-x86_64-cpython-311/src/snappy
      gcc -pthread -B /home/user/mambaforge/envs/temp/compiler_compat -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/user/mambaforge/envs/temp/include -fPIC -O2 -isystem /home/user/mambaforge/envs/temp/include -fPIC -I/home/user/mambaforge/envs/temp/include/python3.11 -c src/snappy/crc32c.c -o build/temp.linux-x86_64-cpython-311/src/snappy/crc32c.o
      gcc -pthread -B /home/user/mambaforge/envs/temp/compiler_compat -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/user/mambaforge/envs/temp/include -fPIC -O2 -isystem /home/user/mambaforge/envs/temp/include -fPIC -I/home/user/mambaforge/envs/temp/include/python3.11 -c src/snappy/snappymodule.cc -o build/temp.linux-x86_64-cpython-311/src/snappy/snappymodule.o
      src/snappy/snappymodule.cc:33:10: fatal error: snappy-c.h: No such file or directory
         33 | #include <snappy-c.h>
            |          ^~~~~~~~~~~~
      compilation terminated.
      error: command '/usr/bin/gcc' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for python-snappy
  Running setup.py clean for python-snappy
Failed to build python-snappy
ERROR: Could not build wheels for python-snappy, which is required to install pyproject.toml-based projects

Describe the solution you'd like

Convert python-snappy from a required to an optional dependency in the setup.py file:

spatialpandas/setup.py

Lines 31 to 40 in bc3e52c

install_requires = [
    'dask',
    'fsspec',
    'numba',
    'pandas',
    'param',
    'pyarrow >=1.0',
    'python-snappy',
    'retrying',
]
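
One possible shape for this change is sketched below, assuming the standard setuptools `extras_require` mechanism. The extra name `snappy` is an assumption made for illustration; the issue does not specify how the optional dependency would be exposed.

```python
# Sketch of the relevant setup.py fragment with python-snappy made optional.
# The extra name 'snappy' is hypothetical.
install_requires = [
    'dask',
    'fsspec',
    'numba',
    'pandas',
    'param',
    'pyarrow >=1.0',
    'retrying',
]

extras_require = {
    # Only installed when explicitly requested.
    'snappy': ['python-snappy'],
}

# Both would then be passed on to setuptools, for example:
# setup(..., install_requires=install_requires, extras_require=extras_require)
```

With a layout like this, `pip install spatialpandas` would skip python-snappy, while `pip install "spatialpandas[snappy]"` would still pull it in.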

Looking at the codebase, I only see snappy mentioned for the parquet I/O in two places:

compression="snappy",

    compression: Optional[str] = "snappy",
    filesystem: Optional[fsspec.AbstractFileSystem] = None,
    index: Optional[bool] = None,
    storage_options: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> None:
    if filesystem is not None:
        filesystem = validate_coerce_filesystem(path, filesystem, storage_options)
    # Standard pandas to_parquet with pyarrow engine
    to_parquet_args = {
        "df": df,
        "path": path,
        "engine": "pyarrow",
        "compression": compression,
        "filesystem": filesystem,
        "index": index,
        **kwargs,
    }
    if PANDAS_GE_12:
        to_parquet_args.update({"storage_options": storage_options})
    else:
        if filesystem is None:
            filesystem = validate_coerce_filesystem(path, filesystem, storage_options)
        to_parquet_args.update({"filesystem": filesystem})
    pd_to_parquet(**to_parquet_args)

def read_parquet(
    path: PathType,
    columns: Optional[Iterable[str]] = None,
    filesystem: Optional[fsspec.AbstractFileSystem] = None,
    storage_options: Optional[Dict[str, Any]] = None,
    engine_kwargs: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> GeoDataFrame:
    engine_kwargs = engine_kwargs or {}
    filesystem = validate_coerce_filesystem(path, filesystem, storage_options)
    if LEGACY_PYARROW:
        basic_kwargs = dict(validate_schema=False)
    else:
        basic_kwargs = dict(use_legacy_dataset=False)
    # Load using pyarrow to handle parquet files and directories across filesystems
    dataset = ParquetDataset(
        path,
        filesystem=filesystem,
        **basic_kwargs,
        **engine_kwargs,
        **kwargs,
    )
    if LEGACY_PYARROW:
        metadata = _load_parquet_pandas_metadata(
            path,
            filesystem=filesystem,
            storage_options=storage_options,
            engine_kwargs=engine_kwargs,
        )
    else:
        metadata = dataset.schema.pandas_metadata
    # If columns specified, prepend index columns to it
    if columns is not None:
        all_columns = set(column['name'] for column in metadata.get('columns', []))
        index_col_metadata = metadata.get('index_columns', [])
        extra_index_columns = []
        for idx_metadata in index_col_metadata:
            if isinstance(idx_metadata, str):
                name = idx_metadata
            elif isinstance(idx_metadata, dict):
                name = idx_metadata.get('name', None)
            else:
                name = None
            if name is not None and name not in columns and name in all_columns:
                extra_index_columns.append(name)
        columns = extra_index_columns + list(columns)
    df = dataset.read(columns=columns).to_pandas()
    # Return result
    return GeoDataFrame(df)

def to_parquet_dask(
    ddf: DaskGeoDataFrame,
    path: PathType,
    compression: Optional[str] = "snappy",

So for operations that don't use parquet, it should not be necessary to use python-snappy. Note that pandas does support other compression methods like gzip as mentioned at https://pandas.pydata.org/pandas-docs/version/2.0/reference/api/pandas.DataFrame.to_parquet.html, though snappy compression is currently the default.
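
If python-snappy became optional, the parquet writers could choose their default compression at call time. The sketch below is one way to do that using only the standard library. The helper name `pick_parquet_compression` is hypothetical, and whether the pyarrow engine actually needs the `snappy` Python module for snappy-compressed files is an assumption this issue does not settle.

```python
import importlib.util


def pick_parquet_compression(preferred="snappy", fallback="gzip"):
    """Return `preferred`, unless it is "snappy" and the module is missing.

    Hypothetical helper: python-snappy installs a top-level module named
    `snappy`, so its presence is checked without importing it.
    """
    if preferred != "snappy":
        return preferred
    if importlib.util.find_spec("snappy") is not None:
        return preferred
    return fallback


# Could be passed as the `compression` argument of a parquet writer.
print(pick_parquet_compression())
```

An explicit choice such as `pick_parquet_compression("gzip")` is returned unchanged, so callers who name a codec are never silently switched to another one.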

Describe alternatives you've considered

Ideally, python-snappy would release Python 3.11 compatible wheels (see intake/python-snappy#124), but the last commit on that repo was on 17 Mar 2022, so that is not likely to happen anytime soon.

Additional context

I noticed that there was a PR checking for Python 3.11 compatibility at #113, but in that case, python-snappy was installed from conda-forge (that does support Python 3.11 https://anaconda.org/conda-forge/python-snappy/files?version=0.6.1) rather than PyPI.

For historical context, snappy was added as a required dependency in 498e7fc/#60.

Happy to open a PR to make python-snappy optional if the above sounds good!

`test_triangle_orientation` fails intermittently

The test below fails intermittently.

__________________________ test_triangle_orientation ___________________________

    @given(coord, coord, coord, coord, coord, coord)
>   @hyp_settings
    def test_triangle_orientation(ax, ay, bx, by, cx, cy):

spatialpandas/tests/geometry/algorithms/test_orientation.py:9: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ax = 0.0, ay = -4.440892098500627e-13, bx = 6.661338147750941e-13
by = 2.2204460492503136e-13, cx = 4.440892098500627e-13, cy = 0.0

    @given(coord, coord, coord, coord, coord, coord)
    @hyp_settings
    def test_triangle_orientation(ax, ay, bx, by, cx, cy):
        result = triangle_orientation(ax, ay, bx, by, cx, cy)
    
        # Use shapely polygon to compute expected orientation
        sg_poly = sg.Polygon([(ax, ay), (bx, by), (cx, cy), (ax, ay)])
    
        if sg_poly.area == 0:
            expected = 0
        else:
            expected = 1 if sg_poly.exterior.is_ccw else -1
    
>       assert result == expected
E       assert 0 == 1
E         +0
E         -1

spatialpandas/tests/geometry/algorithms/test_orientation.py:21: AssertionError
---------------------------------- Hypothesis ----------------------------------
Falsifying example: test_triangle_orientation(
    ax=0.0,
    ay=-4.440892098500627e-13,
    bx=6.661338147750941e-13,
    by=2.2204460492503136e-13,
    cx=4.440892098500627e-13,
    cy=0.0,
)

You can reproduce this example by temporarily adding @reproduce_failure('6.1.1', b'AXicY2CAAEYIxQTlMqMKw8UZAAEIAAo=') as a decorator on your test case
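
For context, a common way to compute a triangle's orientation is the sign of a 2-D cross product, sketched below. This is a generic formulation and not necessarily the body of spatialpandas' `triangle_orientation`; the `tol` parameter is hypothetical and only shows where a tolerance could be applied. With coordinates around 1e-13, as in the falsifying example, the products involved are around 1e-25, so a cross-product test and a polygon-area test may round to different sides of zero. That may be why `result` and `expected` disagree here.

```python
def orientation(ax, ay, bx, by, cx, cy, tol=0.0):
    """Return 1 for counter-clockwise, -1 for clockwise, 0 for degenerate.

    Generic sketch; `tol` is a hypothetical absolute tolerance on the
    cross product below which the triangle is treated as degenerate.
    """
    # z-component of (B - A) x (C - A)
    cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    if cross > tol:
        return 1
    if cross < -tol:
        return -1
    return 0


print(orientation(0.0, 0.0, 1.0, 0.0, 0.0, 1.0))  # counter-clockwise triangle
```

A test comparing two implementations on near-degenerate input could either restrict the coordinate strategy or accept either answer when the area is within a tolerance of zero.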

DaskGeoDataFrame parquet write error - Series object has no attribute total_bounds

Hi - I'm running into an error when trying to write a DaskGeoDataFrame. I'm following the basic pattern here (see also) but using a smaller sample of a point dataset. Everything seems to run as expected until trying to write out the packed file and I encounter the error below.

ALL software version info

pyarrow=15.0.0
spatialpandas=0.4.10
pandas=2.1.1
dask=2024.2.0
python=3.9.16

df = df.pack_partitions(npartitions=df.npartitions, shuffle='disk')
df.to_parquet(save_path)

[Two screenshots of the error; images not preserved.]

read_parquet_dask fails to read from s3 glob

ALL software version info

software version info # Name Version Build Channel _libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 1_llvm conda-forge abseil-cpp 20200225.2 he1b5a44_0 conda-forge adal 1.2.2 py_0 conda-forge aiohttp 3.6.2 py37h516909a_0 conda-forge alabaster 0.7.12 py_0 conda-forge alembic 1.4.2 pyh9f0ad1d_0 conda-forge appdirs 1.4.3 py_1 conda-forge arrow-cpp 0.17.0 py37hc11a6a2_0 conda-forge asn1crypto 1.3.0 py37hc8dfbb8_1 conda-forge astroid 2.3.3 pypi_0 pypi async-timeout 3.0.1 py_1000 conda-forge async_generator 1.10 py_0 conda-forge attrs 19.3.0 py_0 conda-forge aws-logging-handlers 2.0.3 pypi_0 pypi aws-sdk-cpp 1.7.164 hc831370_1 conda-forge awscli 1.18.56 py37hc8dfbb8_0 conda-forge babel 2.8.0 py_0 conda-forge backcall 0.1.0 py_0 conda-forge bandit 1.6.2 py37_0 conda-forge beautifulsoup4 4.9.0 py37hc8dfbb8_0 conda-forge black 19.10b0 py37_0 conda-forge bleach 3.1.5 pyh9f0ad1d_0 conda-forge blinker 1.4 py_1 conda-forge blosc 1.18.1 he1b5a44_0 conda-forge bokeh 1.4.0 py_0 bokeh boost-cpp 1.72.0 h8e57a91_0 conda-forge boto 2.49.0 py_0 conda-forge boto3 1.13.6 pyh9f0ad1d_0 conda-forge botocore 1.16.6 pyh9f0ad1d_0 conda-forge bottleneck 1.3.2 py37h03ebfcd_1 conda-forge brotli 1.0.7 he1b5a44_1001 conda-forge brotlipy 0.7.0 py37h8f50634_1000 conda-forge bzip2 1.0.8 h516909a_2 conda-forge c-ares 1.15.0 h516909a_1001 conda-forge ca-certificates 2020.4.5.1 hecc5488_0 conda-forge cachetools 3.1.1 py_0 conda-forge cairo 1.16.0 hcf35c78_1003 conda-forge cartopy 0.17.0 py37hd759880_1006 conda-forge certifi 2020.4.5.1 py37hc8dfbb8_0 conda-forge certipy 0.1.3 py_0 conda-forge cffi 1.14.0 py37hd463f26_0 conda-forge cfgv 3.1.0 py_0 conda-forge cfitsio 3.470 hb60a0a2_2 conda-forge cftime 1.1.2 py37h03ebfcd_0 conda-forge chardet 3.0.4 py37hc8dfbb8_1006 conda-forge click 7.1.2 pyh9f0ad1d_0 conda-forge click-plugins 1.1.1 py_0 conda-forge cligj 0.5.0 py_0 conda-forge cloudpickle 1.4.1 py_0 conda-forge colorama 0.4.3 py_0 conda-forge colorcet 2.0.2 py_0 pyviz/label/dev conda-lock 
0.2.2 pypi_0 pypi configurable-http-proxy 4.2.1 node13_he01fd0c_0 conda-forge coverage 5.1 py37h8f50634_0 conda-forge croniter 0.3.30 py_0 conda-forge cryptography 2.9.2 py37hb09aad4_0 conda-forge cssselect 1.1.0 py_0 conda-forge curl 7.68.0 hf8cf82a_0 conda-forge cycler 0.10.0 py_2 conda-forge cython 0.29.17 py37h3340039_0 conda-forge cytoolz 0.10.1 py37h516909a_0 conda-forge dask 2.16.0 py_0 conda-forge dask-core 2.16.0 py_0 conda-forge dask-kubernetes 0.10.1 py_0 conda-forge dask-labextension 2.0.1 pypi_0 pypi datashader 0.11.0a4 py_0 pyviz/label/dev datashape 0.5.4 py_1 conda-forge datum 0.1.3 pypi_0 pypi dbus 1.13.6 he372182_0 conda-forge decorator 4.4.2 py_0 conda-forge defusedxml 0.6.0 py_0 conda-forge descartes 1.1.0 py_4 conda-forge distributed 2.16.0 py37hc8dfbb8_0 conda-forge docker-py 4.2.0 py37hc8dfbb8_0 conda-forge docker-pycreds 0.4.0 py_0 conda-forge docutils 0.15.2 py37_0 conda-forge dodgy 0.2.1 pypi_0 pypi editdistance 0.5.3 py37h3340039_0 conda-forge entrypoints 0.3 py37hc8dfbb8_1001 conda-forge et_xmlfile 1.0.1 py_1001 conda-forge expat 2.2.9 he1b5a44_2 conda-forge fastparquet 0.3.3 py37hc1659b7_0 conda-forge fiona 1.8.6 py37hf242f0b_3 conda-forge flake8 3.7.9 py37hc8dfbb8_1 conda-forge fontconfig 2.13.1 h86ecdb6_1001 conda-forge freetype 2.10.1 he06d7ca_0 conda-forge freexl 1.0.5 h14c3975_1002 conda-forge fribidi 1.0.9 h516909a_0 conda-forge fsspec 0.7.3 py_0 conda-forge funcsigs 1.0.2 py_3 conda-forge futures-compat 1.0 py3_0 conda-forge fuzzywuzzy 0.17.0 py_0 conda-forge fzf 0.21.1 h375a9b1_0 conda-forge gdal 2.4.1 py37h5f563d9_8 conda-forge geographiclib 1.50 py_0 conda-forge geopandas 0.6.3 py_0 conda-forge geopy 1.21.0 py_0 conda-forge geos 3.7.2 he1b5a44_2 conda-forge geotiff 1.4.3 hb6868eb_1001 conda-forge geoviews 1.7.0 py_0 pyviz/label/dev geoviews-core 1.7.0 py_0 pyviz/label/dev gettext 0.19.8.1 hc5be6a0_1002 conda-forge gflags 2.2.2 he1b5a44_1002 conda-forge giflib 5.1.7 h516909a_1 conda-forge gitdb 4.0.5 py_0 conda-forge gitpython 
3.1.2 py_0 conda-forge glib 2.64.2 h6f030ca_0 conda-forge glog 0.4.0 h49b9bf7_3 conda-forge google-auth 1.14.2 pyh9f0ad1d_0 conda-forge googlemaps 2.5.1 py_0 conda-forge graphite2 1.3.13 he1b5a44_1001 conda-forge graphviz 2.42.3 h0511662_0 conda-forge grpc-cpp 1.28.1 h8e748ff_2 conda-forge gst-plugins-base 1.14.5 h0935bb2_2 conda-forge gstreamer 1.14.5 h36ae1b5_2 conda-forge h5py 2.10.0 nompi_py37h513d04c_102 conda-forge harfbuzz 2.4.0 h9f30f68_3 conda-forge haversine 2.2.0 py_0 conda-forge hdf4 4.2.13 hf30be14_1003 conda-forge hdf5 1.10.5 nompi_h3c11f04_1104 conda-forge heapdict 1.0.1 py_0 conda-forge holoviews 1.13.3a2 py_0 pyviz/label/dev html5lib 1.0.1 py_0 conda-forge hvplot 0.6.0rc2 py_0 pyviz/label/dev hypothesis 5.11.0 py_0 conda-forge icu 64.2 he1b5a44_1 conda-forge identify 1.4.15 pyh9f0ad1d_0 conda-forge idna 2.9 py_1 conda-forge imagecodecs-lite 2019.12.3 py37h8f50634_0 conda-forge imageio 2.8.0 py_0 conda-forge imagesize 1.2.0 py_0 conda-forge importlib-metadata 1.6.0 py37hc8dfbb8_0 conda-forge importlib_metadata 1.6.0 0 conda-forge importnb 0.6.1 py37hc8dfbb8_2 conda-forge ipykernel 5.2.1 py37h43977f1_0 conda-forge ipython 7.14.0 py37hc8dfbb8_0 conda-forge ipython_genutils 0.2.0 py_1 conda-forge ipywidgets 7.5.1 py_0 conda-forge isort 4.3.21 py37hc8dfbb8_1 conda-forge jdcal 1.4.1 py_0 conda-forge jedi 0.17.0 py37hc8dfbb8_0 conda-forge jinja2 2.11.2 pyh9f0ad1d_0 conda-forge jmespath 0.9.5 py_0 conda-forge joblib 0.14.1 py_0 conda-forge jpeg 9c h14c3975_1001 conda-forge json-c 0.13.1 h14c3975_1001 conda-forge json5 0.9.0 py_0 conda-forge jsonschema 3.2.0 py37hc8dfbb8_1 conda-forge jupyter 1.0.0 py_2 conda-forge jupyter-archive 0.5.5 py_0 conda-forge jupyter-server-proxy 1.4.0 py_0 conda-forge jupyter_bokeh 1.1.1 py_0 bokeh jupyter_client 5.3.4 py37_1 conda-forge jupyter_console 6.1.0 py_1 conda-forge jupyter_core 4.6.3 py37hc8dfbb8_1 conda-forge jupyter_telemetry 0.0.5 py_0 conda-forge jupyterhub 1.1.0 py37_2 conda-forge jupyterhub-base 1.1.0 py37_2 
conda-forge jupyterlab 1.2.7 py_0 conda-forge jupyterlab-git 0.10.0 pypi_0 pypi jupyterlab-s3-browser 0.4.1 pypi_0 pypi jupyterlab_code_formatter 1.3.1 py_0 conda-forge jupyterlab_server 1.1.3 py_0 conda-forge kartothek 3.8.2 py_0 conda-forge kealib 1.4.13 hec59c27_0 conda-forge kiwisolver 1.2.0 py37h99015e2_0 conda-forge krb5 1.16.4 h2fd8d38_0 conda-forge kubernetes 1.18.2 0 conda-forge kubernetes-client 1.18.2 haa36a5b_0 conda-forge kubernetes-node 1.18.2 haa36a5b_0 conda-forge kubernetes-server 1.18.2 haa36a5b_0 conda-forge kubernetes_asyncio 11.2.0 pyh9f0ad1d_0 conda-forge lazy-object-proxy 1.4.3 pypi_0 pypi ld_impl_linux-64 2.34 h53a641e_0 conda-forge libblas 3.8.0 16_openblas conda-forge libcblas 3.8.0 16_openblas conda-forge libclang 9.0.1 default_hde54327_0 conda-forge libcurl 7.68.0 hda55be3_0 conda-forge libdap4 3.20.4 hd3bb157_0 conda-forge libedit 3.1.20170329 hf8c457e_1001 conda-forge libevent 2.1.10 h72c5cf5_0 conda-forge libffi 3.2.1 he1b5a44_1007 conda-forge libgcc-ng 9.2.0 h24d8f2e_2 conda-forge libgdal 2.4.1 heae24aa_8 conda-forge libgfortran-ng 7.3.0 hdf63c60_5 conda-forge libiconv 1.15 h516909a_1006 conda-forge libkml 1.3.0 hb574062_1011 conda-forge liblapack 3.8.0 16_openblas conda-forge libllvm8 8.0.1 hc9558a2_0 conda-forge libllvm9 9.0.1 he513fc3_1 conda-forge libnetcdf 4.6.2 h303dfb8_1003 conda-forge libopenblas 0.3.9 h5ec1e0e_0 conda-forge libpng 1.6.37 hed695b0_1 conda-forge libpostal 1.1.0 h38d415c_2 conda-forge libpq 11.5 hd9ab2ff_2 conda-forge libprotobuf 3.11.4 h8b12597_0 conda-forge libsodium 1.0.17 h516909a_0 conda-forge libspatialindex 1.9.3 he1b5a44_3 conda-forge libspatialite 4.3.0a h79dc798_1030 conda-forge libssh2 1.8.2 h22169c7_2 conda-forge libstdcxx-ng 9.2.0 hdf63c60_2 conda-forge libtiff 4.1.0 hc7e4089_6 conda-forge libtool 2.4.6 h14c3975_1002 conda-forge libuuid 2.32.1 h14c3975_1000 conda-forge libuv 1.34.0 h516909a_0 conda-forge libwebp-base 1.1.0 h516909a_3 conda-forge libxcb 1.13 h14c3975_1002 conda-forge libxkbcommon 
0.10.0 he1b5a44_0 conda-forge libxml2 2.9.10 hee79883_0 conda-forge libxslt 1.1.33 h31b3aaa_0 conda-forge llvm-openmp 10.0.0 hc9558a2_0 conda-forge llvmlite 0.31.0 py37h5202443_1 conda-forge locket 0.2.0 py_2 conda-forge loguru 0.4.1 py37_0 conda-forge lxml 4.5.0 py37he3881c9_1 conda-forge lz4-c 1.9.2 he1b5a44_1 conda-forge lzo 2.10 h14c3975_1000 conda-forge mako 1.1.0 py_0 conda-forge markdown 3.2.1 py_0 conda-forge markupsafe 1.1.1 py37h8f50634_1 conda-forge marshmallow 3.6.0 py_0 conda-forge marshmallow-oneofschema 2.0.1 py_0 conda-forge matplotlib 3.2.1 0 conda-forge matplotlib-base 3.2.1 py37h30547a4_0 conda-forge mccabe 0.6.1 py_1 conda-forge milksnake 0.1.5 py_0 conda-forge mistune 0.8.4 py37h8f50634_1001 conda-forge mock 4.0.2 py37hc8dfbb8_0 conda-forge more-itertools 8.2.0 py_0 conda-forge msgpack-python 1.0.0 py37h99015e2_1 conda-forge multidict 4.7.5 py37h8f50634_1 conda-forge multipledispatch 0.6.0 py_0 conda-forge munch 2.5.0 py_0 conda-forge mypy 0.770 py_0 conda-forge mypy_extensions 0.4.3 py37hc8dfbb8_1 conda-forge nb_conda_kernels 2.2.3 py37_0 conda-forge nbconvert 5.6.1 py37hc8dfbb8_1 conda-forge nbdime 1.1.0 pypi_0 pypi nbformat 5.0.6 py_0 conda-forge nbval 0.9.5 py_0 conda-forge ncurses 6.1 hf484d3e_1002 conda-forge netcdf4 1.5.1.2 py37h73a1b54_1 conda-forge networkx 2.4 py_1 conda-forge nodeenv 1.3.5 py_0 conda-forge nodejs 13.13.0 hf5d1a2b_0 conda-forge notebook 6.0.3 py37hc8dfbb8_0 conda-forge nspr 4.25 he1b5a44_0 conda-forge nss 3.47 he751ad9_0 conda-forge numba 0.48.0 py37hb3f55d8_0 conda-forge numexpr 2.7.1 py37h0da4684_1 conda-forge numpy 1.18.4 py37h8960a57_0 conda-forge oauthlib 3.0.1 py_0 conda-forge olefile 0.46 py_0 conda-forge openjpeg 2.3.1 h981e76c_3 conda-forge openpyxl 3.0.3 py_0 conda-forge openssl 1.1.1g h516909a_0 conda-forge owslib 0.19.2 py_0 conda-forge packaging 20.1 py_0 conda-forge pamela 1.0.0 py_0 conda-forge pandas 1.0.3 py37h0da4684_1 conda-forge pandoc 2.9.2.1 0 conda-forge pandocfilters 1.4.2 py_1 conda-forge 
panel 0.8.3 py_0 pyviz/label/dev pango 1.42.4 h7062337_4 conda-forge param 1.10.0a2 py_0 pyviz/label/dev parquet-cpp 1.5.1 2 conda-forge parso 0.7.0 pyh9f0ad1d_0 conda-forge partd 1.1.0 py_0 conda-forge pathspec 0.8.0 pyh9f0ad1d_0 conda-forge patsy 0.5.1 py_0 conda-forge pbr 5.4.2 py_0 conda-forge pcre 8.44 he1b5a44_0 conda-forge pendulum 2.1.0 py37hc8dfbb8_1 conda-forge pep8-naming 0.4.1 pypi_0 pypi pexpect 4.8.0 py37hc8dfbb8_1 conda-forge pickleshare 0.7.5 py37hc8dfbb8_1001 conda-forge pillow 7.1.2 py37h718be6c_0 conda-forge pip 20.1 pyh9f0ad1d_0 conda-forge pixman 0.38.0 h516909a_1003 conda-forge pluggy 0.13.1 py37hc8dfbb8_1 conda-forge polygon-geohasher 0.0.1 pypi_0 pypi pooch 1.1.0 py_0 conda-forge poppler 0.67.0 h14e79db_8 conda-forge poppler-data 0.4.9 1 conda-forge postal 1.1.8 pypi_0 pypi postgresql 11.5 hc63931a_2 conda-forge pre-commit 2.3.0 py37hc8dfbb8_0 conda-forge prefect 0.10.7 py_0 conda-forge proj4 5.2.0 he1b5a44_1006 conda-forge prometheus_client 0.7.1 py_0 conda-forge prompt-toolkit 3.0.5 py_0 conda-forge prompt_toolkit 3.0.5 0 conda-forge prospector 1.2.0 pypi_0 pypi psutil 5.7.0 py37h8f50634_1 conda-forge pthread-stubs 0.4 h14c3975_1001 conda-forge ptyprocess 0.6.0 py_1001 conda-forge py 1.8.1 py_0 conda-forge pyarrow 0.17.0 py37h110162e_0 conda-forge pyasn1 0.4.8 py_0 conda-forge pyasn1-modules 0.2.7 py_0 conda-forge pycodestyle 2.4.0 pypi_0 pypi pycparser 2.20 py_0 conda-forge pyct 0.4.6 py_0 pyviz/label/dev pyct-core 0.4.6 py_0 pyviz/label/dev pycurl 7.43.0.5 py37h16ce93b_0 conda-forge pydocstyle 5.0.2 py_0 conda-forge pyepsg 0.4.0 py_0 conda-forge pyflakes 2.1.1 py_0 conda-forge pygments 2.6.1 py_0 conda-forge pyjwt 1.7.1 py_0 conda-forge pykdtree 1.3.1 py37h03ebfcd_1003 conda-forge pylama 7.7.1 pypi_0 pypi pylint 2.4.4 pypi_0 pypi pylint-celery 0.3 pypi_0 pypi pylint-django 2.0.12 pypi_0 pypi pylint-flask 0.6 pypi_0 pypi pylint-plugin-utils 0.6 pypi_0 pypi pyopenssl 19.1.0 py_1 conda-forge pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge pyproj 
1.9.6 py37h516909a_1002 conda-forge pyqt 5.12.3 py37h8685d9f_3 conda-forge pyqt5-sip 4.19.18 pypi_0 pypi pyqtchart 5.12 pypi_0 pypi pyqtwebengine 5.12.1 pypi_0 pypi pyroma 2.6 pypi_0 pypi pyrsistent 0.16.0 py37h8f50634_0 conda-forge pyshp 2.1.0 py_0 conda-forge pysocks 1.7.1 py37hc8dfbb8_1 conda-forge pytables 3.6.1 py37h9f153d1_1 conda-forge pytest 5.4.2 py37hc8dfbb8_0 conda-forge pytest-cov 2.8.1 py_0 conda-forge pytest-vcr 1.0.2 pypi_0 pypi python 3.7.6 h8356626_5_cpython conda-forge python-blosc 1.9.1 py37h0da4684_0 conda-forge python-box 4.2.3 py_0 conda-forge python-dateutil 2.8.1 py_0 conda-forge python-docx 0.8.10 pypi_0 pypi python-dotenv 0.13.0 pyh9f0ad1d_0 conda-forge python-editor 1.0.4 py_0 conda-forge python-geohash 0.8.5 py37he1b5a44_0 conda-forge python-graphviz 0.14 pyh9f0ad1d_0 conda-forge python-json-logger 0.1.11 py_0 conda-forge python-kubernetes 11.0.0 py37hc8dfbb8_0 conda-forge python-levenshtein 0.12.0 py37h516909a_1001 conda-forge python-slugify 4.0.0 pyh9f0ad1d_1 conda-forge python-snappy 0.5.4 py37h7cfaab3_1 conda-forge python_abi 3.7 1_cp37m conda-forge pytz 2020.1 pyh9f0ad1d_0 conda-forge pytzdata 2019.3 py_0 conda-forge pyviz_comms 0.7.4 py_0 pyviz/label/dev pywavelets 1.1.1 py37h03ebfcd_1 conda-forge pyyaml 5.1.2 py37h516909a_0 conda-forge pyzmq 19.0.1 py37hac76be4_0 conda-forge qgrid 1.3.0 pypi_0 pypi qt 5.12.5 hd8c4c69_1 conda-forge qtconsole 4.7.3 pyh9f0ad1d_0 conda-forge qtpy 1.9.0 py_0 conda-forge re2 2020.05.01 he1b5a44_0 conda-forge readline 8.0 hf8c457e_0 conda-forge regex 2020.5.7 py37h8f50634_0 conda-forge requests 2.23.0 pyh8c360ce_2 conda-forge requests-oauthlib 1.2.0 py_0 conda-forge requirements-detector 0.6 pypi_0 pypi retrying 1.3.3 py_2 conda-forge rsa 3.4.2 py_1 conda-forge rtree 0.9.4 py37h8526d28_1 conda-forge ruamel.yaml 0.16.6 py37h8f50634_1 conda-forge ruamel.yaml.clib 0.2.0 py37h8f50634_1 conda-forge s3fs 0.4.2 py_0 conda-forge s3transfer 0.3.3 py37hc8dfbb8_1 conda-forge scikit-image 0.17.1 py37h0da4684_0 
conda-forge scikit-learn 0.22.2.post1 py37hcdab131_0 conda-forge scipy 1.4.1 py37ha3d9a3c_3 conda-forge seaborn 0.10.1 py_0 conda-forge send2trash 1.5.0 py_0 conda-forge setoptconf 0.2.0 pypi_0 pypi setuptools 46.1.3 py37hc8dfbb8_0 conda-forge shapely 1.6.4 py37hec07ddf_1006 conda-forge simpervisor 0.3 py_1 conda-forge simplejson 3.17.0 py37h8f50634_1 conda-forge simplekv 0.14.1 pyh9f0ad1d_0 conda-forge singleton-decorator 1.0.0 pypi_0 pypi six 1.14.0 py_1 conda-forge smartystreets-python-sdk 4.4.1 pypi_0 pypi smmap 3.0.4 pyh9f0ad1d_0 conda-forge snappy 1.1.8 he1b5a44_1 conda-forge snowballstemmer 2.0.0 py_0 conda-forge sortedcontainers 2.1.0 py_0 conda-forge soupsieve 1.9.4 py37hc8dfbb8_1 conda-forge spatialpandas 0.3.4.post2+gefdabe5 pypi_0 pypi sphinx 3.0.3 py_0 conda-forge sphinxcontrib-applehelp 1.0.2 py_0 conda-forge sphinxcontrib-devhelp 1.0.2 py_0 conda-forge sphinxcontrib-htmlhelp 1.0.3 py_0 conda-forge sphinxcontrib-jsmath 1.0.1 py_0 conda-forge sphinxcontrib-qthelp 1.0.3 py_0 conda-forge sphinxcontrib-serializinghtml 1.1.4 py_0 conda-forge sqlalchemy 1.3.16 py37h8f50634_0 conda-forge sqlite 3.30.1 hcee41ef_0 conda-forge statsmodels 0.11.1 py37h8f50634_1 conda-forge stevedore 1.30.1 py_0 conda-forge storefact 0.10.0 py_0 conda-forge tabulate 0.8.7 pyh9f0ad1d_0 conda-forge tblib 1.6.0 py_0 conda-forge terminado 0.8.3 py37hc8dfbb8_1 conda-forge testpath 0.4.4 py_0 conda-forge text-unidecode 1.3 py_0 conda-forge thrift 0.11.0 py37he1b5a44_1001 conda-forge thrift-cpp 0.13.0 h62aa4f2_2 conda-forge tifffile 2020.5.7 py_0 conda-forge tk 8.6.10 hed695b0_0 conda-forge toml 0.10.0 py_0 conda-forge toolz 0.10.0 py_0 conda-forge tornado 6.0.4 py37h8f50634_1 conda-forge tqdm 4.46.0 pyh9f0ad1d_0 conda-forge traitlets 4.3.3 py37hc8dfbb8_1 conda-forge typed-ast 1.4.1 py37h516909a_0 conda-forge typing_extensions 3.7.4.2 py_0 conda-forge tzcode 2020a h516909a_0 conda-forge unidecode 1.1.1 py_0 conda-forge uritools 3.0.0 py37hc8dfbb8_1 conda-forge urllib3 1.25.9 py_0 
conda-forge urlquote 1.1.4 py37hc8dfbb8_1 conda-forge vcrpy 4.0.2 py_0 conda-forge virtualenv 16.7.5 py_0 conda-forge watermark 2.0.2 py_0 conda-forge wcwidth 0.1.9 pyh9f0ad1d_0 conda-forge webencodings 0.5.1 py_1 conda-forge websocket-client 0.57.0 py37hc8dfbb8_1 conda-forge wheel 0.34.2 py_1 conda-forge widgetsnbextension 3.5.1 py37_0 conda-forge wrapt 1.11.2 pypi_0 pypi xarray 0.15.1 py_0 conda-forge xerces-c 3.2.2 h8412b87_1004 conda-forge xlrd 1.2.0 py_0 conda-forge xlsxwriter 1.2.8 py_0 conda-forge xlwt 1.3.0 py_1 conda-forge xorg-kbproto 1.0.7 h14c3975_1002 conda-forge xorg-libice 1.0.10 h516909a_0 conda-forge xorg-libsm 1.2.3 h84519dc_1000 conda-forge xorg-libx11 1.6.9 h516909a_0 conda-forge xorg-libxau 1.0.9 h14c3975_0 conda-forge xorg-libxdmcp 1.1.3 h516909a_0 conda-forge xorg-libxext 1.3.4 h516909a_0 conda-forge xorg-libxpm 3.5.13 h516909a_0 conda-forge xorg-libxrender 0.9.10 h516909a_1002 conda-forge xorg-libxt 1.1.5 h516909a_1003 conda-forge xorg-renderproto 0.11.1 h14c3975_1002 conda-forge xorg-xextproto 7.3.0 h14c3975_1002 conda-forge xorg-xproto 7.0.31 h14c3975_1007 conda-forge xz 5.2.5 h516909a_0 conda-forge yaml 0.1.7 h14c3975_1001 conda-forge yapf 0.29.0 py_0 conda-forge yarl 1.3.0 py37h516909a_1000 conda-forge zeromq 4.3.2 he1b5a44_2 conda-forge zict 2.0.0 py_0 conda-forge zipp 3.1.0 py_0 conda-forge zlib 1.2.11 h516909a_1006 conda-forge zstandard 0.13.0 py37he1b5a44_0 conda-forge zstd 1.4.4 h6597ccf_3 conda-forge

Description of expected behavior and the observed behavior

Attempted to read a parquet file using spatialpandas read_parquet_dask.
ddf = read_parquet_dask("s3://bucket/path/*/*.parquet")

Expected successful completion.

Observed behavior:
Failed with error.
ParamValidationError: Parameter validation failed: Invalid bucket name "['s3', 's3a']

Complete, minimal, self-contained example code that reproduces the issue

ddf = read_parquet_dask("s3://bucket/path/*/*.parquet")

Stack traceback and/or browser JavaScript console output

ParamValidationError: Parameter validation failed: Invalid bucket name "['s3', 's3a']
---------------------------------------------------------------------------
ParamValidationError                      Traceback (most recent call last)
in
----> 1 ddf = read_parquet_dask(output + "/*/*.parquet")
      2 print(type(ddf))
      3 print(len(ddf))

~/miniconda3/envs/default/lib/python3.7/site-packages/spatialpandas/io/parquet.py in read_parquet_dask(path, columns, filesystem, load_divisions, geometry, bounds, categories)
232 path, columns, filesystem,
233 load_divisions=load_divisions, geometry=geometry, bounds=bounds,
--> 234 categories=categories
235 )
236

~/miniconda3/envs/default/lib/python3.7/site-packages/spatialpandas/io/parquet.py in _perform_read_parquet_dask(paths, columns, filesystem, load_divisions, geometry, bounds, categories)
254 datasets = [pa.parquet.ParquetDataset(
255 path, filesystem=filesystem, validate_schema=False
--> 256 ) for path in paths]
257
258 # Create delayed partition for each piece

~/miniconda3/envs/default/lib/python3.7/site-packages/spatialpandas/io/parquet.py in <listcomp>(.0)
254 datasets = [pa.parquet.ParquetDataset(
255 path, filesystem=filesystem, validate_schema=False
--> 256 ) for path in paths]
257
258 # Create delayed partition for each piece

~/miniconda3/envs/default/lib/python3.7/site-packages/pyarrow/parquet.py in __init__(self, path_or_paths, filesystem, schema, metadata, split_row_groups, validate_schema, filters, metadata_nthreads, read_dictionary, memory_map, buffer_size, partitioning, use_legacy_dataset)
1157 self.metadata_path) = _make_manifest(
1158 path_or_paths, self.fs, metadata_nthreads=metadata_nthreads,
-> 1159 open_file_func=partial(_open_dataset_file, self._metadata)
1160 )
1161

~/miniconda3/envs/default/lib/python3.7/site-packages/pyarrow/parquet.py in _make_manifest(path_or_paths, fs, pathsep, metadata_nthreads, open_file_func)
1331 path_or_paths = path_or_paths[0]
1332
-> 1333 if _is_path_like(path_or_paths) and fs.isdir(path_or_paths):
1334 manifest = ParquetManifest(path_or_paths, filesystem=fs,
1335 open_file_func=open_file_func,

~/miniconda3/envs/default/lib/python3.7/site-packages/s3fs/core.py in isdir(self, path)
599
600 # This only returns things within the path and NOT the path object itself
--> 601 return bool(self._lsdir(path))
602
603 def ls(self, path, detail=False, refresh=False, **kwargs):

~/miniconda3/envs/default/lib/python3.7/site-packages/s3fs/core.py in _lsdir(self, path, refresh, max_items)
392 files = []
393 dircache = []
--> 394 for i in it:
395 dircache.extend(i.get('CommonPrefixes', []))
396 for c in i.get('Contents', []):

~/miniconda3/envs/default/lib/python3.7/site-packages/botocore/paginate.py in __iter__(self)
253 self._inject_starting_params(current_kwargs)
254 while True:
--> 255 response = self._make_request(current_kwargs)
256 parsed = self._extract_parsed_response(response)
257 if first_request:

~/miniconda3/envs/default/lib/python3.7/site-packages/botocore/paginate.py in _make_request(self, current_kwargs)
330
331 def _make_request(self, current_kwargs):
--> 332 return self._method(**current_kwargs)
333
334 def _extract_parsed_response(self, response):

~/miniconda3/envs/default/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
314 "%s() only accepts keyword arguments." % py_operation_name)
315 # The "self" in this scope is referring to the BaseClient.
--> 316 return self._make_api_call(operation_name, kwargs)
317
318 _api_call.__name__ = str(py_operation_name)

~/miniconda3/envs/default/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
597 }
598 request_dict = self._convert_to_request_dict(
--> 599 api_params, operation_model, context=request_context)
600
601 service_id = self._service_model.service_id.hyphenize()

~/miniconda3/envs/default/lib/python3.7/site-packages/botocore/client.py in _convert_to_request_dict(self, api_params, operation_model, context)
643 context=None):
644 api_params = self._emit_api_params(
--> 645 api_params, operation_model, context)
646 request_dict = self._serializer.serialize_to_request(
647 api_params, operation_model)

~/miniconda3/envs/default/lib/python3.7/site-packages/botocore/client.py in _emit_api_params(self, api_params, operation_model, context)
675 service_id=service_id,
676 operation_name=operation_name),
--> 677 params=api_params, model=operation_model, context=context)
678 return api_params
679

~/miniconda3/envs/default/lib/python3.7/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
354 def emit(self, event_name, **kwargs):
355 aliased_event_name = self._alias_event_name(event_name)
--> 356 return self._emitter.emit(aliased_event_name, **kwargs)
357
358 def emit_until_response(self, event_name, **kwargs):

~/miniconda3/envs/default/lib/python3.7/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
226 handlers.
227 """
--> 228 return self._emit(event_name, kwargs)
229
230 def emit_until_response(self, event_name, **kwargs):

~/miniconda3/envs/default/lib/python3.7/site-packages/botocore/hooks.py in _emit(self, event_name, kwargs, stop_on_response)
209 for handler in handlers_to_call:
210 logger.debug('Event %s: calling handler %s', event_name, handler)
--> 211 response = handler(**kwargs)
212 responses.append((handler, response))
213 if stop_on_response and response is not None:

~/miniconda3/envs/default/lib/python3.7/site-packages/botocore/handlers.py in validate_bucket_name(params, **kwargs)
234 'the regex "%s" or be an ARN matching the regex "%s"' % (
235 bucket, VALID_BUCKET.pattern, VALID_S3_ARN.pattern))
--> 236 raise ParamValidationError(report=error_msg)
237
238

ParamValidationError: Parameter validation failed:
Invalid bucket name "['s3', 's3a']:": Bucket name must match the regex "^[a-zA-Z0-9.-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:s3:[a-z-0-9]+:[0-9]{12}:accesspoint[/:][a-zA-Z0-9-]{1,63}$"

This shares the same cause as issue #32. The function _maybe_prepend_protocol attempts to prepend the protocol, but for s3 it now attempts to prepend a protocol list "['s3', 's3a']:", which results in an invalid URL.

The relevant code in https://github.com/holoviz/spatialpandas/blob/master/spatialpandas/io/parquet.py :

    # Expand glob
    if len(path) == 1 and ('*' in path[0] or '?' in path[0] or '[' in path[0]):
        path = filesystem.glob(path[0])
        path = _maybe_prepend_protocol(path, filesystem)

...
def _maybe_prepend_protocol(paths, filesystem):
    if filesystem.protocol not in ("file", "abstract"):
        # Add back prefix (e.g. s3://)
        paths = [
            "{proto}://{p}".format(proto=filesystem.protocol, p=p) for p in paths
        ]
    return paths

For now, the workaround is to expand the glob and prepend the protocol yourself before passing the paths into the function.

ENH: Fixed width types

So far, spatialpandas supports "ragged" geometry types where the representation of the geometry objects in each row may differ in length (e.g. polygons with variable number of vertices). These types are backed by a pyarrow ListArray.

It would also be nice to provide a more efficient representation of fixed size geometry objects, in particular a single point per row. Another use case would be representing axis-aligned boxes using two points.

One way to represent these would be to use pyarrow extension types backed by a fixed width binary storage type.

@jorisvandenbossche does this sound like a reasonable way to handle fixed length geometry types with pyarrow? Or would there be anything more straightforward?
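To illustrate why a fixed width layout is attractive, here is a minimal sketch in plain Python. It is not the pyarrow extension type API; it only shows the storage idea, assuming each point is stored as two float64 values in a 16-byte record. The helper names `pack_points` and `get_point` are hypothetical.

```python
import struct

# Each point is two little-endian float64 values: a fixed 16-byte record.
POINT = struct.Struct("<2d")


def pack_points(points):
    """Pack (x, y) pairs into one contiguous bytes buffer."""
    return b"".join(POINT.pack(x, y) for x, y in points)


def get_point(buffer, i):
    """Read the i-th point directly from its byte position."""
    return POINT.unpack_from(buffer, i * POINT.size)


buf = pack_points([(0.0, 1.0), (2.5, -3.0), (10.0, 20.0)])
n_points = len(buf) // POINT.size
second = get_point(buf, 1)
print(n_points, second)
```

Because every record has the same width, the position of row `i` is simply `i * width`, so no per-row offsets array is needed, unlike the ragged list-backed types.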

Fixes required for new release

I am planning to get this into a state suitable for a new release. There are many test failures:

  • Pandas extension array related failures.
  • Parquet related failures (#84 may be relevant).
  • Geometry intersection issues (#61 may be relevant).
  • Shapely deprecation warnings, partly fixed by #85.

I am also planning to drop Python 3.6 support and add 3.9 and 3.10.

This is too much to do in a single PR, so I will open multiple targeted PRs, each of which will fail part of CI until they are all combined.

Graceful handling of NoneTypes

An error is generated when converting a GeoPandas dataframe containing NoneTypes in the geometry column to a spatialpandas GeoDataFrame. For example:
ValueError: Received invalid value of type NoneType. Must be an instance of Point, or MultiPoint

It would be nice if NoneTypes were handled gracefully.

More intuitive error when lsuffix == rsuffix on sjoin

Right now, calling joined = spd.sjoin(sddf, sdf, how='inner', lsuffix='leftsuffix', rsuffix='rightsuffix') results in RecursionError: maximum recursion depth exceeded while calling a Python object.

It'd be nice to have an exception message saying something like "lsuffix and rsuffix must not be equal".
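A minimal sketch of such a check is shown below. The helper `check_sjoin_suffixes` is hypothetical and not part of spatialpandas; it only illustrates validating the two suffixes up front so that the caller sees a clear message instead of a RecursionError.

```python
def check_sjoin_suffixes(lsuffix, rsuffix):
    # Hypothetical helper: fail early with a readable message.
    if lsuffix == rsuffix:
        raise ValueError(
            "lsuffix and rsuffix must not be equal (both are {!r})".format(lsuffix)
        )
    return lsuffix, rsuffix


try:
    check_sjoin_suffixes("same", "same")
except ValueError as exc:
    message = str(exc)
    print(message)
```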

Tests in `test_parquet` failing with `TypeError: argument of type 'PosixPath' is not iterable`

Multiple tests in test_parquet are failing with the same error message, an example of which is below:

____________________________________________________________ test_parquet ____________________________________________________________

tmp_path = PosixPath('/tmp/pytest-of-blarsen/pytest-0/test_parquet0')

    @given(
>       gp_point=st_point_array(min_size=1, geoseries=True),
        gp_multipoint=st_multipoint_array(min_size=1, geoseries=True),
        gp_multiline=st_multiline_array(min_size=1, geoseries=True),
    )
    @hyp_settings
    def test_parquet(gp_point, gp_multipoint, gp_multiline, tmp_path):

tests/test_parquet.py:24:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/test_parquet.py:43: in test_parquet
    df_read = read_parquet(path, columns=['point', 'multipoint', 'multiline', 'a'])
spatialpandas/io/parquet.py:103: in read_parquet
    filesystem = validate_coerce_filesystem(path, filesystem)
spatialpandas/io/utils.py:19: in validate_coerce_filesystem
    return fsspec.open(path).fs
../../miniconda3/envs/default/lib/python3.7/site-packages/fsspec/core.py:378: in open
    **kwargs
../../miniconda3/envs/default/lib/python3.7/site-packages/fsspec/core.py:222: in open_files
    chain = _un_chain(urlpath, kwargs)
../../miniconda3/envs/default/lib/python3.7/site-packages/fsspec/core.py:269: in _un_chain
    bits = [_un_chain(p, kwargs) for p in path]
../../miniconda3/envs/default/lib/python3.7/site-packages/fsspec/core.py:269: in <listcomp>
    bits = [_un_chain(p, kwargs) for p in path]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = PosixPath('/tmp/pytest-of-blarsen/pytest-0/test_parquet0/df.parq'), kwargs = {}

    def _un_chain(path, kwargs):
        if isinstance(path, (tuple, list)):
            bits = [_un_chain(p, kwargs) for p in path]
            out = []
            for pbit in zip(*bits):
                paths, protocols, kwargs = zip(*pbit)
                if len(set(protocols)) > 1:
                    raise ValueError("Protocol mismatch in URL chain")
                if len(set(paths)) == 1:
                    paths = paths[0]
                else:
                    paths = list(paths)
                out.append([paths, protocols[0], kwargs[0]])
            return out
        x = re.compile(".*[^a-z]+.*")  # test for non protocol-like single word
        bits = (
            [p if "://" in p or x.match(p) else p + "://" for p in path.split("::")]
>           if "::" in path
            else [path]
        )
E       TypeError: argument of type 'PosixPath' is not iterable

../../miniconda3/envs/default/lib/python3.7/site-packages/fsspec/core.py:284: TypeError

It appears that fsspec does not support Path objects. A simple fix would be to cast the Path object to str before passing to the function being tested.
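The cast could be sketched like this, using only the standard library. The helper name `as_str_path` is hypothetical; it converts any `os.PathLike` object (or a list of them) to plain strings and leaves everything else, such as URL strings, untouched.

```python
import os
from pathlib import PurePosixPath


def as_str_path(path):
    # Hypothetical helper: normalise path-like inputs to str.
    if isinstance(path, (list, tuple)):
        return [as_str_path(p) for p in path]
    if isinstance(path, os.PathLike):
        return os.fspath(path)
    return path


converted = as_str_path(PurePosixPath("/tmp/test_parquet0/df.parq"))
unchanged = as_str_path("s3://bucket/path/df.parq")
print(converted, unchanged)
```

Doing the conversion in the test is the smallest change; doing it where the library first receives the path would instead let callers pass Path objects directly.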

Ring.to_shapely not working

The to_shapely method on Ring geometries does not seem to work:

from spatialpandas.geometry import Ring

Ring([0, 0, 1, 1, 2, 0, 0, 0]).to_shapely()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-70-4983d69e78fb> in <module>
      1 from spatialpandas.geometry import Ring
      2 
----> 3 Ring([0, 0, 1, 1, 2, 0, 0, 0]).to_shapely()

~/development/spatialpandas/spatialpandas/geometry/ring.py in to_shapely(self)
     30         """
     31         import shapely.geometry as sg
---> 32         line_coords = self.data.to_numpy()
     33         return sg.LinearRing(line_coords.reshape(len(line_coords) // 2, 2))
     34 

AttributeError: 'pyarrow.lib.ListValue' object has no attribute 'to_numpy'

Versions:

  • spatialpandas 0.1
  • pyarrow 0.15.1
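For reference, the reshape in the traceback turns a flat coordinate list into (x, y) pairs. The sketch below shows that step in plain Python, assuming the coordinates have first been obtained as an ordinary list (for example via `as_py()`, which other spatialpandas code calls on pyarrow values). The helper `flat_to_pairs` is hypothetical.

```python
def flat_to_pairs(flat_coords):
    # Hypothetical helper: [x0, y0, x1, y1, ...] -> [(x0, y0), (x1, y1), ...]
    if len(flat_coords) % 2 != 0:
        raise ValueError("expected an even number of coordinate values")
    return list(zip(flat_coords[0::2], flat_coords[1::2]))


pairs = flat_to_pairs([0, 0, 1, 1, 2, 0, 0, 0])
print(pairs)
```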

linearrings TypeError using shapely 2.0

Running CI using the recently released shapely 2.0 reveals two failures:

spatialpandas/tests/geometry/test_to_geopandas.py::test_polygon_array_to_geopandas FAILED                    [ 85%]
spatialpandas/tests/geometry/test_to_geopandas.py::test_multipolygon_array_to_geopandas FAILED               [100%]

with the error message

E           TypeError: ufunc 'linearrings' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
E           Falsifying example: test_polygon_array_to_geopandas(
E               gp_polygon=0    POLYGON ((-40.58410 -3.68172, -25.93910 27.960...
E               dtype: geometry,
E           )

../../.miniconda/envs/sp/lib/python3.10/site-packages/shapely/creation.py:173: TypeError

This is a spatialpandas bug rather than a shapely one, caused by some inappropriate use of np.asarray(..., dtype=object), which was introduced recently to address numpy deprecation warnings about converting ragged arrays.

Pandas 2.2.0 rc0 emits a warning

When trying out Pandas 2.2.0rc0 in holoviz/holoviews#6074, I found the following warning started to be emitted: Passing a SingleBlockManager to GeoSeries is deprecated and will raise in a future version. Use public APIs instead.

Integrate with Kartothek

Is your feature request related to a problem? Please describe.

SpatialPandas helps spatially sort data but we are seeing the need for higher level arbitrary indexing. Two example use cases:

  • Geospatial. We have spatially sorted daily GPS data for the US for multiple days. Getting a small region for a 60-90 day process can get bogged down by the need to read 60-90 separate metadata files and construct the task graph.

  • Astronomy. We have spatial data for multiple filters (HSC-Y, HSC-G etc). Again we would have to read multiple metadata files.

Describe the solution you'd like

The above could be fixed by building higher level indexes. I think we can benefit from integrating with kartothek. It enables an O(1) index and creates the necessary task graphs for reading just the partitions required. It could also be used to store the extra metadata spatialpandas currently stores in its own format (if I'm understanding spatialpandas correctly).

I'm at the Dask Dev Conference with some of the Kartothek devs and based on conversations with fjetter this integration should be possible.

`pack_partitions_to_parquet` stalling on concat_parts.

ALL software version info

(this library, plus any other relevant software, e.g. bokeh, python, notebook, OS, browser, etc)
python 3.8 (I think I tried 3.7 as well, need to check)
spatialpandas 0.3.5
dask 2.11.0
distributed 2.11.0

Description of expected behavior and the observed behavior

Attempting to read in some astronomy data, create a spatial index and then pack the spatially sorted data to parquet.

pack_partitions_to_parquet keeps stalling on concat_parts. Everything seems to process fine until then.

It does create a directory structure with the correct number of partition folders, and it does create small parquet files, e.g.

test_sorted/part.0.parquet/part.0.parquet
test_sorted/part.1.parquet/part.0.parquet
...

Some folders have multiple parts, but they are still tiny.

Complete, minimal, self-contained example code that reproduces the issue

Unzip the parquet file below:
test.parq.zip

import dask.dataframe as dd
from dask.distributed import Client, LocalCluster

from spatialpandas import GeoDataFrame
from spatialpandas.geometry import PointArray

# mainly to get a dask diagnostic dashboard
cluster = LocalCluster()
client = Client(cluster)
client

ddf = dd.read_parquet('test.parq')
ddf2 = ddf.map_partitions(
        lambda df: GeoDataFrame(dict(
            position=PointArray(df[['ra', 'dec']]),
            **{col: df[col] for col in df.columns}
        ))
    )
ddf_packed = ddf2.pack_partitions_to_parquet('test_sorted.parq')

Screenshots or screencasts of the bug in action

image

GeometryArray.__getitem__ does not preserve dtype

When indexing a GeometryArray, the buffer_values (and flat_values) of the returned Geometry object do not have the same dtype as those of the array itself:

In [1]: from spatialpandas.geometry import LineArray

In [2]: line_array = LineArray([
   ...:     [1, 2, 3, 4, 5, 6],
   ...:     [7, 8, 9, 10],
   ...:     [11, 12, 13, 14, 15, 16],
   ...: ], dtype="i2")

In [3]: line_array.buffer_values.dtype
Out[3]: dtype('int16')

In [4]: line_array[1].buffer_values.dtype
Out[4]: dtype('int64')

In [5]: line_array[1].flat_values.dtype
Out[5]: dtype('int64')

Tests failing with `ValueError: Cannot mask with a boolean indexer containing NA values`

The following tests fail with this error: ValueError: Cannot mask with a boolean indexer containing NA values

TestGeometryGetitem.test_getitem_boolean_na_treated_as_false
self = <tests.test_fixedextensionarray.TestGeometryGetitem object at 0x7fe965159410>
data = <PointArray>
[  Point([0.0, 1.0]),   Point([1.0, 2.0]),   Point([3.0, 4.0]),
                None, Point([-1.0, -2.0])...  Point([1.0, 2.0]),   Point([3.0, 4.0]),                None,
 Point([-1.0, -2.0])]
Length: 100, dtype: point[float64]

    def test_getitem_boolean_na_treated_as_false(self, data):
        # https://github.com/pandas-dev/pandas/issues/31503
        mask = pd.array(np.zeros(data.shape, dtype="bool"), dtype="boolean")
        mask[:2] = pd.NA
        mask[2:4] = True

>       result = data[mask]

../../miniconda3/envs/default/lib/python3.7/site-packages/pandas/tests/extension/base/getitem.py:167:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <PointArray>
[  Point([0.0, 1.0]),   Point([1.0, 2.0]),   Point([3.0, 4.0]),
                None, Point([-1.0, -2.0])...  Point([1.0, 2.0]),   Point([3.0, 4.0]),                None,
 Point([-1.0, -2.0])]
Length: 100, dtype: point[float64]
item = <BooleanArray>
[ <NA>,  <NA>,  True,  True, False, False, False, False, False, False, False,
 False, False, False, Fal...alse,
 False, False, False, False, False, False, False, False, False, False, False,
 False]
Length: 100, dtype: boolean

    def __getitem__(self, item):
        err_msg = ("Only integers, slices and integer or boolean"
                   "arrays are valid indices.")
        if isinstance(item, Integral):
            item = int(item)
            if item < -len(self) or item >= len(self):
                raise IndexError("{item} is out of bounds".format(item=item))
            else:
                # Convert negative item index
                if item < 0:
                    item += len(self)

                value = self.data[item].as_py()
                if value is not None:
                    return self._element_type(value, self.numpy_dtype)
                else:
                    return None
        elif isinstance(item, slice):
            if item.step is None or item.step == 1:
                # pyarrow only supports slice with step of 1
                return self.__class__(self.data[item], dtype=self.dtype)
            else:
                selected_indices = np.arange(len(self))[item]
                return self.take(selected_indices, allow_fill=False)
        elif isinstance(item, Iterable):
            if isinstance(item, (np.ndarray, ExtensionArray)):
                # Leave numpy and pandas arrays alone
                kind = item.dtype.kind
            else:
                item = pd.array(item)
                kind = item.dtype.kind

            if len(item) == 0:
                return self.take([], allow_fill=False)
            elif kind == 'b':
                # Check mask length is compatible
                if len(item) != len(self):
                    raise IndexError(
                        "boolean mask length ({}) doesn't match array length ({})"
                        .format(len(item), len(self))
                    )

                # check for NA values
                if any(pd.isna(item)):
                    raise ValueError(
>                       "Cannot mask with a boolean indexer containing NA values"
                    )
E                   ValueError: Cannot mask with a boolean indexer containing NA values

spatialpandas/geometry/base.py:385: ValueError
TestGeometryGetitem.test_getitem_boolean_na_treated_as_false
self = <tests.test_listextensionarray.TestGeometryGetitem object at 0x7fe964ebe810>
data = <LineArray>
[          Line([0.0, 1.0]), Line([1.0, 2.0, 3.0, 4.0]),
                       None,         Line([-1.0, ...                       None,
         Line([-1.0, -2.0]),                   Line([])]
Length: 100, dtype: line[float64]

    def test_getitem_boolean_na_treated_as_false(self, data):
        # https://github.com/pandas-dev/pandas/issues/31503
        mask = pd.array(np.zeros(data.shape, dtype="bool"), dtype="boolean")
        mask[:2] = pd.NA
        mask[2:4] = True

>       result = data[mask]

../../miniconda3/envs/default/lib/python3.7/site-packages/pandas/tests/extension/base/getitem.py:167:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <LineArray>
[          Line([0.0, 1.0]), Line([1.0, 2.0, 3.0, 4.0]),
                       None,         Line([-1.0, ...                       None,
         Line([-1.0, -2.0]),                   Line([])]
Length: 100, dtype: line[float64]
item = <BooleanArray>
[ <NA>,  <NA>,  True,  True, False, False, False, False, False, False, False,
 False, False, False, Fal...alse,
 False, False, False, False, False, False, False, False, False, False, False,
 False]
Length: 100, dtype: boolean

    def __getitem__(self, item):
        err_msg = ("Only integers, slices and integer or boolean"
                   "arrays are valid indices.")
        if isinstance(item, Integral):
            item = int(item)
            if item < -len(self) or item >= len(self):
                raise IndexError("{item} is out of bounds".format(item=item))
            else:
                # Convert negative item index
                if item < 0:
                    item += len(self)

                value = self.data[item].as_py()
                if value is not None:
                    return self._element_type(value, self.numpy_dtype)
                else:
                    return None
        elif isinstance(item, slice):
            if item.step is None or item.step == 1:
                # pyarrow only supports slice with step of 1
                return self.__class__(self.data[item], dtype=self.dtype)
            else:
                selected_indices = np.arange(len(self))[item]
                return self.take(selected_indices, allow_fill=False)
        elif isinstance(item, Iterable):
            if isinstance(item, (np.ndarray, ExtensionArray)):
                # Leave numpy and pandas arrays alone
                kind = item.dtype.kind
            else:
                item = pd.array(item)
                kind = item.dtype.kind

            if len(item) == 0:
                return self.take([], allow_fill=False)
            elif kind == 'b':
                # Check mask length is compatible
                if len(item) != len(self):
                    raise IndexError(
                        "boolean mask length ({}) doesn't match array length ({})"
                        .format(len(item), len(self))
                    )

                # check for NA values
                if any(pd.isna(item)):
                    raise ValueError(
>                       "Cannot mask with a boolean indexer containing NA values"
                    )
E                   ValueError: Cannot mask with a boolean indexer containing NA values

spatialpandas/geometry/base.py:385: ValueError
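The test name indicates that missing entries in a boolean mask are expected to be treated as False rather than rejected. The semantics can be sketched in plain Python, with `None` standing in for `pd.NA`. This is an illustration only, not the spatialpandas or pandas implementation, and `take_with_mask` is a hypothetical helper.

```python
def take_with_mask(values, mask):
    # Hypothetical helper: select values where the mask is exactly True.
    # Missing entries (None here, standing in for pd.NA) count as False.
    if len(mask) != len(values):
        raise IndexError(
            "boolean mask length ({}) doesn't match array length ({})".format(
                len(mask), len(values)
            )
        )
    return [v for v, m in zip(values, mask) if m is True]


selected = take_with_mask(
    ["p0", "p1", "p2", "p3", "p4"],
    [None, None, True, True, False],
)
print(selected)
```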
Installed packages
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                      1_llvm    conda-forge
abseil-cpp                20200225.2           he1b5a44_0    conda-forge
adal                      1.2.4              pyh9f0ad1d_0    conda-forge
aiohttp                   3.6.2            py37h516909a_0    conda-forge
alabaster                 0.7.12                     py_0    conda-forge
alembic                   1.4.2              pyh9f0ad1d_0    conda-forge
appdirs                   1.4.3                      py_1    conda-forge
arrow-cpp                 0.17.1          py37h1234567_10_cpu    conda-forge
asn1crypto                1.3.0            py37hc8dfbb8_1    conda-forge
astroid                   2.3.3                    pypi_0    pypi
async-timeout             3.0.1                   py_1000    conda-forge
async_generator           1.10                       py_0    conda-forge
attrs                     19.3.0                     py_0    conda-forge
aws-logging-handlers      2.0.3                    pypi_0    pypi
aws-sdk-cpp               1.7.164              hc831370_1    conda-forge
awscli                    1.18.94          py37hc8dfbb8_0    conda-forge
babel                     2.8.0                      py_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
bandit                    1.6.2                    py37_0    conda-forge
beautifulsoup4            4.9.1            py37hc8dfbb8_0    conda-forge
black                     19.10b0                  py37_0    conda-forge
bleach                    3.1.5              pyh9f0ad1d_0    conda-forge
blessed                   1.17.5                   pypi_0    pypi
blinker                   1.4                        py_1    conda-forge
blosc                     1.19.0               he1b5a44_0    conda-forge
bokeh                     1.4.0                      py_0    bokeh
boost-cpp                 1.72.0               h8e57a91_0    conda-forge
boto                      2.49.0                     py_0    conda-forge
boto3                     1.14.17            pyh9f0ad1d_0    conda-forge
botocore                  1.17.17            pyh9f0ad1d_0    conda-forge
bottleneck                1.3.2            py37h03ebfcd_1    conda-forge
brotli                    1.0.7             he1b5a44_1002    conda-forge
brotlipy                  0.7.0           py37h8f50634_1000    conda-forge
bzip2                     1.0.8                h516909a_2    conda-forge
c-ares                    1.15.0            h516909a_1001    conda-forge
ca-certificates           2020.6.20            hecda079_0    conda-forge
cachetools                4.1.1                      py_0    conda-forge
cairo                     1.16.0            hcf35c78_1003    conda-forge
cartopy                   0.17.0          py37hd759880_1006    conda-forge
certifi                   2020.6.20        py37hc8dfbb8_0    conda-forge
certipy                   0.1.3                      py_0    conda-forge
cffi                      1.14.0           py37hd463f26_0    conda-forge
cfgv                      3.1.0                      py_0    conda-forge
cfitsio                   3.470                hb60a0a2_2    conda-forge
cftime                    1.2.0            py37h03ebfcd_1    conda-forge
chardet                   3.0.4           py37hc8dfbb8_1006    conda-forge
click                     7.1.2              pyh9f0ad1d_0    conda-forge
click-plugins             1.1.1                      py_0    conda-forge
cligj                     0.5.0                      py_0    conda-forge
cloudpickle               1.4.1                      py_0    conda-forge
colorama                  0.4.3                      py_0    conda-forge
colorcet                  2.0.2                      py_0    pyviz/label/dev
conda-lock                0.2.2                    pypi_0    pypi
configurable-http-proxy   4.2.1           node13_he01fd0c_0    conda-forge
coverage                  5.2              py37h8f50634_0    conda-forge
croniter                  0.3.30                     py_0    conda-forge
cryptography              2.9.2            py37hb09aad4_0    conda-forge
cssselect                 1.1.0                      py_0    conda-forge
curl                      7.68.0               hf8cf82a_0    conda-forge
cycler                    0.10.0                     py_2    conda-forge
cython                    0.29.20          py37h3340039_0    conda-forge
cytoolz                   0.10.1           py37h516909a_0    conda-forge
dask                      2.20.0                     py_0    conda-forge
dask-core                 2.20.0                     py_0    conda-forge
dask-glm                  0.2.0                      py_1    conda-forge
dask-kubernetes           0.10.1                     py_0    conda-forge
dask-labextension         2.0.1                    pypi_0    pypi
dask-ml                   1.5.0                      py_0    conda-forge
datashader                0.11.0                     py_0    pyviz/label/dev
datashape                 0.5.4                      py_1    conda-forge
dbus                      1.13.6               he372182_0    conda-forge
decorator                 4.4.2                      py_0    conda-forge
defusedxml                0.6.0                      py_0    conda-forge
descartes                 1.1.0                      py_4    conda-forge
dill                      0.3.2              pyh9f0ad1d_0    conda-forge
distlib                   0.3.1              pyh9f0ad1d_0    conda-forge
distributed               2.20.0           py37hc8dfbb8_0    conda-forge
docker-py                 4.2.2            py37hc8dfbb8_0    conda-forge
docker-pycreds            0.4.0                      py_0    conda-forge
docutils                  0.15.2                   py37_0    conda-forge
dodgy                     0.2.1                    pypi_0    pypi
editdistance              0.5.3            py37h3340039_0    conda-forge
entrypoints               0.3             py37hc8dfbb8_1001    conda-forge
et_xmlfile                1.0.1                   py_1001    conda-forge
expat                     2.2.9                he1b5a44_2    conda-forge
fastparquet               0.4.0            py37h03ebfcd_0    pyviz
filelock                  3.0.12             pyh9f0ad1d_0    conda-forge
fiona                     1.8.6            py37hf242f0b_3    conda-forge
flake8                    3.8.3              pyh9f0ad1d_0    conda-forge
flit                      2.3.0                      py_0    conda-forge
flit-core                 2.3.0                      py_0    conda-forge
fontconfig                2.13.1            h86ecdb6_1001    conda-forge
freetype                  2.10.2               he06d7ca_0    conda-forge
freexl                    1.0.5             h14c3975_1002    conda-forge
fribidi                   1.0.9                h516909a_0    conda-forge
fsspec                    0.7.4                      py_0    conda-forge
funcsigs                  1.0.2                      py_3    conda-forge
futures-compat            1.0                       py3_0    conda-forge
fuzzywuzzy                0.17.0                     py_0    conda-forge
fzf                       0.21.1               h375a9b1_1    conda-forge
gdal                      2.4.1            py37h5f563d9_8    conda-forge
geographiclib             1.50                       py_0    conda-forge
geopandas                 0.6.3                      py_0    conda-forge
geopy                     2.0.0              pyh9f0ad1d_0    conda-forge
geos                      3.7.2                he1b5a44_2    conda-forge
geotiff                   1.4.3             hb6868eb_1001    conda-forge
geoviews                  1.7.0                      py_0    pyviz/label/dev
geoviews-core             1.7.0                      py_0    pyviz/label/dev
gettext                   0.19.8.1          hc5be6a0_1002    conda-forge
gflags                    2.2.2             he1b5a44_1002    conda-forge
giflib                    5.1.7                h516909a_1    conda-forge
gitdb                     4.0.5                      py_0    conda-forge
gitpython                 3.1.3                      py_0    conda-forge
glib                      2.65.0               h6f030ca_0    conda-forge
glog                      0.4.0                h49b9bf7_3    conda-forge
google-auth               1.17.2                     py_0    conda-forge
googlemaps                2.5.1                      py_0    conda-forge
graphite2                 1.3.13            he1b5a44_1001    conda-forge
graphviz                  2.42.3               h0511662_0    conda-forge
grpc-cpp                  1.30.0               h9ea6770_0    conda-forge
gst-plugins-base          1.14.5               h0935bb2_2    conda-forge
gstreamer                 1.14.5               h36ae1b5_2    conda-forge
h5py                      2.10.0          nompi_py37h513d04c_102    conda-forge
harfbuzz                  2.4.0                h9f30f68_3    conda-forge
haversine                 2.2.0                      py_0    conda-forge
hdf4                      4.2.13            hf30be14_1003    conda-forge
hdf5                      1.10.5          nompi_h3c11f04_1104    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
holoviews                 1.13.3                     py_0    pyviz/label/dev
html5lib                  1.1                pyh9f0ad1d_0    conda-forge
hvplot                    0.6.0                      py_0    pyviz/label/dev
hypothesis                5.19.0                     py_0    conda-forge
icu                       64.2                 he1b5a44_1    conda-forge
identify                  1.4.21             pyh9f0ad1d_0    conda-forge
idna                      2.10               pyh9f0ad1d_0    conda-forge
imagesize                 1.2.0                      py_0    conda-forge
importlib-metadata        1.7.0            py37hc8dfbb8_0    conda-forge
importlib_metadata        1.7.0                         0    conda-forge
importnb                  0.6.1            py37hc8dfbb8_2    conda-forge
ipykernel                 5.3.1            py37h43977f1_0    conda-forge
ipython                   7.16.1           py37h43977f1_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                7.5.1                      py_0    conda-forge
isort                     5.0.4            py37hc8dfbb8_0    conda-forge
jdcal                     1.4.1                      py_0    conda-forge
jedi                      0.17.1           py37hc8dfbb8_0    conda-forge
jinja2                    2.11.2             pyh9f0ad1d_0    conda-forge
jmespath                  0.10.0             pyh9f0ad1d_0    conda-forge
joblib                    0.16.0                     py_0    conda-forge
jpeg                      9d                   h516909a_0    conda-forge
json-c                    0.13.1            hbfbb72e_1002    conda-forge
json5                     0.9.4              pyh9f0ad1d_0    conda-forge
jsonschema                3.2.0            py37hc8dfbb8_1    conda-forge
jupyter                   1.0.0                      py_2    conda-forge
jupyter-archive           0.5.5                      py_0    conda-forge
jupyter-server-proxy      1.5.0                      py_0    conda-forge
jupyter_bokeh             1.1.1                      py_0    bokeh
jupyter_client            5.3.4                    py37_1    conda-forge
jupyter_console           6.1.0                      py_1    conda-forge
jupyter_core              4.6.3            py37hc8dfbb8_1    conda-forge
jupyter_telemetry         0.0.5                      py_0    conda-forge
jupyterhub                1.1.0                    py37_2    conda-forge
jupyterhub-base           1.1.0                    py37_2    conda-forge
jupyterlab                1.2.7                      py_0    conda-forge
jupyterlab-git            0.10.0                   pypi_0    pypi
jupyterlab-s3-browser     0.4.1                    pypi_0    pypi
jupyterlab_code_formatter 1.3.1                      py_0    conda-forge
jupyterlab_server         1.2.0                      py_0    conda-forge
kartothek                 3.10.0                     py_0    conda-forge
kealib                    1.4.13               hec59c27_0    conda-forge
kiwisolver                1.2.0            py37h99015e2_0    conda-forge
krb5                      1.16.4               h2fd8d38_0    conda-forge
kubernetes                1.18.5                        0    conda-forge
kubernetes-client         1.18.5               haa36a5b_0    conda-forge
kubernetes-node           1.18.5               haa36a5b_0    conda-forge
kubernetes-server         1.18.5               haa36a5b_0    conda-forge
kubernetes_asyncio        11.3.0             pyh9f0ad1d_0    conda-forge
lazy-object-proxy         1.4.3                    pypi_0    pypi
ld_impl_linux-64          2.34                 h53a641e_5    conda-forge
libblas                   3.8.0               17_openblas    conda-forge
libcblas                  3.8.0               17_openblas    conda-forge
libclang                  9.0.1           default_hde54327_0    conda-forge
libcurl                   7.68.0               hda55be3_0    conda-forge
libdap4                   3.20.4               hd3bb157_0    conda-forge
libedit                   3.1.20191231         h46ee950_0    conda-forge
libevent                  2.1.10               hcdb4288_1    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc-ng                 9.2.0                h24d8f2e_2    conda-forge
libgdal                   2.4.1                heae24aa_8    conda-forge
libgfortran-ng            7.5.0                hdf63c60_6    conda-forge
libiconv                  1.15              h516909a_1006    conda-forge
libkml                    1.3.0             hb574062_1011    conda-forge
liblapack                 3.8.0               17_openblas    conda-forge
libllvm8                  8.0.1                hc9558a2_0    conda-forge
libllvm9                  9.0.1                he513fc3_1    conda-forge
libnetcdf                 4.6.2             h303dfb8_1003    conda-forge
libopenblas               0.3.10               h5ec1e0e_0    conda-forge
libpng                    1.6.37               hed695b0_1    conda-forge
libpostal                 1.1.0                h38d415c_2    conda-forge
libpq                     11.5                 hd9ab2ff_2    conda-forge
libprotobuf               3.12.3               h8b12597_0    conda-forge
libsodium                 1.0.17               h516909a_0    conda-forge
libspatialindex           1.9.3                he1b5a44_3    conda-forge
libspatialite             4.3.0a            h79dc798_1030    conda-forge
libssh2                   1.8.2                h22169c7_2    conda-forge
libstdcxx-ng              9.2.0                hdf63c60_2    conda-forge
libtiff                   4.1.0                hc7e4089_6    conda-forge
libtool                   2.4.6             h14c3975_1002    conda-forge
libuuid                   2.32.1            h14c3975_1000    conda-forge
libuv                     1.34.0               h516909a_0    conda-forge
libwebp-base              1.1.0                h516909a_3    conda-forge
libxcb                    1.13              h14c3975_1002    conda-forge
libxkbcommon              0.10.0               he1b5a44_0    conda-forge
libxml2                   2.9.10               hee79883_0    conda-forge
libxslt                   1.1.33               h31b3aaa_0    conda-forge
llvm-openmp               10.0.0               hc9558a2_0    conda-forge
llvmlite                  0.31.0           py37h5202443_1    conda-forge
locket                    0.2.0                      py_2    conda-forge
loguru                    0.5.0            py37hc8dfbb8_0    conda-forge
lxml                      4.5.1            py37he3881c9_0    conda-forge
lz4-c                     1.9.2                he1b5a44_1    conda-forge
lzo                       2.10              h14c3975_1000    conda-forge
mako                      1.1.0                      py_0    conda-forge
markdown                  3.2.2                      py_0    conda-forge
markupsafe                1.1.1            py37h8f50634_1    conda-forge
marshmallow               3.6.1                      py_0    conda-forge
marshmallow-oneofschema   2.0.1                      py_0    conda-forge
matplotlib                3.2.2                         1    conda-forge
matplotlib-base           3.2.2            py37h30547a4_0    conda-forge
mccabe                    0.6.1                      py_1    conda-forge
milksnake                 0.1.5                      py_0    conda-forge
mistune                   0.8.4           py37h8f50634_1001    conda-forge
mock                      4.0.2            py37hc8dfbb8_0    conda-forge
more-itertools            8.4.0                      py_0    conda-forge
msgpack-python            1.0.0            py37h99015e2_1    conda-forge
multidict                 4.7.5            py37h8f50634_1    conda-forge
multipledispatch          0.6.0                      py_0    conda-forge
munch                     2.5.0                      py_0    conda-forge
mypy                      0.782                      py_0    conda-forge
mypy_extensions           0.4.3            py37hc8dfbb8_1    conda-forge
namegenerator             1.0.6                    pypi_0    pypi
nb_conda_kernels          2.2.3                    py37_0    conda-forge
nbconvert                 5.6.1            py37hc8dfbb8_1    conda-forge
nbdime                    1.1.0                    pypi_0    pypi
nbformat                  5.0.7                      py_0    conda-forge
nbval                     0.9.5                      py_0    conda-forge
ncurses                   6.1               hf484d3e_1002    conda-forge
netcdf4                   1.5.1.2          py37h73a1b54_1    conda-forge
nodeenv                   1.4.0              pyh9f0ad1d_0    conda-forge
nodejs                    13.13.0              hf5d1a2b_0    conda-forge
notebook                  6.0.3            py37hc8dfbb8_1    conda-forge
nspr                      4.26                 he1b5a44_0    conda-forge
nss                       3.47                 he751ad9_0    conda-forge
numba                     0.48.0           py37hb3f55d8_0    conda-forge
numexpr                   2.7.1            py37h0da4684_1    conda-forge
numpy                     1.18.5           py37h8960a57_0    conda-forge
oauthlib                  3.0.1                      py_0    conda-forge
olefile                   0.46                       py_0    conda-forge
openjpeg                  2.3.1                h981e76c_3    conda-forge
openpyxl                  3.0.4                      py_0    conda-forge
openssl                   1.1.1g               h516909a_0    conda-forge
owslib                    0.19.2                     py_0    conda-forge
packaging                 20.4               pyh9f0ad1d_0    conda-forge
pamela                    1.0.0                      py_0    conda-forge
pandas                    1.0.5            py37h0da4684_0    conda-forge
pandoc                    2.10                          0    conda-forge
pandocfilters             1.4.2                      py_1    conda-forge
panel                     0.8.3                      py_0    pyviz/label/dev
pango                     1.42.4               h7062337_4    conda-forge
param                     1.10.0a3                   py_0    pyviz/label/dev
parquet-cpp               1.5.1                         2    conda-forge
parso                     0.7.0              pyh9f0ad1d_0    conda-forge
partd                     1.1.0                      py_0    conda-forge
pathspec                  0.8.0              pyh9f0ad1d_0    conda-forge
patsy                     0.5.1                      py_0    conda-forge
pbr                       5.4.5              pyh9f0ad1d_0    conda-forge
pcre                      8.44                 he1b5a44_0    conda-forge
pendulum                  2.1.0            py37hc8dfbb8_1    conda-forge
pep8-naming               0.4.1                    pypi_0    pypi
pexpect                   4.8.0            py37hc8dfbb8_1    conda-forge
pickleshare               0.7.5           py37hc8dfbb8_1001    conda-forge
pillow                    7.2.0            py37h718be6c_0    conda-forge
pip                       20.1.1                     py_1    conda-forge
pixman                    0.38.0            h516909a_1003    conda-forge
pluggy                    0.13.1           py37hc8dfbb8_2    conda-forge
polygon-geohasher         0.0.1                    pypi_0    pypi
poppler                   0.67.0               h14e79db_8    conda-forge
poppler-data              0.4.9                         1    conda-forge
postal                    1.1.8                    pypi_0    pypi
postgresql                11.5                 hc63931a_2    conda-forge
pre-commit                2.6.0            py37hc8dfbb8_0    conda-forge
prefect                   0.12.2                     py_0    conda-forge
proj4                     5.2.0             he1b5a44_1006    conda-forge
prometheus_client         0.8.0              pyh9f0ad1d_0    conda-forge
prompt-toolkit            3.0.5                      py_1    conda-forge
prompt_toolkit            3.0.5                         1    conda-forge
prospector                1.2.0                    pypi_0    pypi
psutil                    5.7.0            py37h8f50634_1    conda-forge
pthread-stubs             0.4               h14c3975_1001    conda-forge
ptyprocess                0.6.0                   py_1001    conda-forge
py                        1.9.0              pyh9f0ad1d_0    conda-forge
pyarrow                   0.17.1          py37h1234567_10_cpu    conda-forge
pyasn1                    0.4.8                      py_0    conda-forge
pyasn1-modules            0.2.7                      py_0    conda-forge
pycodestyle               2.4.0                    pypi_0    pypi
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pyct                      0.4.7a3                    py_0    pyviz/label/dev
pyct-core                 0.4.7a3                    py_0    pyviz/label/dev
pycurl                    7.43.0.5         py37hce7685b_2    conda-forge
pydantic                  1.5.1            py37h8f50634_0    conda-forge
pydocstyle                5.0.2                      py_0    conda-forge
pyepsg                    0.4.0                      py_0    conda-forge
pyflakes                  2.2.0              pyh9f0ad1d_0    conda-forge
pygments                  2.6.1                      py_0    conda-forge
pyjwt                     1.7.1                      py_0    conda-forge
pykdtree                  1.3.1           py37h03ebfcd_1003    conda-forge
pylama                    7.7.1                    pypi_0    pypi
pylint                    2.4.4                    pypi_0    pypi
pylint-celery             0.3                      pypi_0    pypi
pylint-django             2.0.12                   pypi_0    pypi
pylint-flask              0.6                      pypi_0    pypi
pylint-plugin-utils       0.6                      pypi_0    pypi
pyopenssl                 19.1.0                     py_1    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyproj                    1.9.6           py37h516909a_1002    conda-forge
pyqt                      5.12.3           py37h8685d9f_3    conda-forge
pyqt5-sip                 4.19.18                  pypi_0    pypi
pyqtchart                 5.12                     pypi_0    pypi
pyqtwebengine             5.12.1                   pypi_0    pypi
pyroma                    2.6                      pypi_0    pypi
pyrsistent                0.16.0           py37h8f50634_0    conda-forge
pyshp                     2.1.0                      py_0    conda-forge
pysocks                   1.7.1            py37hc8dfbb8_1    conda-forge
pytables                  3.6.1            py37h9f153d1_1    conda-forge
pytest                    5.4.3            py37hc8dfbb8_0    conda-forge
pytest-cov                2.10.0             pyh9f0ad1d_0    conda-forge
pytest-vcr                1.0.2                    pypi_0    pypi
python                    3.7.6           cpython_h8356626_6    conda-forge
python-blosc              1.9.1            py37h0da4684_0    conda-forge
python-box                4.2.3                      py_0    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python-docx               0.8.10                   pypi_0    pypi
python-dotenv             0.14.0             pyh9f0ad1d_0    conda-forge
python-editor             1.0.4                      py_0    conda-forge
python-geohash            0.8.5            py37h3340039_1    conda-forge
python-graphviz           0.14               pyh9f0ad1d_0    conda-forge
python-json-logger        0.1.11                     py_0    conda-forge
python-kubernetes         11.0.0           py37hc8dfbb8_0    conda-forge
python-levenshtein        0.12.0          py37h516909a_1001    conda-forge
python-slugify            4.0.1              pyh9f0ad1d_0    conda-forge
python-snappy             0.5.4            py37h7cfaab3_1    conda-forge
python_abi                3.7                     1_cp37m    pyviz
pytoml                    0.1.21                     py_0    conda-forge
pytz                      2020.1             pyh9f0ad1d_0    conda-forge
pytzdata                  2019.3                     py_0    conda-forge
pyviz_comms               0.7.6                      py_0    pyviz/label/dev
pyyaml                    5.1.2            py37h516909a_0    conda-forge
pyzmq                     19.0.1           py37hac76be4_0    conda-forge
qgrid                     1.3.0                    pypi_0    pypi
qt                        5.12.5               hd8c4c69_1    conda-forge
qtconsole                 4.7.5              pyh9f0ad1d_0    conda-forge
qtpy                      1.9.0                      py_0    conda-forge
re2                       2020.07.06           he1b5a44_0    conda-forge
readline                  8.0                  hf8c457e_0    conda-forge
regex                     2020.6.8         py37h8f50634_0    conda-forge
requests                  2.24.0             pyh9f0ad1d_0    conda-forge
requests-oauthlib         1.2.0                      py_0    conda-forge
requests_download         0.1.2                      py_1    conda-forge
requirements-detector     0.6                      pypi_0    pypi
retrying                  1.3.3                    pypi_0    pypi
rsa                       3.4.2                      py_1    conda-forge
rtree                     0.9.4            py37h8526d28_1    conda-forge
ruamel.yaml               0.16.6           py37h8f50634_1    conda-forge
ruamel.yaml.clib          0.2.0            py37h8f50634_1    conda-forge
s3fs                      0.4.2                      py_0    conda-forge
s3transfer                0.3.3            py37hc8dfbb8_1    conda-forge
scikit-learn              0.23.1           py37h8a51577_0    conda-forge
scipy                     1.5.0            py37ha3d9a3c_0    conda-forge
seaborn                   0.10.1                        1    conda-forge
seaborn-base              0.10.1                     py_1    conda-forge
send2trash                1.5.0                      py_0    conda-forge
setoptconf                0.2.0                    pypi_0    pypi
setuptools                49.1.0           py37hc8dfbb8_0    conda-forge
shapely                   1.6.4           py37hec07ddf_1006    conda-forge
simpervisor               0.3                        py_1    conda-forge
simplejson                3.17.0           py37h8f50634_1    conda-forge
simplekv                  0.14.1             pyh9f0ad1d_0    conda-forge
singleton-decorator       1.0.0                    pypi_0    pypi
six                       1.15.0             pyh9f0ad1d_0    conda-forge
smartystreets-python-sdk  4.4.1                    pypi_0    pypi
smmap                     3.0.4              pyh9f0ad1d_0    conda-forge
snappy                    1.1.8                he1b5a44_3    conda-forge
snowballstemmer           2.0.0                      py_0    conda-forge
sortedcontainers          2.2.2              pyh9f0ad1d_0    conda-forge
soupsieve                 2.0.1            py37hc8dfbb8_0    conda-forge
spatialpandas             0.3.4.post2+gefdabe5          pypi_0    pypi
sphinx                    3.1.2                      py_0    conda-forge
sphinxcontrib-applehelp   1.0.2                      py_0    conda-forge
sphinxcontrib-devhelp     1.0.2                      py_0    conda-forge
sphinxcontrib-htmlhelp    1.0.3                      py_0    conda-forge
sphinxcontrib-jsmath      1.0.1                      py_0    conda-forge
sphinxcontrib-qthelp      1.0.3                      py_0    conda-forge
sphinxcontrib-serializinghtml 1.1.4                      py_0    conda-forge
sqlalchemy                1.3.18           py37h8f50634_0    conda-forge
sqlite                    3.32.3               hcee41ef_0    conda-forge
statsmodels               0.11.1           py37h8f50634_2    conda-forge
stevedore                 1.30.1                     py_0    conda-forge
storefact                 0.10.0                     py_0    conda-forge
tabulate                  0.8.7              pyh9f0ad1d_0    conda-forge
tblib                     1.6.0                      py_0    conda-forge
terminado                 0.8.3            py37hc8dfbb8_1    conda-forge
testpath                  0.4.4                      py_0    conda-forge
text-unidecode            1.3                        py_0    conda-forge
threadpoolctl             2.1.0              pyh5ca1d4c_0    conda-forge
thrift                    0.11.0          py37he1b5a44_1001    conda-forge
thrift-cpp                0.13.0               h62aa4f2_2    conda-forge
tk                        8.6.10               hed695b0_0    conda-forge
toml                      0.10.1             pyh9f0ad1d_0    conda-forge
toolz                     0.10.0                     py_0    conda-forge
tornado                   6.0.4            py37h8f50634_1    conda-forge
tqdm                      4.47.0             pyh9f0ad1d_0    conda-forge
traitlets                 4.3.3            py37hc8dfbb8_1    conda-forge
typed-ast                 1.4.1            py37h516909a_0    conda-forge
typesentry                0.2.7                    pypi_0    pypi
typing_extensions         3.7.4.2                    py_0    conda-forge
tzcode                    2020a                h516909a_0    conda-forge
unidecode                 1.1.1                      py_0    conda-forge
uritools                  3.0.0            py37hc8dfbb8_1    conda-forge
urllib3                   1.25.9                     py_0    conda-forge
urlquote                  1.1.4            py37hc8dfbb8_1    conda-forge
vcrpy                     4.0.2                      py_0    conda-forge
virtualenv                20.0.20          py37hc8dfbb8_1    conda-forge
watermark                 2.0.2                      py_0    conda-forge
wcwidth                   0.2.5              pyh9f0ad1d_0    conda-forge
webencodings              0.5.1                      py_1    conda-forge
websocket-client          0.57.0           py37hc8dfbb8_1    conda-forge
wheel                     0.34.2                     py_1    conda-forge
widgetsnbextension        3.5.1                    py37_0    conda-forge
wrapt                     1.11.2                   pypi_0    pypi
xarray                    0.15.1                     py_0    conda-forge
xerces-c                  3.2.2             h8412b87_1004    conda-forge
xlrd                      1.2.0                      py_0    conda-forge
xlsxwriter                1.2.9              pyh9f0ad1d_0    conda-forge
xlwt                      1.3.0                      py_1    conda-forge
xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
xorg-libice               1.0.10               h516909a_0    conda-forge
xorg-libsm                1.2.3             h84519dc_1000    conda-forge
xorg-libx11               1.6.9                h516909a_0    conda-forge
xorg-libxau               1.0.9                h14c3975_0    conda-forge
xorg-libxdmcp             1.1.3                h516909a_0    conda-forge
xorg-libxext              1.3.4                h516909a_0    conda-forge
xorg-libxpm               3.5.13               h516909a_0    conda-forge
xorg-libxrender           0.9.10            h516909a_1002    conda-forge
xorg-libxt                1.1.5             h516909a_1003    conda-forge
xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
xorg-xproto               7.0.31            h14c3975_1007    conda-forge
xz                        5.2.5                h516909a_0    conda-forge
yaml                      0.1.7             h14c3975_1001    conda-forge
yapf                      0.29.0                     py_0    conda-forge
yarl                      1.3.0           py37h516909a_1000    conda-forge
zeromq                    4.3.2                he1b5a44_2    conda-forge
zict                      2.0.0                      py_0    conda-forge
zipp                      3.1.0                      py_0    conda-forge
zlib                      1.2.11            h516909a_1006    conda-forge
zstandard                 0.14.0           py37h3340039_0    conda-forge
zstd                      1.4.4                h6597ccf_3    conda-forge

Error on running overview.ipynb on a Kubernetes cluster

I was running the notebook on the ocean.pangeo.io deployment on GCP. I modified the cluster setup as below:

from dask.distributed import Client, progress

from dask_kubernetes import KubeCluster
cluster = KubeCluster(n_workers=16)
cluster

client = Client(cluster)
client

When I ran the next cell of code:
`import pandas as pd
import dask.dataframe as dd
reps = 10000

# Large geopandas GeoDataFrame
cities_large_gp = pd.concat([cities_gp] * reps, axis=0)

# Large spatialpandas GeoDataFrame
cities_large_df = pd.concat([cities_df] * reps, axis=0)

# Large spatialpandas DaskGeoDataFrame with 16 partitions
cities_large_ddf = dd.from_pandas(cities_large_df, npartitions=16).persist()

# Precompute the partition-level spatial index
cities_large_ddf.partition_sindex`
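
For context, the traceback below suggests that `partition_sindex` first computes one bounding box per partition (columns `x0`, `y0`, `x1`, `y1`), and that this is the step that runs on the workers. A minimal pure-Python sketch of that idea follows. The point data and the helper names `total_bounds` and `partitions_intersecting` are hypothetical, and a plain linear scan stands in for the `HilbertRtree` that spatialpandas builds, so this is an illustration of the concept rather than the library's implementation.

```python
# Hypothetical stand-in for a partitioned point dataset: each inner list is one
# partition of (x, y) points. Real spatialpandas partitions hold geometry arrays.
partitions = [
    [(0.0, 0.0), (2.0, 1.0)],
    [(5.0, 4.0), (6.0, 9.0), (5.5, 3.0)],
]


def total_bounds(points):
    """Return the (x0, y0, x1, y1) box enclosing all points of one partition."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# One bounding box per partition, mirroring the x0/y0/x1/y1 columns
# that appear in the traceback.
partition_bounds = [total_bounds(p) for p in partitions]


def partitions_intersecting(bounds, query):
    """Return indices of partitions whose box overlaps the query box.

    This is a linear scan; an R-tree answers the same question faster
    when there are many partitions.
    """
    qx0, qy0, qx1, qy1 = query
    return [
        i
        for i, (x0, y0, x1, y1) in enumerate(bounds)
        if x0 <= qx1 and qx0 <= x1 and y0 <= qy1 and qy0 <= y1
    ]
```

With such an index, a spatial query only needs to touch the partitions returned by `partitions_intersecting`. Building it requires every partition to be computed once, which is why an error on the workers surfaces at this step.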

I got the following errors:


KilledWorker Traceback (most recent call last)
in
12
13 # Precompute the partition-level spatial index
---> 14 cities_large_ddf.partition_sindex

/srv/conda/envs/notebook/lib/python3.7/site-packages/spatialpandas/dask.py in partition_sindex(self)
145 geometry._partition_bounds = self._partition_bounds[geometry_name]
146
--> 147 self._partition_sindex[geometry.name] = geometry.partition_sindex
148 self._partition_bounds[geometry_name] = geometry.partition_bounds
149 return self._partition_sindex[geometry_name]

/srv/conda/envs/notebook/lib/python3.7/site-packages/spatialpandas/dask.py in partition_sindex(self)
66 def partition_sindex(self):
67 if self._partition_sindex is None:
---> 68 self._partition_sindex = HilbertRtree(self.partition_bounds.values)
69 return self._partition_sindex
70

/srv/conda/envs/notebook/lib/python3.7/site-packages/spatialpandas/dask.py in partition_bounds(self)
48 if self._partition_bounds is None:
49 self._partition_bounds = self.map_partitions(
---> 50 lambda s: pd.DataFrame(
51 [s.total_bounds], columns=['x0', 'y0', 'x1', 'y1']
52 )

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask/base.py in compute(self, **kwargs)
163 dask.base.compute
164 """
--> 165 (result,) = compute(self, traverse=False, **kwargs)
166 return result
167

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask/base.py in compute(*args, **kwargs)
434 keys = [x.dask_keys() for x in collections]
435 postcomputes = [x.dask_postcompute() for x in collections]
--> 436 results = schedule(dsk, keys, **kwargs)
437 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
438

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
2571 should_rejoin = False
2572 try:
-> 2573 results = self.gather(packed, asynchronous=asynchronous, direct=direct)
2574 finally:
2575 for f in futures.values():

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous)
1871 direct=direct,
1872 local_worker=local_worker,
-> 1873 asynchronous=asynchronous,
1874 )
1875

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
766 else:
767 return sync(
--> 768 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
769 )
770

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
332 if error[0]:
333 typ, exc, tb = error[0]
--> 334 raise exc.with_traceback(tb)
335 else:
336 return result[0]

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/utils.py in f()
316 if callback_timeout is not None:
317 future = gen.with_timeout(timedelta(seconds=callback_timeout), future)
--> 318 result[0] = yield future
319 except Exception as exc:
320 error[0] = sys.exc_info()

/srv/conda/envs/notebook/lib/python3.7/site-packages/tornado/gen.py in run(self)
733
734 try:
--> 735 value = future.result()
736 except Exception:
737 exc_info = sys.exc_info()

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
1727 exc = CancelledError(key)
1728 else:
-> 1729 raise exception.with_traceback(traceback)
1730 raise exc
1731 if errors == "skip":

KilledWorker: ("('from_pandas-0638bd8b066e3449279e712ce7cd8a44', 9)", <Worker 'tcp://10.32.7.6:41973', memory: 0, processing: 4>)

DaskDataFrame to DaskGeoDataFrame

Is your feature request related to a problem? Please describe.

Cross-posting from Discourse, since I'm not sure spatialpandas members watch it:
https://discourse.holoviz.org/t/spatialpandas-daskdataframe-to-daskgeodataframe/706

In spatialpandas, dd.from_pandas() automatically creates a DaskGeoDataFrame from a GeoDataFrame. I have a DaskDataFrame with point data stored as coordinate columns and would like to create a DaskGeoDataFrame from it without loading the data into memory. If this is not yet possible, can you think of a good way to implement this functionality? Thanks!

Describe the solution you'd like

A utility method like

from_dask_dataframe(existing_dask_dataframe, geometry=geom_col_name)

Describe alternatives you've considered

Nothing that has worked.

So far I'm able to map partitions and create GeoSeries

import dask.dataframe as dd
import spatialpandas as spd
def to_gs(df):
    points = spd.geometry.PointArray((df['x'], df['y']))
    spgs = spd.GeoSeries(points, index=df.index)
    return spgs

ddf = dd.read_parquet(bigfiles)
spdgeom = ddf.map_partitions(to_gs)
gsddf = ddf.assign(geometry=spdgeom)
gsddf.dtypes

-------------------
id            string
x             float64
y             float64
geometry       point[float64]

Seems like a step away from DaskGeoDataFrame, but can't find a way. Any pointers are appreciated.

Additional context

ValueError: Keyword 'validate_schema' is not yet supported with the new Dataset API

Hey guys,

I'm hitting this ValueError, which I assume #92 was meant to fix, but a validate_schema=False remains in the following snippet and should probably have been removed:

# Load using pyarrow to handle parquet files and directories across filesystems
df = pq.ParquetDataset(
    path,
    filesystem=filesystem,
    validate_schema=False,
    **engine_kwargs,
    **kwargs,
).read(columns=columns).to_pandas()

Replacing:

validate_schema=False,

by:

#validate_schema=False,
use_legacy_dataset=False,

as was done in #92, did the trick for me.
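The same change can also be expressed as a small keyword filter applied before the arguments reach pq.ParquetDataset. This is only a sketch: the helper name drop_legacy_only_kwargs and the LEGACY_ONLY set are hypothetical and are not part of spatialpandas or pyarrow, and validate_schema is the only keyword that the error message in this report confirms is rejected.

```python
# Hypothetical helper, not part of spatialpandas or pyarrow.
# Keywords rejected by the new Dataset API; only validate_schema is
# confirmed by the error message in this report.
LEGACY_ONLY = frozenset({"validate_schema"})


def drop_legacy_only_kwargs(kwargs):
    """Return a copy of kwargs without keywords the new Dataset API rejects."""
    return {key: value for key, value in kwargs.items() if key not in LEGACY_ONLY}


# Example: the offending keyword is removed, everything else is passed through.
dataset_kwargs = drop_legacy_only_kwargs(
    {"validate_schema": False, "filesystem": None}
)
print(dataset_kwargs)
```

The filtered dictionary could then be unpacked into the pq.ParquetDataset call in place of the hard-coded keyword.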

ALL software version info

macOS
Python 3.10
geopandas 0.12.0
spatialpandas 0.4.6
pyarrow 11.0.0
dask 2023.2.0

Complete, minimal, self-contained example code that reproduces the issue

Here is my case:

1) Convert a geopandas GeoDataFrame into a spatialpandas one
2) Convert it into a DaskGeoDataFrame
3) Write it to disk with to_parquet_dask()
4) Read it with read_parquet_dask()
5) ... any call on the DaskGeoDataFrame that will call compute() triggers the bug

Stack traceback and/or browser JavaScript console output

Traceback (most recent call last):
  File "MyProj/romain/dashboard/app.py", line 31, in <module>
    edges_ddf["highway"] = edges_ddf["highway"].cat.as_known()
  File "MyProj/testenv/lib/python3.10/site-packages/dask/dataframe/categorical.py", line 218, in as_known
    categories = self._property_map("categories").unique().compute(**kwargs)
  File "MyProj/testenv/lib/python3.10/site-packages/dask/base.py", line 314, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "MyProj/testenv/lib/python3.10/site-packages/dask/base.py", line 599, in compute
    results = schedule(dsk, keys, **kwargs)
  File "MyProj/testenv/lib/python3.10/site-packages/dask/threaded.py", line 89, in get
    results = get_async(
  File "MyProj/testenv/lib/python3.10/site-packages/dask/local.py", line 511, in get_async
    raise_exception(exc, tb)
  File "MyProj/testenv/lib/python3.10/site-packages/dask/local.py", line 319, in reraise
    raise exc
  File "MyProj/testenv/lib/python3.10/site-packages/dask/local.py", line 224, in execute_task
    result = _execute_task(task, data)
  File "MyProj/testenv/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "MyProj/testenv/lib/python3.10/site-packages/dask/core.py", line 119, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "MyProj/testenv/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "MyProj/testenv/lib/python3.10/site-packages/dask/utils.py", line 72, in apply
    return func(*args, **kwargs)
  File "MyProj/testenv/lib/python3.10/site-packages/spatialpandas/io/parquet.py", line 175, in read_parquet
    df = pq.ParquetDataset(
  File "MyProj/testenv/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 1763, in __new__
    return _ParquetDatasetV2(
  File "MyProj/testenv/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 2394, in __init__
    raise ValueError(
ValueError: Keyword 'validate_schema' is not yet supported with the new Dataset API

Numba Deprecation Warning

When importing spatialpandas, I encountered the following warning:

lib/python3.7/site-packages/spatialpandas/spatialindex/rtree.py:263: NumbaDeprecationWarning: The 'numba.jitclass' decorator has moved to 'numba.experimental.jitclass' to better reflect the experimental nature of the functionality. Please update your imports to accommodate this change and see http://numba.pydata.org/numba-doc/latest/reference/deprecation.html#change-of-jitclass-location for the time frame.
@jitclass(_numbartree_spec)

Versions:

numba                     0.50.1           py37h0da4684_0    conda-forge
spatialpandas             0.3.4.post2+gefdabe5          pypi_0    pypi

Spatialpandas was installed from git master branch.

From the linked documentation...
Simply update imports as follows:
Change from numba import jitclass to from numba.experimental import jitclass
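If the code also has to keep working on numba releases from before the move, a guarded import is one option. This is a sketch, not necessarily the change that was made in spatialpandas; the final fallback to None is an assumption added only so that the snippet also runs where numba is not installed.

```python
try:
    # New location named in the deprecation warning
    from numba.experimental import jitclass
except ImportError:
    try:
        # Old location, for numba releases that predate numba.experimental
        from numba import jitclass
    except ImportError:
        # numba is not installed at all (fallback added for this sketch only)
        jitclass = None
```

Code that decorates classes with @jitclass can then keep using the same name regardless of which import succeeded.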

Process for creating new release

@philippjfr, @jbednar,

I tried to create a new release, as I thought I had done before, but the build failed. It appears that the version information that param is finding is not what I intended.
What is the correct process for creating a release?

Thanks in advance for your guidance and support.

HilbertRtree query is slower than Rtree and Pygeos

For me the sindex is one of the most interesting parts of this project. I was able to run some tests using only geometry bounds and numpy arrays.

from spatialpandas.spatialindex import HilbertRtree

The creation of the HilbertRtree is vectorized and relatively fast, however queries are not vectorized and really slow (2x slower compared to Rtree and 10x slower compared to Pygeos). Is there anything that can be done to improve scalability and decrease time complexity?

Also, could someone please explain how the sindex is distributed, more specifically how the partition_sindex works. For example, how is the tree creation distributed? Each right dataframe partition has its own sindex? Is it necessary to query each right sindex with all left index partitions (n^2)? Is the sindex serializable and thread safe?
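On the partition-level question, the traceback in the Kubernetes report earlier on this page shows partition_sindex being built as HilbertRtree(self.partition_bounds.values), where partition_bounds holds one total_bounds row (x0, y0, x1, y1) per partition. The pure-Python sketch below illustrates that idea only. It uses a brute-force overlap test in place of HilbertRtree, and the function names and sample points are made up for illustration. Whether spatialpandas uses these bounds to prune partition pairs in a join is not something the traceback shows.

```python
def total_bounds(points):
    """Bounding box (x0, y0, x1, y1) of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_intersect(a, b):
    """True if two (x0, y0, x1, y1) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_partitions(partition_bounds, query_box):
    """Indices of partitions whose bounds intersect query_box (brute force)."""
    return [
        i for i, box in enumerate(partition_bounds)
        if boxes_intersect(box, query_box)
    ]


# Three made-up partitions of point data
partitions = [
    [(0.0, 0.0), (1.0, 2.0)],
    [(5.0, 5.0), (6.0, 7.0)],
    [(10.0, 0.0), (12.0, 1.0)],
]

# One bounding box per partition, analogous to the partition_bounds rows
partition_bounds = [total_bounds(p) for p in partitions]

# Only partitions whose box intersects the query need to be searched further
hits = candidate_partitions(partition_bounds, (0.5, 0.5, 5.5, 5.5))
print(hits)
```

With one box per partition, the partition-level structure stays small even when the partitions themselves are large, which is why it is cheap to hold on the client.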

re: solicitation for maintainers

Responding per this tweet, I'd be happy to contribute to this project. I'm a long-time user of the scientific/geospatial Python stack, and have plenty of experience contributing to open source Python projects. I wouldn't call myself a Dask power user, but I'm familiar enough with the library to have bumped up against the limitations of the Dask + Geopandas workflow, which is why I'm eager to contribute to this effort. I'm not sure that Cython-oriented tasks are where I would be most effective from the start, as I don't have much development experience with Cython, but hopefully this can be a good excuse for me to gain some.

Filesystem not yet consistent

ALL software version info

Current build
Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                      1_llvm    conda-forge
abseil-cpp                20200225.1           he1b5a44_2    conda-forge
adal                      1.2.2                      py_0    conda-forge
aiohttp                   3.6.2            py37h516909a_0    conda-forge
alabaster                 0.7.12                     py_0    conda-forge
alembic                   1.4.2              pyh9f0ad1d_0    conda-forge
appdirs                   1.4.3                      py_1    conda-forge
arrow-cpp                 0.16.0           py37h989c1cb_2    conda-forge
asn1crypto                1.3.0                    py37_0    conda-forge
astroid                   2.3.3                    pypi_0    pypi
async-timeout             3.0.1                   py_1000    conda-forge
async_generator           1.10                       py_0    conda-forge
attrs                     19.3.0                     py_0    conda-forge
aws-logging-handlers      2.0.3                    pypi_0    pypi
aws-sdk-cpp               1.7.164              hc831370_1    conda-forge
babel                     2.8.0                      py_0    conda-forge
backcall                  0.1.0                      py_0    conda-forge
bandit                    1.6.2                    py37_0    conda-forge
beautifulsoup4            4.8.2            py37hc8dfbb8_1    conda-forge
black                     19.10b0                  py37_0    conda-forge
bleach                    3.1.4              pyh9f0ad1d_0    conda-forge
blinker                   1.4                        py_1    conda-forge
blosc                     1.17.1               he1b5a44_0    conda-forge
bokeh                     1.4.0                      py_0    bokeh
boost-cpp                 1.72.0               h8e57a91_0    conda-forge
boto                      2.49.0                     py_0    conda-forge
boto3                     1.12.36            pyh9f0ad1d_0    conda-forge
botocore                  1.15.36            pyh9f0ad1d_0    conda-forge
bottleneck                1.3.2            py37h03ebfcd_1    conda-forge
brotli                    1.0.7             he1b5a44_1001    conda-forge
bzip2                     1.0.8                h516909a_2    conda-forge
c-ares                    1.15.0            h516909a_1001    conda-forge
ca-certificates           2020.4.5.1           hecc5488_0    conda-forge
cachetools                3.1.1                      py_0    conda-forge
cairo                     1.16.0            hcf35c78_1003    conda-forge
cartopy                   0.17.0          py37h6078e7d_1013    conda-forge
certifi                   2020.4.5.1       py37hc8dfbb8_0    conda-forge
certipy                   0.1.3                      py_0    conda-forge
cffi                      1.14.0           py37hd463f26_0    conda-forge
cfgv                      3.1.0                      py_0    conda-forge
cfitsio                   3.470                hb60a0a2_2    conda-forge
cftime                    1.1.1.2          py37h03ebfcd_0    conda-forge
chardet                   3.0.4           py37hc8dfbb8_1006    conda-forge
click                     7.1.1              pyh8c360ce_0    conda-forge
click-plugins             1.1.1                      py_0    conda-forge
cligj                     0.5.0                      py_0    conda-forge
cloudpickle               1.2.2                      py_1    conda-forge
colorama                  0.4.3                    pypi_0    pypi
colorcet                  2.0.2                      py_0    pyviz
configurable-http-proxy   4.2.1           node13_he01fd0c_0    conda-forge
coverage                  5.0.4            py37h8f50634_0    conda-forge
croniter                  0.3.30                     py_0    conda-forge
cryptography              2.8              py37hb09aad4_2    conda-forge
cssselect                 1.1.0                      py_0    conda-forge
curl                      7.68.0               hf8cf82a_0    conda-forge
cycler                    0.10.0                     py_2    conda-forge
cython                    0.29.16          py37h3340039_0    conda-forge
cytoolz                   0.10.1           py37h516909a_0    conda-forge
dask                      2.14.0                     py_0    conda-forge
dask-core                 2.14.0                     py_0    conda-forge
dask-kubernetes           0.10.1                     py_0    conda-forge
dask-labextension         2.0.1                    pypi_0    pypi
datashader                0.10.0                     py_0    pyviz
datashape                 0.5.4                      py_1    conda-forge
datum                     0.1.5                    pypi_0    pypi
dbus                      1.13.6               he372182_0    conda-forge
decorator                 4.4.2                      py_0    conda-forge
defusedxml                0.6.0                      py_0    conda-forge
descartes                 1.1.0                      py_4    conda-forge
distributed               2.14.0           py37hc8dfbb8_0    conda-forge
docker-py                 4.2.0            py37hc8dfbb8_0    conda-forge
docker-pycreds            0.4.0                      py_0    conda-forge
docutils                  0.15.2                   py37_0    conda-forge
dodgy                     0.2.1                    pypi_0    pypi
editdistance              0.5.3            py37h3340039_0    conda-forge
entrypoints               0.3             py37hc8dfbb8_1001    conda-forge
et_xmlfile                1.0.1                   py_1001    conda-forge
expat                     2.2.9                he1b5a44_2    conda-forge
fiona                     1.8.13           py37h900e953_0    conda-forge
flake8                    3.7.9            py37hc8dfbb8_1    conda-forge
fontconfig                2.13.1            h86ecdb6_1001    conda-forge
freetype                  2.10.1               he06d7ca_0    conda-forge
freexl                    1.0.5             h14c3975_1002    conda-forge
fribidi                   1.0.9                h516909a_0    conda-forge
fsspec                    0.7.1                      py_0    conda-forge
funcsigs                  1.0.2                      py_3    conda-forge
futures-compat            1.0                       py3_0    conda-forge
gdal                      3.0.4            py37h4b180d9_3    conda-forge
geographiclib             1.50                       py_0    conda-forge
geopandas                 0.7.0                      py_1    conda-forge
geopy                     1.21.0                     py_0    conda-forge
geos                      3.8.1                he1b5a44_0    conda-forge
geotiff                   1.5.1                hcbe54f9_9    conda-forge
geoviews                  1.6.6                      py_0    pyviz
geoviews-core             1.6.6                      py_0    pyviz
gettext                   0.19.8.1          hc5be6a0_1002    conda-forge
gflags                    2.2.2             he1b5a44_1002    conda-forge
giflib                    5.2.1                h516909a_2    conda-forge
gitdb                     4.0.2                      py_0    conda-forge
gitpython                 3.1.0                      py_0    conda-forge
glib                      2.58.3          py37he00f558_1003    conda-forge
glog                      0.4.0                he1b5a44_1    conda-forge
google-auth               1.12.0             pyh9f0ad1d_0    conda-forge
googlemaps                2.5.1                      py_0    conda-forge
graphite2                 1.3.13            he1b5a44_1001    conda-forge
graphviz                  2.42.3               h0511662_0    conda-forge
grpc-cpp                  1.28.1               h7397029_0    conda-forge
gst-plugins-base          1.14.5               h0935bb2_2    conda-forge
gstreamer                 1.14.5               h36ae1b5_2    conda-forge
h5py                      2.10.0          nompi_py37h513d04c_102    conda-forge
harfbuzz                  2.4.0                h9f30f68_3    conda-forge
haversine                 2.2.0                      py_0    conda-forge
hdf4                      4.2.13            hf30be14_1003    conda-forge
hdf5                      1.10.5          nompi_h3c11f04_1104    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
holoviews                 1.12.7                     py_0    pyviz
html5lib                  1.0.1                      py_0    conda-forge
hvplot                    0.5.2                      py_0    pyviz
hypothesis                5.8.0                      py_0    conda-forge
icu                       64.2                 he1b5a44_1    conda-forge
identify                  1.4.14             pyh9f0ad1d_0    conda-forge
idna                      2.9                        py_1    conda-forge
imageio                   2.8.0                      py_0    conda-forge
imagesize                 1.2.0                      py_0    conda-forge
importlib-metadata        1.6.0            py37hc8dfbb8_0    conda-forge
importlib_metadata        1.6.0                         0    conda-forge
importnb                  0.6.0                    py37_0    conda-forge
ipykernel                 5.2.0            py37h43977f1_1    conda-forge
ipython                   7.13.0           py37hc8dfbb8_2    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                7.5.1                      py_0    conda-forge
isort                     4.3.21           py37hc8dfbb8_1    conda-forge
jdcal                     1.4.1                      py_0    conda-forge
jedi                      0.16.0           py37hc8dfbb8_1    conda-forge
jinja2                    2.11.1                     py_0    conda-forge
jmespath                  0.9.5                      py_0    conda-forge
joblib                    0.14.1                     py_0    conda-forge
jpeg                      9c                h14c3975_1001    conda-forge
json-c                    0.13.1            h14c3975_1001    conda-forge
json5                     0.9.0                      py_0    conda-forge
jsonschema                3.2.0            py37hc8dfbb8_1    conda-forge
jupyter                   1.0.0                      py_2    conda-forge
jupyter-archive           0.5.5                      py_0    conda-forge
jupyter-server-proxy      1.3.2                      py_0    conda-forge
jupyter_bokeh             1.1.1                      py_0    bokeh
jupyter_client            5.3.4                    py37_1    conda-forge
jupyter_console           6.1.0                      py_1    conda-forge
jupyter_core              4.6.3            py37hc8dfbb8_1    conda-forge
jupyter_telemetry         0.0.5                      py_0    conda-forge
jupyterhub                1.1.0                    py37_2    conda-forge
jupyterhub-base           1.1.0                    py37_2    conda-forge
jupyterlab                1.2.7                      py_0    conda-forge
jupyterlab-git            0.10.0                   pypi_0    pypi
jupyterlab_server         1.1.0                      py_0    conda-forge
kartothek                 3.8.1                      py_0    conda-forge
kealib                    1.4.13               hec59c27_0    conda-forge
kiwisolver                1.2.0            py37h99015e2_0    conda-forge
krb5                      1.16.4               h2fd8d38_0    conda-forge
kubernetes                1.16.3               ha4a5029_0    conda-forge
kubernetes_asyncio        11.1.0             pyh8c360ce_0    conda-forge
lazy-object-proxy         1.4.3                    pypi_0    pypi
ld_impl_linux-64          2.34                 h53a641e_0    conda-forge
libblas                   3.8.0               16_openblas    conda-forge
libcblas                  3.8.0               16_openblas    conda-forge
libclang                  9.0.1           default_hde54327_0    conda-forge
libcurl                   7.68.0               hda55be3_0    conda-forge
libdap4                   3.20.4               hd3bb157_0    conda-forge
libedit                   3.1.20170329      hf8c457e_1001    conda-forge
libevent                  2.1.10               h72c5cf5_0    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc-ng                 9.2.0                h24d8f2e_2    conda-forge
libgdal                   3.0.4                h94bbfbd_3    conda-forge
libgfortran-ng            7.3.0                hdf63c60_5    conda-forge
libiconv                  1.15              h516909a_1006    conda-forge
libkml                    1.3.0             hb574062_1011    conda-forge
liblapack                 3.8.0               16_openblas    conda-forge
libllvm8                  8.0.1                hc9558a2_0    conda-forge
libllvm9                  9.0.1                hc9558a2_0    conda-forge
libnetcdf                 4.7.4           nompi_h9f9fd6a_101    conda-forge
libopenblas               0.3.9                h5ec1e0e_0    conda-forge
libpng                    1.6.37               hed695b0_1    conda-forge
libpq                     12.2                 hae5116b_0    conda-forge
libprotobuf               3.11.4               h8b12597_0    conda-forge
libsodium                 1.0.17               h516909a_0    conda-forge
libspatialindex           1.9.3                he1b5a44_3    conda-forge
libspatialite             4.3.0a            heb269f5_1037    conda-forge
libssh2                   1.8.2                h22169c7_2    conda-forge
libstdcxx-ng              9.2.0                hdf63c60_2    conda-forge
libtiff                   4.1.0                hc3755c2_3    conda-forge
libtool                   2.4.6             h14c3975_1002    conda-forge
libuuid                   2.32.1            h14c3975_1000    conda-forge
libuv                     1.34.0               h516909a_0    conda-forge
libwebp                   1.0.2                h56121f0_5    conda-forge
libxcb                    1.13              h14c3975_1002    conda-forge
libxkbcommon              0.10.0               he1b5a44_0    conda-forge
libxml2                   2.9.10               hee79883_0    conda-forge
libxslt                   1.1.33               h31b3aaa_0    conda-forge
llvm-openmp               9.0.1                hc9558a2_2    conda-forge
llvmlite                  0.31.0           py37h5202443_1    conda-forge
locket                    0.2.0                      py_2    conda-forge
loguru                    0.4.1                    py37_0    conda-forge
lxml                      4.5.0            py37he3881c9_1    conda-forge
lz4-c                     1.8.3             he1b5a44_1001    conda-forge
lzo                       2.10              h14c3975_1000    conda-forge
mako                      1.1.0                      py_0    conda-forge
markdown                  3.2.1                      py_0    conda-forge
markupsafe                1.1.1            py37h8f50634_1    conda-forge
marshmallow               3.5.0                      py_0    conda-forge
marshmallow-oneofschema   2.0.1                      py_0    conda-forge
matplotlib                3.2.1                         0    conda-forge
matplotlib-base           3.2.1            py37h30547a4_0    conda-forge
mccabe                    0.6.1                      py_1    conda-forge
milksnake                 0.1.5                      py_0    conda-forge
mistune                   0.8.4           py37h516909a_1000    conda-forge
mock                      3.0.5            py37hc8dfbb8_1    conda-forge
more-itertools            8.2.0                      py_0    conda-forge
msgpack-python            1.0.0            py37h99015e2_1    conda-forge
multidict                 4.7.5            py37h516909a_0    conda-forge
multipledispatch          0.6.0                      py_0    conda-forge
munch                     2.5.0                      py_0    conda-forge
mypy                      0.770                      py_0    conda-forge
mypy_extensions           0.4.3            py37hc8dfbb8_1    conda-forge
nb_conda_kernels          2.2.3                    py37_0    conda-forge
nbconvert                 5.6.1                    py37_0    conda-forge
nbdime                    1.1.0                    pypi_0    pypi
nbformat                  5.0.4                      py_0    conda-forge
nbval                     0.9.5                      py_0    conda-forge
ncurses                   6.1               hf484d3e_1002    conda-forge
netcdf4                   1.5.3           nompi_py37hec16513_103    conda-forge
networkx                  2.4                        py_1    conda-forge
nodeenv                   1.3.5                      py_0    conda-forge
nodejs                    13.10.1              hf5d1a2b_0    conda-forge
notebook                  6.0.3                    py37_0    conda-forge
nspr                      4.25                 he1b5a44_0    conda-forge
nss                       3.47                 he751ad9_0    conda-forge
numba                     0.48.0           py37hb3f55d8_0    conda-forge
numexpr                   2.7.1            py37h0da4684_1    conda-forge
numpy                     1.18.1           py37h8960a57_1    conda-forge
oauthlib                  3.0.1                      py_0    conda-forge
olefile                   0.46                       py_0    conda-forge
openjpeg                  2.3.1                h981e76c_3    conda-forge
openpyxl                  3.0.3                      py_0    conda-forge
openssl                   1.1.1f               h516909a_0    conda-forge
owslib                    0.19.2                     py_1    conda-forge
packaging                 20.1                       py_0    conda-forge
pamela                    1.0.0                      py_0    conda-forge
pandas                    1.0.3            py37h0da4684_0    conda-forge
pandoc                    2.9.2                         0    conda-forge
pandocfilters             1.4.2                      py_1    conda-forge
panel                     0.8.3                      py_0    pyviz
pango                     1.42.4               h7062337_4    conda-forge
param                     1.9.3                      py_0    pyviz
parquet-cpp               1.5.1                         2    conda-forge
parso                     0.6.2                      py_0    conda-forge
partd                     1.1.0                      py_0    conda-forge
pathspec                  0.7.0                      py_0    conda-forge
patsy                     0.5.1                      py_0    conda-forge
pbr                       5.4.2                      py_0    conda-forge
pcre                      8.44                 he1b5a44_0    conda-forge
pendulum                  2.1.0            py37hc8dfbb8_1    conda-forge
pep8-naming               0.4.1                    pypi_0    pypi
pexpect                   4.8.0            py37hc8dfbb8_1    conda-forge
pickleshare               0.7.5           py37hc8dfbb8_1001    conda-forge
pillow                    7.1.1            py37h718be6c_0    conda-forge
pip                       20.0.2                     py_2    conda-forge
pixman                    0.38.0            h516909a_1003    conda-forge
pluggy                    0.13.0                   py37_0    conda-forge
polygon-geohasher         0.0.1                    pypi_0    pypi
poppler                   0.67.0               h14e79db_8    conda-forge
poppler-data              0.4.9                         1    conda-forge
postgresql                12.2                 hf1211e9_0    conda-forge
pre-commit                2.2.0            py37hc8dfbb8_1    conda-forge
prefect                   0.10.0                     py_0    conda-forge
proj                      6.3.1                hc80f0dc_1    conda-forge
prometheus_client         0.7.1                      py_0    conda-forge
prompt-toolkit            3.0.5                      py_0    conda-forge
prompt_toolkit            3.0.5                         0    conda-forge
prospector                1.2.0                    pypi_0    pypi
psutil                    5.7.0            py37h8f50634_1    conda-forge
pthread-stubs             0.4               h14c3975_1001    conda-forge
ptyprocess                0.6.0                   py_1001    conda-forge
py                        1.8.1                      py_0    conda-forge
pyarrow                   0.16.0           py37hd02d5f2_2    conda-forge
pyasn1                    0.4.8                      py_0    conda-forge
pyasn1-modules            0.2.7                      py_0    conda-forge
pycodestyle               2.4.0                    pypi_0    pypi
pycparser                 2.20                       py_0    conda-forge
pyct                      0.4.6                      py_0    pyviz
pyct-core                 0.4.6                      py_0    pyviz
pycurl                    7.43.0.5         py37h16ce93b_0    conda-forge
pydocstyle                5.0.2                      py_0    conda-forge
pyepsg                    0.4.0                      py_0    conda-forge
pyflakes                  2.1.1                      py_0    conda-forge
pygments                  2.6.1                      py_0    conda-forge
pyjwt                     1.7.1                      py_0    conda-forge
pykdtree                  1.3.1           py37h03ebfcd_1003    conda-forge
pylama                    7.7.1                    pypi_0    pypi
pylint                    2.4.4                    pypi_0    pypi
pylint-celery             0.3                      pypi_0    pypi
pylint-django             2.0.12                   pypi_0    pypi
pylint-flask              0.6                      pypi_0    pypi
pylint-plugin-utils       0.6                      pypi_0    pypi
pyopenssl                 19.1.0                     py_1    conda-forge
pyparsing                 2.4.6                      py_0    conda-forge
pyproj                    2.6.0            py37heba2c01_0    conda-forge
pyqt                      5.12.3           py37hcca6a23_1    conda-forge
pyqt5-sip                 4.19.18                  pypi_0    pypi
pyqtwebengine             5.12.1                   pypi_0    pypi
pyroma                    2.6                      pypi_0    pypi
pyrsistent                0.16.0           py37h8f50634_0    conda-forge
pyshp                     2.1.0                      py_0    conda-forge
pysocks                   1.7.1            py37hc8dfbb8_1    conda-forge
pytables                  3.6.1            py37h9f153d1_1    conda-forge
pytest                    5.4.1            py37hc8dfbb8_0    conda-forge
pytest-cov                2.8.1                      py_0    conda-forge
python                    3.7.6           h8356626_5_cpython    conda-forge
python-blosc              1.9.0            py37h0da4684_0    conda-forge
python-box                4.2.2                      py_0    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python-docx               0.8.10                   pypi_0    pypi
python-dotenv             0.12.0                     py_0    conda-forge
python-editor             1.0.4                      py_0    conda-forge
python-geohash            0.8.5            py37he1b5a44_0    conda-forge
python-graphviz           0.13.2                     py_0    conda-forge
python-json-logger        0.1.11                     py_0    conda-forge
python-kubernetes         10.1.0           py37hc8dfbb8_1    conda-forge
python-slugify            4.0.0                      py_0    conda-forge
python-snappy             0.5.4            py37h7cfaab3_1    conda-forge
python_abi                3.7                     1_cp37m    conda-forge
pytz                      2019.3                     py_0    conda-forge
pytzdata                  2019.3                     py_0    conda-forge
pyviz_comms               0.7.4                      py_0    pyviz
pywavelets                1.1.1            py37hc1659b7_0    conda-forge
pyyaml                    5.1.2            py37h516909a_0    conda-forge
pyzmq                     19.0.0           py37hac76be4_1    conda-forge
qt                        5.12.5               hd8c4c69_1    conda-forge
qtconsole                 4.7.2              pyh9f0ad1d_0    conda-forge
qtpy                      1.9.0                      py_0    conda-forge
re2                       2020.04.01           he1b5a44_0    conda-forge
readline                  8.0                  hf8c457e_0    conda-forge
regex                     2020.4.4         py37h8f50634_0    conda-forge
requests                  2.23.0             pyh8c360ce_2    conda-forge
requests-oauthlib         1.2.0                      py_0    conda-forge
requirements-detector     0.6                      pypi_0    pypi
retrying                  1.3.3                      py_2    conda-forge
rsa                       4.0                        py_0    conda-forge
rtree                     0.9.4            py37h8526d28_1    conda-forge
ruamel.yaml               0.16.6           py37h8f50634_1    conda-forge
ruamel.yaml.clib          0.2.0            py37h8f50634_1    conda-forge
s3fs                      0.4.2                      py_0    conda-forge
s3transfer                0.3.3            py37hc8dfbb8_1    conda-forge
scikit-image              0.16.2           py37hb3f55d8_0    conda-forge
scikit-learn              0.22.2.post1     py37hcdab131_0    conda-forge
scipy                     1.4.1            py37ha3d9a3c_2    conda-forge
seaborn                   0.10.0                     py_1    conda-forge
send2trash                1.5.0                      py_0    conda-forge
setoptconf                0.2.0                    pypi_0    pypi
setuptools                46.1.3           py37hc8dfbb8_0    conda-forge
shapely                   1.7.0            py37hc88ce51_3    conda-forge
simpervisor               0.3                        py_1    conda-forge
simplejson                3.17.0           py37h516909a_0    conda-forge
simplekv                  0.14.0                     py_0    conda-forge
six                       1.14.0                     py_1    conda-forge
smartystreets-python-sdk  4.5.0                    pypi_0    pypi
smmap                     3.0.1                      py_0    conda-forge
snappy                    1.1.8                he1b5a44_1    conda-forge
snowballstemmer           2.0.0                      py_0    conda-forge
sortedcontainers          2.1.0                      py_0    conda-forge
soupsieve                 1.9.4            py37hc8dfbb8_1    conda-forge
spatialpandas             0.3.5                      py_0    pyviz
sphinx                    3.0.0                      py_0    conda-forge
sphinxcontrib-applehelp   1.0.2                      py_0    conda-forge
sphinxcontrib-devhelp     1.0.2                      py_0    conda-forge
sphinxcontrib-htmlhelp    1.0.3                      py_0    conda-forge
sphinxcontrib-jsmath      1.0.1                      py_0    conda-forge
sphinxcontrib-qthelp      1.0.3                      py_0    conda-forge
sphinxcontrib-serializinghtml 1.1.4                      py_0    conda-forge
sqlalchemy                1.3.15           py37h8f50634_1    conda-forge
sqlite                    3.30.1               hcee41ef_0    conda-forge
statsmodels               0.11.1           py37h8f50634_1    conda-forge
stevedore                 1.30.1                     py_0    conda-forge
storefact                 0.10.0                     py_0    conda-forge
tabulate                  0.8.7              pyh9f0ad1d_0    conda-forge
tbb                       2018.0.5             h2d50403_0    conda-forge
tblib                     1.6.0                      py_0    conda-forge
terminado                 0.8.3            py37hc8dfbb8_1    conda-forge
testpath                  0.4.4                      py_0    conda-forge
text-unidecode            1.2                        py_0    conda-forge
thrift-cpp                0.13.0               h62aa4f2_2    conda-forge
tiledb                    1.7.0                hcde45ca_2    conda-forge
tk                        8.6.10               hed695b0_0    conda-forge
toml                      0.10.0                     py_0    conda-forge
toolz                     0.10.0                     py_0    conda-forge
tornado                   6.0.4            py37h8f50634_1    conda-forge
tqdm                      4.45.0             pyh9f0ad1d_0    conda-forge
traitlets                 4.3.3            py37hc8dfbb8_1    conda-forge
typed-ast                 1.4.1            py37h516909a_0    conda-forge
typing_extensions         3.7.4.1          py37hc8dfbb8_3    conda-forge
tzcode                    2019a             h516909a_1002    conda-forge
unidecode                 1.1.1                      py_0    conda-forge
uritools                  3.0.0            py37hc8dfbb8_1    conda-forge
urllib3                   1.25.7           py37hc8dfbb8_1    conda-forge
urlquote                  1.1.4            py37hc8dfbb8_1    conda-forge
virtualenv                16.7.5                     py_0    conda-forge
watermark                 2.0.2                      py_0    conda-forge
wcwidth                   0.1.9              pyh9f0ad1d_0    conda-forge
webencodings              0.5.1                      py_1    conda-forge
websocket-client          0.57.0           py37hc8dfbb8_1    conda-forge
wheel                     0.34.2                     py_1    conda-forge
widgetsnbextension        3.5.1                    py37_0    conda-forge
wrapt                     1.11.2                   pypi_0    pypi
xarray                    0.15.1                     py_0    conda-forge
xerces-c                  3.2.2             h8412b87_1004    conda-forge
xlrd                      1.2.0                      py_0    conda-forge
xlsxwriter                1.2.8                      py_0    conda-forge
xlwt                      1.3.0                      py_1    conda-forge
xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
xorg-libice               1.0.10               h516909a_0    conda-forge
xorg-libsm                1.2.3             h84519dc_1000    conda-forge
xorg-libx11               1.6.9                h516909a_0    conda-forge
xorg-libxau               1.0.9                h14c3975_0    conda-forge
xorg-libxdmcp             1.1.3                h516909a_0    conda-forge
xorg-libxext              1.3.4                h516909a_0    conda-forge
xorg-libxpm               3.5.13               h516909a_0    conda-forge
xorg-libxrender           0.9.10            h516909a_1002    conda-forge
xorg-libxt                1.1.5             h516909a_1003    conda-forge
xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
xorg-xproto               7.0.31            h14c3975_1007    conda-forge
xz                        5.2.4             h516909a_1002    conda-forge
yaml                      0.1.7             h14c3975_1001    conda-forge
yapf                      0.29.0                     py_0    conda-forge
yarl                      1.3.0           py37h516909a_1000    conda-forge
zeromq                    4.3.2                he1b5a44_2    conda-forge
zict                      2.0.0                      py_0    conda-forge
zipp                      3.1.0                      py_0    conda-forge
zlib                      1.2.11            h516909a_1006    conda-forge
zstandard                 0.13.0           py37he1b5a44_0    conda-forge
zstd                      1.4.4                h3b9ef0a_2    conda-forge

Previous build
Name                      Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                      1_llvm    conda-forge
abseil-cpp                20200225.1           he1b5a44_2    conda-forge
adal                      1.2.2                      py_0    conda-forge
aiohttp                   3.6.2            py37h516909a_0    conda-forge
alabaster                 0.7.12                     py_0    conda-forge
alembic                   1.4.2              pyh9f0ad1d_0    conda-forge
appdirs                   1.4.3                      py_1    conda-forge
arrow-cpp                 0.16.0           py37hd8d096e_1    conda-forge
asn1crypto                1.3.0                    py37_0    conda-forge
astroid                   2.3.3                    pypi_0    pypi
async-timeout             3.0.1                   py_1000    conda-forge
async_generator           1.10                       py_0    conda-forge
attrs                     19.3.0                     py_0    conda-forge
aws-logging-handlers      2.0.3                    pypi_0    pypi
aws-sdk-cpp               1.7.164              h1f8afcc_0    conda-forge
babel                     2.8.0                      py_0    conda-forge
backcall                  0.1.0                      py_0    conda-forge
bandit                    1.6.2                    py37_0    conda-forge
beautifulsoup4            4.8.2            py37hc8dfbb8_1    conda-forge
black                     19.10b0                  py37_0    conda-forge
bleach                    3.1.4              pyh9f0ad1d_0    conda-forge
blinker                   1.4                        py_1    conda-forge
blosc                     1.17.1               he1b5a44_0    conda-forge
bokeh                     1.4.0                      py_0    bokeh
boost-cpp                 1.72.0               h8e57a91_0    conda-forge
boto                      2.49.0                     py_0    conda-forge
boto3                     1.12.29            pyh9f0ad1d_0    conda-forge
botocore                  1.15.29            pyh9f0ad1d_0    conda-forge
bottleneck                1.3.2            py37h03ebfcd_1    conda-forge
brotli                    1.0.7             he1b5a44_1001    conda-forge
bzip2                     1.0.8                h516909a_2    conda-forge
c-ares                    1.15.0            h516909a_1001    conda-forge
ca-certificates           2019.11.28           hecc5488_0    conda-forge
cachetools                3.1.1                      py_0    conda-forge
cairo                     1.16.0            hcf35c78_1003    conda-forge
cartopy                   0.17.0          py37hd759880_1006    conda-forge
certifi                   2019.11.28       py37hc8dfbb8_1    conda-forge
certipy                   0.1.3                      py_0    conda-forge
cffi                      1.14.0           py37hd463f26_0    conda-forge
cfgv                      3.1.0                      py_0    conda-forge
cfitsio                   3.470                hb60a0a2_2    conda-forge
cftime                    1.1.1.2          py37h03ebfcd_0    conda-forge
chardet                   3.0.4           py37hc8dfbb8_1006    conda-forge
click                     7.1.1              pyh8c360ce_0    conda-forge
click-plugins             1.1.1                      py_0    conda-forge
cligj                     0.5.0                      py_0    conda-forge
cloudpickle               1.2.2                      py_1    conda-forge
colorama                  0.4.3                    pypi_0    pypi
colorcet                  2.0.2                      py_0    pyviz/label/dev
configurable-http-proxy   4.2.0           node13_he01fd0c_2    conda-forge
coverage                  5.0.4            py37h8f50634_0    conda-forge
croniter                  0.3.30                     py_0    conda-forge
cryptography              2.8              py37hb09aad4_2    conda-forge
cssselect                 1.1.0                      py_0    conda-forge
curl                      7.68.0               hf8cf82a_0    conda-forge
cycler                    0.10.0                     py_2    conda-forge
cython                    0.29.16          py37h3340039_0    conda-forge
cytoolz                   0.10.1           py37h516909a_0    conda-forge
dask                      2.13.0                     py_0    conda-forge
dask-core                 2.13.0                     py_0    conda-forge
dask-kubernetes           0.10.1                     py_0    conda-forge
dask-labextension         2.0.1                    pypi_0    pypi
datashader                0.10.0                     py_0    pyviz/label/dev
datashape                 0.5.4                      py_1    conda-forge
datum                     0.1.5                    pypi_0    pypi
dbus                      1.13.6               he372182_0    conda-forge
decorator                 4.4.2                      py_0    conda-forge
defusedxml                0.6.0                      py_0    conda-forge
descartes                 1.1.0                      py_4    conda-forge
distributed               2.13.0           py37hc8dfbb8_0    conda-forge
docker-py                 4.2.0                    py37_0    conda-forge
docker-pycreds            0.4.0                      py_0    conda-forge
docutils                  0.15.2                   py37_0    conda-forge
dodgy                     0.2.1                    pypi_0    pypi
editdistance              0.5.3            py37he1b5a44_0    conda-forge
entrypoints               0.3             py37hc8dfbb8_1001    conda-forge
et_xmlfile                1.0.1                   py_1001    conda-forge
expat                     2.2.9                he1b5a44_2    conda-forge
fiona                     1.8.6            py37hf242f0b_3    conda-forge
flake8                    3.7.9            py37hc8dfbb8_1    conda-forge
fontconfig                2.13.1            h86ecdb6_1001    conda-forge
freetype                  2.10.1               he06d7ca_0    conda-forge
freexl                    1.0.5             h14c3975_1002    conda-forge
fribidi                   1.0.9                h516909a_0    conda-forge
fsspec                    0.6.3                      py_0    conda-forge
funcsigs                  1.0.2                      py_3    conda-forge
futures-compat            1.0                       py3_0    conda-forge
gdal                      2.4.1            py37h5f563d9_8    conda-forge
geographiclib             1.50                       py_0    conda-forge
geopandas                 0.6.3                      py_0    conda-forge
geopy                     1.21.0                     py_0    conda-forge
geos                      3.7.2                he1b5a44_2    conda-forge
geotiff                   1.4.3             hb6868eb_1001    conda-forge
geoviews                  1.7.0                      py_0    pyviz/label/dev
geoviews-core             1.7.0                      py_0    pyviz/label/dev
gettext                   0.19.8.1          hc5be6a0_1002    conda-forge
gflags                    2.2.2             he1b5a44_1002    conda-forge
giflib                    5.1.7                h516909a_1    conda-forge
gitdb                     4.0.2                      py_0    conda-forge
gitpython                 3.1.0                      py_0    conda-forge
glib                      2.58.3          py37he00f558_1003    conda-forge
glog                      0.4.0                he1b5a44_1    conda-forge
google-auth               1.11.2                     py_0    conda-forge
googlemaps                2.5.1                      py_0    conda-forge
graphite2                 1.3.13            he1b5a44_1001    conda-forge
graphviz                  2.42.3               h0511662_0    conda-forge
grpc-cpp                  1.27.3               h7397029_1    conda-forge
gst-plugins-base          1.14.5               h0935bb2_2    conda-forge
gstreamer                 1.14.5               h36ae1b5_2    conda-forge
h5py                      2.10.0          nompi_py37h513d04c_102    conda-forge
harfbuzz                  2.4.0                h9f30f68_3    conda-forge
haversine                 2.2.0                      py_0    conda-forge
hdf4                      4.2.13            hf30be14_1003    conda-forge
hdf5                      1.10.5          nompi_h3c11f04_1104    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
holoviews                 1.13.1                     py_0    pyviz/label/dev
html5lib                  1.0.1                      py_0    conda-forge
hvplot                    0.5.3a1                    py_0    pyviz/label/dev
hypothesis                5.8.0                      py_0    conda-forge
icu                       64.2                 he1b5a44_1    conda-forge
identify                  1.4.11                     py_0    conda-forge
idna                      2.9                        py_1    conda-forge
imageio                   2.8.0                      py_0    conda-forge
imagesize                 1.2.0                      py_0    conda-forge
importlib-metadata        1.5.2            py37hc8dfbb8_0    conda-forge
importlib_metadata        1.5.2                         0    conda-forge
importnb                  0.6.0                    py37_0    conda-forge
ipykernel                 5.2.0            py37h43977f1_0    conda-forge
ipython                   7.13.0           py37hc8dfbb8_2    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                7.5.1                      py_0    conda-forge
isort                     4.3.21           py37hc8dfbb8_1    conda-forge
jdcal                     1.4.1                      py_0    conda-forge
jedi                      0.16.0           py37hc8dfbb8_1    conda-forge
jinja2                    2.11.1                     py_0    conda-forge
jmespath                  0.9.5                      py_0    conda-forge
joblib                    0.14.1                     py_0    conda-forge
jpeg                      9c                h14c3975_1001    conda-forge
json-c                    0.13.1            h14c3975_1001    conda-forge
json5                     0.9.0                      py_0    conda-forge
jsonschema                3.2.0            py37hc8dfbb8_1    conda-forge
jupyter                   1.0.0                      py_2    conda-forge
jupyter-archive           0.5.5                      py_0    conda-forge
jupyter-server-proxy      1.3.0                      py_0    conda-forge
jupyter_bokeh             1.1.1                      py_0    bokeh
jupyter_client            5.3.4                    py37_1    conda-forge
jupyter_console           6.1.0                      py_1    conda-forge
jupyter_core              4.6.3            py37hc8dfbb8_1    conda-forge
jupyter_telemetry         0.0.5                      py_0    conda-forge
jupyterhub                1.1.0                    py37_2    conda-forge
jupyterhub-base           1.1.0                    py37_2    conda-forge
jupyterlab                1.2.7                      py_0    conda-forge
jupyterlab-git            0.10.0                   pypi_0    pypi
jupyterlab_server         1.0.7                      py_0    conda-forge
kartothek                 3.8.1                      py_0    conda-forge
kealib                    1.4.12               hec59c27_0    conda-forge
kiwisolver                1.1.0            py37h99015e2_1    conda-forge
krb5                      1.16.4               h2fd8d38_0    conda-forge
kubernetes                1.16.3               ha4a5029_0    conda-forge
kubernetes_asyncio        11.1.0             pyh8c360ce_0    conda-forge
lazy-object-proxy         1.4.3                    pypi_0    pypi
ld_impl_linux-64          2.34                 h53a641e_0    conda-forge
libblas                   3.8.0               16_openblas    conda-forge
libcblas                  3.8.0               16_openblas    conda-forge
libclang                  9.0.1           default_hde54327_0    conda-forge
libcurl                   7.68.0               hda55be3_0    conda-forge
libdap4                   3.20.4               hd3bb157_0    conda-forge
libedit                   3.1.20170329      hf8c457e_1001    conda-forge
libevent                  2.1.10               h72c5cf5_0    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc-ng                 9.2.0                h24d8f2e_2    conda-forge
libgdal                   2.4.1                heae24aa_8    conda-forge
libgfortran-ng            7.3.0                hdf63c60_5    conda-forge
libiconv                  1.15              h516909a_1006    conda-forge
libkml                    1.3.0             hb574062_1011    conda-forge
liblapack                 3.8.0               16_openblas    conda-forge
libllvm8                  8.0.1                hc9558a2_0    conda-forge
libllvm9                  9.0.1                hc9558a2_0    conda-forge
libnetcdf                 4.6.2             h303dfb8_1003    conda-forge
libopenblas               0.3.9                h5ec1e0e_0    conda-forge
libpng                    1.6.37               hed695b0_1    conda-forge
libpq                     11.5                 hd9ab2ff_2    conda-forge
libprotobuf               3.11.4               h8b12597_0    conda-forge
libsodium                 1.0.17               h516909a_0    conda-forge
libspatialindex           1.9.3                he1b5a44_3    conda-forge
libspatialite             4.3.0a            h79dc798_1030    conda-forge
libssh2                   1.8.2                h22169c7_2    conda-forge
libstdcxx-ng              9.2.0                hdf63c60_2    conda-forge
libtiff                   4.1.0                hc7e4089_6    conda-forge
libtool                   2.4.6             h14c3975_1002    conda-forge
libuuid                   2.32.1            h14c3975_1000    conda-forge
libuv                     1.34.0               h516909a_0    conda-forge
libwebp-base              1.1.0                h516909a_3    conda-forge
libxcb                    1.13              h14c3975_1002    conda-forge
libxkbcommon              0.10.0               he1b5a44_0    conda-forge
libxml2                   2.9.10               hee79883_0    conda-forge
libxslt                   1.1.33               h31b3aaa_0    conda-forge
llvm-openmp               9.0.1                hc9558a2_2    conda-forge
llvmlite                  0.31.0           py37h5202443_1    conda-forge
locket                    0.2.0                      py_2    conda-forge
loguru                    0.4.1                    py37_0    conda-forge
lxml                      4.5.0            py37he3881c9_1    conda-forge
lz4-c                     1.8.3             he1b5a44_1001    conda-forge
lzo                       2.10              h14c3975_1000    conda-forge
mako                      1.1.0                      py_0    conda-forge
markdown                  3.2.1                      py_0    conda-forge
markupsafe                1.1.1            py37h8f50634_1    conda-forge
marshmallow               3.5.0                      py_0    conda-forge
marshmallow-oneofschema   2.0.1                      py_0    conda-forge
matplotlib                3.2.1                         0    conda-forge
matplotlib-base           3.2.1            py37h30547a4_0    conda-forge
mccabe                    0.6.1                      py_1    conda-forge
milksnake                 0.1.5                      py_0    conda-forge
mistune                   0.8.4           py37h516909a_1000    conda-forge
mock                      3.0.5            py37hc8dfbb8_1    conda-forge
more-itertools            8.2.0                      py_0    conda-forge
msgpack-python            1.0.0            py37h99015e2_1    conda-forge
multidict                 4.7.5            py37h516909a_0    conda-forge
multipledispatch          0.6.0                      py_0    conda-forge
munch                     2.5.0                      py_0    conda-forge
mypy                      0.770                      py_0    conda-forge
mypy_extensions           0.4.3            py37hc8dfbb8_1    conda-forge
nb_conda_kernels          2.2.3                    py37_0    conda-forge
nbconvert                 5.6.1                    py37_0    conda-forge
nbdime                    1.1.0                    pypi_0    pypi
nbformat                  5.0.4                      py_0    conda-forge
nbval                     0.9.5                      py_0    conda-forge
ncurses                   6.1               hf484d3e_1002    conda-forge
netcdf4                   1.5.1.2          py37h73a1b54_1    conda-forge
networkx                  2.4                        py_1    conda-forge
nodeenv                   1.3.5                      py_0    conda-forge
nodejs                    13.10.1              hf5d1a2b_0    conda-forge
notebook                  6.0.3                    py37_0    conda-forge
nspr                      4.25                 he1b5a44_0    conda-forge
nss                       3.47                 he751ad9_0    conda-forge
numba                     0.48.0           py37hb3f55d8_0    conda-forge
numexpr                   2.7.1            py37h0da4684_1    conda-forge
numpy                     1.18.1           py37h8960a57_1    conda-forge
oauthlib                  3.0.1                      py_0    conda-forge
olefile                   0.46                       py_0    conda-forge
openjpeg                  2.3.1                h981e76c_3    conda-forge
openpyxl                  3.0.3                      py_0    conda-forge
openssl                   1.1.1d               h516909a_0    conda-forge
owslib                    0.19.2                     py_0    conda-forge
packaging                 20.1                       py_0    conda-forge
pamela                    1.0.0                      py_0    conda-forge
pandas                    1.0.3            py37h0da4684_0    conda-forge
pandoc                    2.9.2                         0    conda-forge
pandocfilters             1.4.2                      py_1    conda-forge
panel                     0.8.3                      py_0    pyviz/label/dev
pango                     1.42.4               h7062337_3    conda-forge
param                     1.10.0a2                   py_0    pyviz/label/dev
parquet-cpp               1.5.1                         2    conda-forge
parso                     0.6.2                      py_0    conda-forge
partd                     1.1.0                      py_0    conda-forge
pathspec                  0.7.0                      py_0    conda-forge
patsy                     0.5.1                      py_0    conda-forge
pbr                       5.4.2                      py_0    conda-forge
pcre                      8.44                 he1b5a44_0    conda-forge
pendulum                  2.1.0            py37hc8dfbb8_1    conda-forge
pep8-naming               0.4.1                    pypi_0    pypi
pexpect                   4.8.0            py37hc8dfbb8_1    conda-forge
pickleshare               0.7.5           py37hc8dfbb8_1001    conda-forge
pillow                    7.0.0            py37h718be6c_1    conda-forge
pip                       20.0.2                     py_2    conda-forge
pixman                    0.38.0            h516909a_1003    conda-forge
pluggy                    0.13.0                   py37_0    conda-forge
polygon-geohasher         0.0.1                    pypi_0    pypi
poppler                   0.67.0               h14e79db_8    conda-forge
poppler-data              0.4.9                         1    conda-forge
postgresql                11.5                 hc63931a_2    conda-forge
pre-commit                2.2.0            py37hc8dfbb8_1    conda-forge
prefect                   0.9.8                      py_0    conda-forge
proj4                     5.2.0             he1b5a44_1006    conda-forge
prometheus_client         0.7.1                      py_0    conda-forge
prompt-toolkit            3.0.4                      py_0    conda-forge
prompt_toolkit            3.0.4                         0    conda-forge
prospector                1.2.0                    pypi_0    pypi
psutil                    5.7.0            py37h8f50634_1    conda-forge
pthread-stubs             0.4               h14c3975_1001    conda-forge
ptyprocess                0.6.0                   py_1001    conda-forge
py                        1.8.1                      py_0    conda-forge
pyarrow                   0.16.0           py37hd02d5f2_2    conda-forge
pyasn1                    0.4.8                      py_0    conda-forge
pyasn1-modules            0.2.7                      py_0    conda-forge
pycodestyle               2.4.0                    pypi_0    pypi
pycparser                 2.20                       py_0    conda-forge
pyct                      0.4.6                      py_0    pyviz/label/dev
pyct-core                 0.4.6                      py_0    pyviz/label/dev
pycurl                    7.43.0.5         py37h16ce93b_0    conda-forge
pydocstyle                5.0.2                      py_0    conda-forge
pyepsg                    0.4.0                      py_0    conda-forge
pyflakes                  2.1.1                      py_0    conda-forge
pygments                  2.6.1                      py_0    conda-forge
pyjwt                     1.7.1                      py_0    conda-forge
pykdtree                  1.3.1           py37h03ebfcd_1003    conda-forge
pylama                    7.7.1                    pypi_0    pypi
pylint                    2.4.4                    pypi_0    pypi
pylint-celery             0.3                      pypi_0    pypi
pylint-django             2.0.12                   pypi_0    pypi
pylint-flask              0.6                      pypi_0    pypi
pylint-plugin-utils       0.6                      pypi_0    pypi
pyopenssl                 19.1.0                     py_1    conda-forge
pyparsing                 2.4.6                      py_0    conda-forge
pyproj                    1.9.6           py37h516909a_1002    conda-forge
pyqt                      5.12.3           py37hcca6a23_1    conda-forge
pyqt5-sip                 4.19.18                  pypi_0    pypi
pyqtwebengine             5.12.1                   pypi_0    pypi
pyroma                    2.6                      pypi_0    pypi
pyrsistent                0.16.0           py37h8f50634_0    conda-forge
pyshp                     2.1.0                      py_0    conda-forge
pysocks                   1.7.1            py37hc8dfbb8_1    conda-forge
pytables                  3.6.1            py37h9f153d1_1    conda-forge
pytest                    5.4.1            py37hc8dfbb8_0    conda-forge
pytest-cov                2.8.1                      py_0    conda-forge
python                    3.7.6           h357f687_4_cpython    conda-forge
python-blosc              1.8.3            py37hb3f55d8_0    conda-forge
python-box                4.2.2                      py_0    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python-docx               0.8.10                   pypi_0    pypi
python-dotenv             0.12.0                     py_0    conda-forge
python-editor             1.0.4                      py_0    conda-forge
python-geohash            0.8.5            py37he1b5a44_0    conda-forge
python-graphviz           0.13.2                     py_0    conda-forge
python-json-logger        0.1.11                     py_0    conda-forge
python-kubernetes         10.1.0           py37hc8dfbb8_1    conda-forge
python-slugify            4.0.0                      py_0    conda-forge
python-snappy             0.5.4            py37h7cfaab3_1    conda-forge
python_abi                3.7                     1_cp37m    conda-forge
pytz                      2019.3                     py_0    conda-forge
pytzdata                  2019.3                     py_0    conda-forge
pyviz_comms               0.7.4                      py_0    pyviz/label/dev
pywavelets                1.1.1            py37hc1659b7_0    conda-forge
pyyaml                    5.1.2            py37h516909a_0    conda-forge
pyzmq                     19.0.0           py37hac76be4_1    conda-forge
qt                        5.12.5               hd8c4c69_1    conda-forge
qtconsole                 4.7.2              pyh9f0ad1d_0    conda-forge
qtpy                      1.9.0                      py_0    conda-forge
re2                       2020.03.03           he1b5a44_0    conda-forge
readline                  8.0                  hf8c457e_0    conda-forge
regex                     2020.2.20        py37h8f50634_1    conda-forge
requests                  2.23.0             pyh8c360ce_2    conda-forge
requests-oauthlib         1.2.0                      py_0    conda-forge
requirements-detector     0.6                      pypi_0    pypi
retrying                  1.3.3                      py_2    conda-forge
rsa                       4.0                        py_0    conda-forge
rtree                     0.9.4            py37h8526d28_1    conda-forge
ruamel.yaml               0.16.6           py37h8f50634_1    conda-forge
ruamel.yaml.clib          0.2.0            py37h8f50634_1    conda-forge
s3fs                      0.4.0                      py_0    conda-forge
s3transfer                0.3.3                    py37_0    conda-forge
scikit-image              0.16.2           py37hb3f55d8_0    conda-forge
scikit-learn              0.22.2.post1     py37hcdab131_0    conda-forge
scipy                     1.4.1            py37h921218d_0    conda-forge
seaborn                   0.10.0                     py_1    conda-forge
send2trash                1.5.0                      py_0    conda-forge
setoptconf                0.2.0                    pypi_0    pypi
setuptools                46.1.3           py37hc8dfbb8_0    conda-forge
shapely                   1.6.4           py37hec07ddf_1006    conda-forge
simpervisor               0.3                        py_1    conda-forge
simplejson                3.17.0           py37h516909a_0    conda-forge
simplekv                  0.14.0                     py_0    conda-forge
six                       1.14.0                     py_1    conda-forge
smartystreets-python-sdk  4.4.1                    pypi_0    pypi
smmap                     3.0.1                      py_0    conda-forge
snappy                    1.1.8                he1b5a44_1    conda-forge
snowballstemmer           2.0.0                      py_0    conda-forge
sortedcontainers          2.1.0                      py_0    conda-forge
soupsieve                 1.9.4            py37hc8dfbb8_1    conda-forge
spatialpandas             0.3.5                      py_0    pyviz/label/dev
sphinx                    2.4.4                      py_0    conda-forge
sphinxcontrib-applehelp   1.0.2                      py_0    conda-forge
sphinxcontrib-devhelp     1.0.2                      py_0    conda-forge
sphinxcontrib-htmlhelp    1.0.3                      py_0    conda-forge
sphinxcontrib-jsmath      1.0.1                      py_0    conda-forge
sphinxcontrib-qthelp      1.0.3                      py_0    conda-forge
sphinxcontrib-serializinghtml 1.1.4                      py_0    conda-forge
sqlalchemy                1.3.15           py37h8f50634_1    conda-forge
sqlite                    3.30.1               hcee41ef_0    conda-forge
statsmodels               0.11.1           py37h8f50634_1    conda-forge
stevedore                 1.30.1                     py_0    conda-forge
storefact                 0.10.0                     py_0    conda-forge
tabulate                  0.8.7              pyh9f0ad1d_0    conda-forge
tblib                     1.6.0                      py_0    conda-forge
terminado                 0.8.3            py37hc8dfbb8_1    conda-forge
testpath                  0.4.4                      py_0    conda-forge
text-unidecode            1.2                        py_0    conda-forge
thrift-cpp                0.13.0               h62aa4f2_2    conda-forge
tk                        8.6.10               hed695b0_0    conda-forge
toml                      0.10.0                     py_0    conda-forge
toolz                     0.10.0                     py_0    conda-forge
tornado                   6.0.4            py37h8f50634_1    conda-forge
tqdm                      4.43.0                     py_0    conda-forge
traitlets                 4.3.3            py37hc8dfbb8_1    conda-forge
typed-ast                 1.4.1            py37h516909a_0    conda-forge
typing_extensions         3.7.4.1          py37hc8dfbb8_1    conda-forge
tzcode                    2019a             h516909a_1002    conda-forge
unidecode                 1.1.1                      py_0    conda-forge
uritools                  3.0.0            py37hc8dfbb8_1    conda-forge
urllib3                   1.25.7           py37hc8dfbb8_1    conda-forge
urlquote                  1.1.4            py37hc8dfbb8_1    conda-forge
virtualenv                16.7.5                     py_0    conda-forge
watermark                 2.0.2                      py_0    conda-forge
wcwidth                   0.1.9              pyh9f0ad1d_0    conda-forge
webencodings              0.5.1                      py_1    conda-forge
websocket-client          0.57.0           py37hc8dfbb8_1    conda-forge
wheel                     0.34.2                     py_1    conda-forge
widgetsnbextension        3.5.1                    py37_0    conda-forge
wrapt                     1.11.2                   pypi_0    pypi
xarray                    0.15.1                     py_0    conda-forge
xerces-c                  3.2.2             h8412b87_1004    conda-forge
xlrd                      1.2.0                      py_0    conda-forge
xlsxwriter                1.2.8                      py_0    conda-forge
xlwt                      1.3.0                      py_1    conda-forge
xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
xorg-libice               1.0.10               h516909a_0    conda-forge
xorg-libsm                1.2.3             h84519dc_1000    conda-forge
xorg-libx11               1.6.9                h516909a_0    conda-forge
xorg-libxau               1.0.9                h14c3975_0    conda-forge
xorg-libxdmcp             1.1.3                h516909a_0    conda-forge
xorg-libxext              1.3.4                h516909a_0    conda-forge
xorg-libxpm               3.5.13               h516909a_0    conda-forge
xorg-libxrender           0.9.10            h516909a_1002    conda-forge
xorg-libxt                1.1.5             h516909a_1003    conda-forge
xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
xorg-xproto               7.0.31            h14c3975_1007    conda-forge
xz                        5.2.4             h516909a_1002    conda-forge
yaml                      0.1.7             h14c3975_1001    conda-forge
yapf                      0.29.0                     py_0    conda-forge
yarl                      1.3.0           py37h516909a_1000    conda-forge
zeromq                    4.3.2                he1b5a44_2    conda-forge
zict                      2.0.0                      py_0    conda-forge
zipp                      3.1.0                      py_0    conda-forge
zlib                      1.2.11            h516909a_1006    conda-forge
zstandard                 0.13.0           py37he1b5a44_0    conda-forge
zstd                      1.4.4                h3b9ef0a_2    conda-forge

Description of expected behavior and the observed behavior

pack_partitions_to_parquet fails consistently in the current build with the error below. The error does not occur in the previous build.

ERROR:root:Filesystem not yet consistent
  Expected len: 2
  Found len: 2
  Missing: ['s3://spatial-temp/acabf4b3-c12a-4293-8bab-65ece4502d94-0001.parquet/part.0.parquet', 's3://spatial-temp/acabf4b3-c12a-4293-8bab-65ece4502d94-0001.parquet/part.1.parquet']
  Extras: ["['s3', 's3a']://spatial-temp/acabf4b3-c12a-4293-8bab-65ece4502d94-0001.parquet/part.0.parquet", "['s3', 's3a']://spatial-temp/acabf4b3-c12a-4293-8bab-65ece4502d94-0001.parquet/part.1.parquet"]

Based on the error, it appears the s3 protocol is now reported as a list ['s3', 's3a'] rather than a str 's3'. When that list is prepended to the paths for comparison, the resulting strings no longer match the expected paths, so the comparison fails.
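
A minimal sketch of the mismatch described above, and of one possible way to guard against it. The helper names normalize_protocol and qualify are hypothetical and are not part of spatialpandas or fsspec; the sketch only assumes that the protocol may arrive either as a str or as a list/tuple of aliases, as the error output suggests.

```python
def normalize_protocol(protocol):
    """Return one protocol string, whether given a str or a list/tuple of aliases."""
    if isinstance(protocol, (list, tuple)):
        return protocol[0]
    return protocol


def qualify(protocol, path):
    """Prefix a bare path with a single protocol, e.g. 's3://bucket/key'."""
    return f"{normalize_protocol(protocol)}://{path}"


# Hypothetical path used only for illustration.
path = "spatial-temp/example.parquet/part.0.parquet"

# Formatting the list directly reproduces the malformed prefix seen in the log.
malformed = f"{['s3', 's3a']}://{path}"

# Normalizing first yields the same string for both protocol forms.
from_list = qualify(["s3", "s3a"], path)
from_str = qualify("s3", path)
```

Taking the first alias is a design choice for this sketch; any rule works as long as both sides of the comparison use the same one.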

Complete, minimal, self-contained example code that reproduces the issue

ddf.pack_partitions_to_parquet("s3://path")

Compatibility between spatialpandas.dask.DaskGeoDataFrame and HoloViews, GeoViews and hvplot?

Hi,

Apparently, there is no support for spatialpandas.dask.DaskGeoDataFrames in HoloViz plotting libraries other than datashader.

A simple example:

import holoviews as hv
hv.extension("bokeh")

# I have a single-row Parquet file I created using spatialpandas.
# It contains only one column, called geometry, which holds a MultiLine object.
# I can easily read it in as a spatialpandas.dask.DaskGeoDataFrame:
from spatialpandas.io import read_parquet_dask

df = read_parquet_dask("sample_spatialpandas_row.parquet")

# I now want to plot the multiline. For instance, with hvplot:
import hvplot.dask

df.hvplot()

# This raises an exception: Supplied data type DaskGeoDataFrame not understood

# And the same goes for instance for GeoViews:
import geoviews as gv

gv.Path(df)

# It does not recognize it as a geodataframe-like element, raising:
# ValueError: kdims: list length must be between 2 and 2 (inclusive)

The above is just an example, obviously. You can find the sample parquet file I used here (needs to be unzipped).

Having this working would allow datashading dynamically and directly from HoloViews, GeoViews, or hvPlot using holoviews.operation.datashader on larger-than-memory datasets backed by spatialpandas.dask.DaskGeoDataFrame, which would be amazing.

What do you think?

Geometry test failures in CI

We are seeing some geometry intersection test failures in CI. Originally we thought they were related to pandas 2.1 (issue #124, PR #125), but they are not. Our code behaves the same as ever, but we compare some geometry results with those produced by geopandas, and we have started getting slightly different results for intersection tests with a rectangle of zero area. For such tests, the geopandas code either calls shapely or pygeos or handles the test itself.

I have a fix for this in progress.

Being able to instantiate an empty GeoDataFrame

Is your feature request related to a problem? Please describe.

I have a workflow in which I process a geopandas GeoDataFrame in order to draw a map on a dashboard. At some point, I want to use datashader's reduction capabilities, so I convert my df into a spatialpandas GeoDataFrame. My dashboard allows setting filters, which can leave my GeoDataFrame empty, and in that case the conversion into a spatialpandas GeoDataFrame fails:

ValueError: A spatialpandas GeoDataFrame must contain at least one spatialpandas GeometryArray column

This led me to realize that spatialpandas does not allow the instantiation of an empty DataFrame, or at least not in the way pandas and geopandas allow it:

import pandas as pd
import geopandas as gpd
import spatialpandas as spd

pd.DataFrame() # works
gpd.GeoDataFrame() # works
spd.GeoDataFrame() # fails

Describe the solution you'd like

I would like to be able to create an empty spatialpandas GeoDataFrame.

Describe alternatives you've considered

An alternative I have is to first convert into a spatialpandas GeoDataFrame, and then filter this new DataFrame. When I do it in this order, query() is able to return an empty spatialpandas GeoDataFrame if there are indeed no matching results.
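
The behavior that this alternative relies on can be illustrated with plain pandas: filtering an existing frame down to zero rows keeps its columns and dtypes. This is only a sketch with made-up column names and data; it does not use spatialpandas, and it assumes (without confirming) that a spatialpandas GeoDataFrame follows the same pandas filtering semantics, which would be consistent with query() returning an empty frame as described above.

```python
import pandas as pd

# Made-up data standing in for an already converted frame.
df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})

# A filter that matches nothing still returns a frame with the same schema.
empty = df.query("value > 10")

n_rows = len(empty)
columns = list(empty.columns)
same_dtype = empty["value"].dtype == df["value"].dtype
```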

read_parquet_dask load_divisions with bounds

When using read_parquet_dask to read files written with the pack_partitions_to_parquet method, passing both bounds and load_divisions=True raises a KeyError. Reading the same files with either option alone works.

Example:

from spatialpandas.io import read_parquet_dask
sdf = read_parquet_dask(path, bounds=bounds, load_divisions=True)

Error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/home/conda/store/816032bda5816104d47e08949e4ec085fd6b9a98be07c2a55cf29c652743653e-datum/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3079             try:
-> 3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: -1

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-22-56d4a7063dd6> in <module>
----> 1 sdf_div = read_parquet_dask(pre_path, bounds=bounds, load_divisions=True)
      2 sdf_div

/home/conda/store/816032bda5816104d47e08949e4ec085fd6b9a98be07c2a55cf29c652743653e-datum/lib/python3.7/site-packages/spatialpandas/io/parquet.py in read_parquet_dask(path, columns, filesystem, load_divisions, geometry, bounds, categories)
    231         path, columns, filesystem,
    232         load_divisions=load_divisions, geometry=geometry, bounds=bounds,
--> 233         categories=categories
    234     )
    235 

/home/conda/store/816032bda5816104d47e08949e4ec085fd6b9a98be07c2a55cf29c652743653e-datum/lib/python3.7/site-packages/spatialpandas/io/parquet.py in _perform_read_parquet_dask(paths, columns, filesystem, load_divisions, geometry, bounds, categories)
    371 
    372     if load_divisions:
--> 373         divisions = div_mins + [div_maxes[-1]]
    374         if divisions != sorted(divisions):
    375             raise ValueError(

/home/conda/store/816032bda5816104d47e08949e4ec085fd6b9a98be07c2a55cf29c652743653e-datum/lib/python3.7/site-packages/pandas/core/series.py in __getitem__(self, key)
    822 
    823         elif key_is_scalar:
--> 824             return self._get_value(key)
    825 
    826         if is_hashable(key):

/home/conda/store/816032bda5816104d47e08949e4ec085fd6b9a98be07c2a55cf29c652743653e-datum/lib/python3.7/site-packages/pandas/core/series.py in _get_value(self, label, takeable)
    930 
    931         # Similar to Index.get_value, but we do not fall back to positional
--> 932         loc = self.index.get_loc(label)
    933         return self.index._get_values_for_loc(self, loc, label)
    934 

/home/conda/store/816032bda5816104d47e08949e4ec085fd6b9a98be07c2a55cf29c652743653e-datum/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:
-> 3082                 raise KeyError(key) from err
   3083 
   3084         if tolerance is not None:

KeyError: -1
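
The traceback ends in pandas Series.__getitem__ with an Int64 index, so the expression div_maxes[-1] appears to be treated as a lookup of the label -1 rather than "the last element". The sketch below reproduces that pandas behavior in isolation. The data and the bounds-style filter are made up, and whether div_maxes is built this way inside spatialpandas is an assumption; only the label-versus-position distinction is demonstrated.

```python
import pandas as pd

# Hypothetical per-partition maxima, indexed by partition number.
div_maxes = pd.Series([10, 20, 30, 40], index=[0, 1, 2, 3])

# Simulate a filter that keeps only some partitions.
filtered = div_maxes.loc[[1, 2]]

# With an integer index, [-1] looks for the label -1, which does not exist.
try:
    filtered[-1]
    raised = False
except KeyError:
    raised = True

# Positional access returns the last element regardless of the labels.
last_by_position = filtered.iloc[-1]
```

Positional access with .iloc (or converting the values to a list first) is one way to avoid depending on which index labels survive the filter.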

spatialpandas failure on Dask Remote Cluster

The "spatialpandas using dask" section of the overview notebook works on a LocalCluster, but Dask errors out when using a remote cluster:

My cluster setup:

cluster = gateway.new_cluster(profile=size)
print("Scaling workers")
cluster.scale(workers)
client = cluster.get_client()

Notebook code:

...

## Large spatialpandas DaskGeoDataFrame with 16 partitions
cities_large_ddf = dd.from_pandas(cities_large_df, npartitions=16).persist()

# Precompute the partition-level spatial index
cities_large_ddf.partition_sindex

len(sjoin(cities_large_ddf, world_df))

Error trace:

---------------------------------------------------------------------------
KilledWorker                              Traceback (most recent call last)
<ipython-input-91-44679f1f7790> in <module>
      3 
      4 # Precompute the partition-level spatial index
----> 5 cities_large_ddf.partition_sindex

~/.local/lib/python3.8/site-packages/spatialpandas/dask.py in partition_sindex(self)
    145                 geometry._partition_bounds = self._partition_bounds[geometry_name]
    146 
--> 147             self._partition_sindex[geometry.name] = geometry.partition_sindex
    148             self._partition_bounds[geometry_name] = geometry.partition_bounds
    149         return self._partition_sindex[geometry_name]

~/.local/lib/python3.8/site-packages/spatialpandas/dask.py in partition_sindex(self)
     66     def partition_sindex(self):
     67         if self._partition_sindex is None:
---> 68             self._partition_sindex = HilbertRtree(self.partition_bounds.values)
     69         return self._partition_sindex
     70 

~/.local/lib/python3.8/site-packages/spatialpandas/dask.py in partition_bounds(self)
     47     def partition_bounds(self):
     48         if self._partition_bounds is None:
---> 49             self._partition_bounds = self.map_partitions(
     50                 lambda s: pd.DataFrame(
     51                     [s.total_bounds], columns=['x0', 'y0', 'x1', 'y1']

/opt/conda/lib/python3.8/site-packages/dask/base.py in compute(self, **kwargs)
    277         dask.base.compute
    278         """
--> 279         (result,) = compute(self, traverse=False, **kwargs)
    280         return result
    281 

/opt/conda/lib/python3.8/site-packages/dask/base.py in compute(*args, **kwargs)
    559         postcomputes.append(x.__dask_postcompute__())
    560 
--> 561     results = schedule(dsk, keys, **kwargs)
    562     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    563 

/opt/conda/lib/python3.8/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
   2682                     should_rejoin = False
   2683             try:
-> 2684                 results = self.gather(packed, asynchronous=asynchronous, direct=direct)
   2685             finally:
   2686                 for f in futures.values():

/opt/conda/lib/python3.8/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous)
   1991             else:
   1992                 local_worker = None
-> 1993             return self.sync(
   1994                 self._gather,
   1995                 futures,

/opt/conda/lib/python3.8/site-packages/distributed/client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    837             return future
    838         else:
--> 839             return sync(
    840                 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    841             )

/opt/conda/lib/python3.8/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
    338     if error[0]:
    339         typ, exc, tb = error[0]
--> 340         raise exc.with_traceback(tb)
    341     else:
    342         return result[0]

/opt/conda/lib/python3.8/site-packages/distributed/utils.py in f()
    322             if callback_timeout is not None:
    323                 future = asyncio.wait_for(future, callback_timeout)
--> 324             result[0] = yield future
    325         except Exception as exc:
    326             error[0] = sys.exc_info()

/opt/conda/lib/python3.8/site-packages/tornado/gen.py in run(self)
    760 
    761                     try:
--> 762                         value = future.result()
    763                     except Exception:
    764                         exc_info = sys.exc_info()

/opt/conda/lib/python3.8/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
   1856                             exc = CancelledError(key)
   1857                         else:
-> 1858                             raise exception.with_traceback(traceback)
   1859                         raise exc
   1860                     if errors == "skip":

KilledWorker: ("('from_pandas-9ecbde8f79972641c9091facd27b244c', 30)", <Worker 'tls://10.15.127.8:35903', name: dask-worker-92908e5f7ff74365bb94ce8640647029-jghbz, memory: 0, processing: 2>)

The Dask log doesn't tell me much except that the task fails and the Dask workers get killed:
Task ('from_pandas-3cf29940396a335127f696bfff173515', 25) marked as failed because 6 workers died while trying to run it
