pysal / access

Classical and novel measures of spatial accessibility to services

Home Page: https://pysal.org/access/

License: BSD 3-Clause "New" or "Revised" License

Languages: Jupyter Notebook 88.44%, Python 11.55%, Awk 0.01%
Topics: access, spatial-statistics

access's Introduction

Python Spatial Analysis Library


PySAL, the Python spatial analysis library, is an open-source, cross-platform Python library for geospatial data science, with an emphasis on geospatial vector data. It supports the development of high-level applications for spatial analysis, such as

  • detection of spatial clusters, hot-spots, and outliers
  • construction of graphs from spatial data
  • spatial regression and statistical modeling on geographically embedded networks
  • spatial econometrics
  • exploratory spatio-temporal data analysis

PySAL Components

PySAL is a family of packages for spatial data science and is divided into four major components:

Lib

libpysal solves a wide variety of computational geometry problems, including graph construction from polygonal lattices, lines, and points; construction and interactive editing of spatial weights matrices and graphs; computation of alpha shapes, spatial indices, and spatial-topological relationships; and reading and writing of sparse graph data, as well as pure-Python readers of spatial vector data. Unlike other PySAL modules, these functions are exposed together as a single package.

  • libpysal : libpysal provides foundational algorithms and data structures that support the rest of the library. This currently includes the following modules: input/output (io), which provides readers and writers for common geospatial file formats; weights (weights), which provides the main class to store spatial weights matrices, as well as several utilities to manipulate and operate on them; computational geometry (cg), with several algorithms, such as Voronoi tessellations or alpha shapes that efficiently process geometric shapes; and an additional module with example data sets (examples).
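
As a rough illustration of the kind of task libpysal handles, the sketch below builds a Queen-contiguity spatial weights object from one of the bundled example shapefiles. The dataset choice and printed attributes are illustrative, not prescriptive; consult the libpysal documentation for current APIs.

# Sketch: construct a spatial weights matrix with libpysal, using the bundled
# "columbus" example shapefile.
import libpysal

shp_path = libpysal.examples.get_path("columbus.shp")
w = libpysal.weights.Queen.from_shapefile(shp_path)

print(w.n)            # number of observations
print(w.pct_nonzero)  # sparsity of the weights matrix
w.transform = "r"     # row-standardize for use with esda or spreg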

Explore

The explore layer includes modules to conduct exploratory analysis of spatial and spatio-temporal data. At a high level, packages in explore are focused on enabling the user to better understand patterns in the data and suggest new interesting questions rather than answer existing ones. They include methods to characterize the structure of spatial distributions (either on networks, in continuous space, or on polygonal lattices). In addition, this domain offers methods to examine the dynamics of these distributions, such as how their composition or spatial extent changes over time.

  • esda : esda implements methods for the analysis of both global (map-wide) and local (focal) spatial autocorrelation, for both continuous and binary data. In addition, the package increasingly offers cutting-edge statistics about boundary strength and measures of aggregation error in statistical analyses. (A short usage sketch follows this list.)

  • giddy : giddy is an extension of esda to spatio-temporal data. The package hosts state-of-the-art methods that explicitly consider the role of space in the dynamics of distributions over time.

  • inequality : inequality provides indices for measuring inequality over space and time. These comprise classic measures such as the Theil T information index and the Gini index in mean deviation form, as well as spatially explicit measures that incorporate the location and spatial configuration of observations in the calculation of inequality measures.

  • momepy : momepy is a library for quantitative analysis of urban form - urban morphometrics. It aims to provide a wide range of tools for a systematic and exhaustive analysis of urban form. It can work with a wide range of elements, while focusing on building footprints and street networks. momepy stands for Morphological Measuring in Python.

  • pointpats : pointpats supports the statistical analysis of point data, including methods to characterize the spatial structure of an observed point pattern: a collection of locations where some phenomena of interest have been recorded. This includes measures of centrography which provide overall geometric summaries of the point pattern, including central tendency, dispersion, intensity, and extent.

  • segregation : the segregation package calculates over 40 different segregation indices and provides a suite of additional features for measurement, visualization, and hypothesis testing that together represent the state of the art in quantitative segregation analysis.

  • spaghetti : spaghetti supports the spatial analysis of graphs, networks, topology, and inference. It includes functionality for the statistical testing of clusters on networks, a robust all-to-all Dijkstra shortest-path algorithm with multiprocessing functionality, high-performance geometric and spatial computations using geopandas that are necessary for high-resolution interpolation along networks, and the ability to connect near-network observations onto the network.
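
As referenced in the esda entry above, here is a minimal sketch of a global spatial autocorrelation test with esda. It reuses the weights construction from the libpysal sketch and random placeholder data; in practice you would substitute a real attribute vector.

# Sketch: global Moran's I with esda on placeholder data.
import numpy as np
import libpysal
from esda.moran import Moran

w = libpysal.weights.Queen.from_shapefile(libpysal.examples.get_path("columbus.shp"))
w.transform = "r"                     # row-standardize the weights

rng = np.random.default_rng(0)
y = rng.random(w.n)                   # placeholder data; substitute a real variable

mi = Moran(y, w, permutations=999)
print(mi.I)       # observed Moran's I statistic
print(mi.p_sim)   # pseudo p-value from the permutation test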

Model

In contrast to explore, the model layer focuses on confirmatory analysis. In particular, its packages focus on the estimation of spatial relationships in data with a variety of linear, generalized-linear, generalized-additive, nonlinear, multi-level, and local regression models.

  • mgwr : mgwr provides scalable algorithms for estimation, inference, and prediction using single- and multi-scale geographically-weighted regression models in a variety of generalized linear model frameworks, as well as model diagnostic tools.

  • spglm : spglm implements a set of generalized linear regression techniques, including Gaussian, Poisson, and logistic regression, that allow for sparse matrix operations in their computation and estimation to lower memory overhead and decrease computation time.

  • spint : spint provides a collection of tools to study spatial interaction processes and analyze spatial interaction data. It includes functionality to facilitate the calibration and interpretation of a family of gravity-type spatial interaction models, including those with production constraints, attraction constraints, or a combination of the two.

  • spreg : spreg supports the estimation of classic and spatial econometric models. Currently it contains methods for estimating standard Ordinary Least Squares (OLS), Two Stage Least Squares (2SLS), and Seemingly Unrelated Regressions (SUR), in addition to various tests of homoskedasticity, normality, spatial randomness, and different types of spatial autocorrelation. It also includes a suite of tests for spatial dependence in models with binary dependent variables.

  • spvcm : spvcm provides a general framework for estimating spatially-correlated variance components models. This class of models allows for spatial dependence in the variance components, so that nearby groups may affect one another. It also provides a general-purpose framework for estimating models using Gibbs sampling in Python, accelerated by the numba package.

    โš ๏ธ Warning: spvcm has been archived and is planned for deprecation and removal in pysal 25.01.

  • tobler : tobler provides functionality for areal interpolation and dasymetric mapping. Its name is an homage to the legendary geographer Waldo Tobler, a pioneer of dozens of spatial analytical methods. tobler includes functionality for interpolating data using area-weighted approaches, regression model-based approaches that leverage remotely-sensed raster data as auxiliary information, and hybrid approaches.

  • access : access aims to make it easy for analysts to calculate measures of spatial accessibility. This work has traditionally had two challenges: [1] to calculate accurate travel time matrices at scale and [2] to derive measures of access using the travel times and supply and demand locations. access implements classic spatial access models, allowing easy comparison of methodologies and assumptions. (A short usage sketch follows this list.)

  • spopt: spopt is an open-source Python library for solving optimization problems with spatial data. Originating from the original region module in PySAL, it is under active development for the inclusion of newly proposed models and methods for regionalization, facility location, and transportation-oriented solutions.
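
As referenced in the access entry above, here is a minimal sketch of a two-stage floating catchment area (2SFCA) calculation with access. The tiny demand, supply, and travel-cost tables and their column names are made-up placeholders; see the access documentation for the exact API.

# Sketch: a 2SFCA computation with the access package on toy data.
import pandas as pd
from access import Access

demand = pd.DataFrame({"geoid": [1, 2], "pop": [1000, 500]})
supply = pd.DataFrame({"geoid": [1, 2], "doctors": [3, 1]})
cost = pd.DataFrame({"origin": [1, 1, 2, 2],
                     "dest":   [1, 2, 1, 2],
                     "cost":   [5, 20, 20, 5]})   # travel time in minutes

A = Access(demand_df=demand, demand_index="geoid", demand_value="pop",
           supply_df=supply, supply_index="geoid", supply_value="doctors",
           cost_df=cost, cost_origin="origin", cost_dest="dest", cost_name="cost")

A.two_stage_fca(name="2sfca", cost="cost", max_cost=30)
print(A.access_df)   # per-origin accessibility scores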

Viz

The viz layer provides functionality to support the creation of geovisualisations and visual representations of outputs from a variety of spatial analyses. Visualization plays a central role in modern spatial/geographic data science. Current packages provide classification methods for choropleth mapping and a common API for linking PySAL outputs to visualization tool-kits in the Python ecosystem.

  • legendgram : legendgram is a small package that provides "legendgrams", legends that visualize the distribution of observations by color in a given map. These distributional visualizations for map classification schemes assist in analytical cartography and spatial data visualization.

  • mapclassify : mapclassify provides functionality for choropleth map classification. Currently, fifteen different classification schemes are available, including a highly-optimized implementation of Fisher-Jenks optimal classification. Each scheme inherits a common structure that ensures computations are scalable and supports applications in streaming contexts. (A short usage sketch follows this list.)

  • splot : splot provides statistical visualizations for spatial analysis. It includes methods for visualizing global and local spatial autocorrelation (through Moran scatterplots and cluster maps), temporal analysis of cluster dynamics (through heatmaps and rose diagrams), and multivariate choropleth mapping (through value-by-alpha maps). A high-level API supports the creation of publication-ready visualizations.
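
As referenced in the mapclassify entry above, here is a small sketch of map classification; the data values are placeholders.

# Sketch: classify an attribute into map classes with mapclassify.
import numpy as np
import mapclassify

y = np.array([1, 2, 3, 5, 8, 13, 21, 34, 55, 89], dtype=float)

fj = mapclassify.FisherJenks(y, k=4)   # Fisher-Jenks optimal breaks
q = mapclassify.Quantiles(y, k=4)      # quantile breaks, for comparison

print(fj.bins)   # upper bound of each class
print(fj.yb)     # class assigned to each observation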

Installation

PySAL is available through Anaconda (in the defaults or conda-forge channel). We recommend installing PySAL from conda-forge:

conda config --add channels conda-forge
conda install pysal

PySAL can also be installed using pip:

pip install pysal

As of version 2.0.0, PySAL has shifted to Python 3 only.

Users who need an older stable version of PySAL that is Python 2 compatible can install version 1.14.3 through pip or conda:

conda install pysal==1.14.3

Documentation

For help on using PySAL, see the project documentation at https://pysal.org.

Development

As of version 2.0.0, PySAL is now a collection of affiliated geographic data science packages. Changes to the code for any of the subpackages should be directed at the respective upstream repositories, and not made here. Infrastructural changes for the meta-package, like those for tooling, building the package, and code standards, will be considered.

Development is hosted on GitHub.

Discussions of development, as well as help for users, occur on the developer list and in PySAL's Discord channel.

Getting Involved

If you are interested in contributing to PySAL please see our development guidelines.

Bug reports

To search for or report bugs, please see PySAL's issues.

Build Instructions

To build the meta-package pysal see tools/README.md.

License information

See the file "LICENSE.txt" for information on the history of this software, terms & conditions for usage, and a DISCLAIMER OF ALL WARRANTIES.

access's People

Contributors

20bryan, jamessaxon, jgaboardi, knaaptime, ljwolf, martinfleis, puttingscienceintodatascience, sjsrey, tayloroshan, vidal-anguiano, weikang9009, yatlas


access's Issues

RAAM outputs

Hi,

I am looking at using your RAAM model, but I'm a little confused about the code.
I was hoping you would be able to explain in further detail three things.

  1. I am not really sure what the output number of the RAAM model represents. Is it the fractional deviation from the national mean, as shown in Figure 1 of your paper? If so, that would mean the final number is RAAM(r)/RAAM - 1, in which case what does the raw RAAM number represent?
  2. Am I right in saying that if I want to apply weighting to the distances in the cost matrix for the RAAM model, I need to do that outside the function?
  3. I would like to incorporate the distance buffer (tau). I think what you do is as simple as distance/tau, but I just wanted to confirm that, and to ask whether applying weights has any effect on the RAAM output that needs to be accounted for in the interpretation.

The package does not depend on scipy

This is a minor issue, but scipy can be hard to install, and listing it as a package dependency, while it seems to be used only in a notebook, can make this package hard to install on some systems. Maybe consider moving it to a test/doc/dev dependency file?

Kudos @CJ-Wright who found this with his new dependency audit tool in conda-forge.

[BUG] download test fails on windows

There are technically two failures, but they're cascading. The root is

E OSError: Invalid data stream

so something about the download is failing on Windows, which then causes the second failure when loading the file.

[2SFCA/3SFCA] KeyError: 'xxx_W'

I ran into this issue while using the PySAL access package today. When using the two- or three-stage FCA models, you end up with the KeyError in the title, where it is unable to find a column of the form <demand value>_W. You can see two such errors below:

(Screenshots of the KeyError tracebacks for the two-stage and three-stage FCA models were attached to the issue.)

The errors above are from a tutorial notebook, seen here on nbgitviewer

From looking over the code, it would seem that weighted_catchment (in fca.py) used to return a series with the name <demand value>_W, but no longer does. The offending line in two_stage_fca is here. This is a quick fix and I'll submit a PR, but I'm opening the issue here in case others run into it.

fix version names in docs

@jGaboardi, ever since the update to versioneer (#39), I noticed that the versions listed in the docs show up oddly -- like 0+untagged.1.g109a527.


Could you either advise how to fix this with a new tag, or do so and mention me?

pip install installs extra dependencies (Sphinx==2.4.3 in particular)

Installing via pip install access requires Sphinx==2.4.3.

NetworkX is adding several geospatial examples to our gallery (networkx/networkx#4366, networkx/networkx#4383, networkx/networkx#4407). As a result, building our documentation now requires pysal>=2.3, which requires access>=1.1.1. Building our docs requires Sphinx==3.3.1. So we have to take special care to install our required packages on our CI system, which isn't a big deal but is annoying. I also suspect other packages could run into this issue.

In your setup.py

  • install_requires is read in from requirements.txt
  • extras_require is read in from requirements_tests.txt and requirements_docs.txt

So if I clone your repo and install via pip install ., then the dependencies in extras_require are not installed. But if I install from PyPI, then the dependencies in extras_require are installed.

Would it be possible to remove at least the dependencies in requirements_docs.txt from the PyPI wheel required dependencies?
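
For reference, one way to keep the docs pins out of the runtime requirements might look like the sketch below. The requirements file names follow the issue; everything else is illustrative and not the project's actual setup.py.

# Sketch: keep docs/test dependencies in extras_require rather than install_requires.
from setuptools import setup

def read_requirements(path):
    with open(path) as f:
        return [line.strip() for line in f if line.strip() and not line.startswith("#")]

setup(
    name="access",
    install_requires=read_requirements("requirements.txt"),     # runtime deps only
    extras_require={
        "tests": read_requirements("requirements_tests.txt"),
        "docs": read_requirements("requirements_docs.txt"),      # Sphinx pin lives here
    },
)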

Change max cost to include threshold

This is a change requested by @knaaptime in the discussion for #26. It is a one-character fix here:

if max_cost is not None: temp = temp[temp[cost_cost] < max_cost].copy()
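
Presumably the one character is the comparison operator; assuming that reading of the request, the edited line would be:

if max_cost is not None: temp = temp[temp[cost_cost] <= max_cost].copy()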

But it will also break the expectations of the zero-catchment tests here:

def test_floating_catchment_area_ratio_zero_catchment(self):
    zero_catchment = 0
    result = self.model.fca_ratio(max_cost = zero_catchment)
    actual = math.isnan(self.model.access_df.iloc[0]['fca_value'])
    self.assertEqual(actual, True)

Not a hard change but your judgment call.

[BUG]: demand estimation is incorrect with asymmetric travel matrices

The fca_ratio, two_stage_fca, and three_stage_fca methods attempt a reverse search by switching origins and destinations to estimate potential demand at each supply location. But that approach only works for very simple pedestrian travel networks, which assume no congestion and no one-way edges. In realistic applications like an integrated ped/transit network or an automobile skim from a travel demand model, the network is directed and edge weights are not equivalent between two nodes--it's harder to get into Chicago during the morning rush than out of it. In those cases, the current implementation will overstate potential demand because it assumes free-flow travel on a network that's actually congested. A solution would be to reverse the impedance as well as the ODs in the "demand" call to weighted_catchment.

tl;dr: demand needs to be weighted by the cost for each consumer to reach the supply, not the cost for the supply to be delivered to the consumer. Those directions have different costs.
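
A toy illustration of the point, using made-up place names and travel times:

# Toy illustration: with an asymmetric (directed) travel-time table, swapping
# origin/destination roles changes which cost is used.
import pandas as pd

cost = pd.DataFrame({
    "origin":  ["suburb", "city"],
    "dest":    ["city",   "suburb"],
    "minutes": [45, 25],   # morning rush: suburb -> city is congested
})
od = cost.set_index(["origin", "dest"])["minutes"]

# Demand at the city supply point should be weighted by the cost consumers
# face to reach it -- the true suburb -> city impedance:
print(od.loc[("suburb", "city")])   # 45 minutes

# A reverse search that only swaps origin and destination roles ends up using
# the opposite direction, understating impedance and overstating demand:
print(od.loc[("city", "suburb")])   # 25 minutes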

Release on conda-forge

All the pysal subpackages are available on both PyPI and conda-forge. access is out on PyPI, so we need to add a conda-forge recipe to have access there as well.

index name hardcoded to 'geoid'

This isn't strictly a problem, but we might want to avoid explicitly naming the index 'geoid', which usually means a FIPS code. If folks aren't using census data but get back a dataframe indexed on 'geoid', it might cause some confusion.

Hard coded origin name

Line 302 of fca.py
W3sum_frame = cost_df[["origin", "W3"]].groupby('origin').sum().rename(columns = {"W3" : "W3sum"}).reset_index()
I believe should be
W3sum_frame = cost_df[[cost_origin, "W3"]].groupby(cost_origin).sum().rename(columns = {"W3" : "W3sum"}).reset_index()

The current version hard-codes the name of the cost origin column, which is inconsistent with the rest of the code and causes an issue when the demand and supply indices are not equal.

`sphinx` enforced to install w/ packages

I've noticed that the requirements pin sphinx. Is this needed for the functionality to run? If not, it might make installs easier not to require sphinx (particularly not a specific version).
