This project forked from mraad/spark-esri

Repo to demonstrate the usage of Apache Spark within a Jupyter notebook within ArcGIS Pro

License: Apache License 2.0

Spark ESRI

Project to demonstrate the usage of Apache Spark within a Jupyter notebook within ArcGIS Pro.

Dec 16, 2021 - Added a check for the env var SPARK_HOME to override the built-in Spark. For example, download spark-3.1.1-bin-hadoop2.7.tgz, extract it, and set the env var SPARK_HOME to the extracted folder location.
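The override described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the function name `resolve_spark_home` and the `builtin_home` parameter are hypothetical.

```python
import os


def resolve_spark_home(builtin_home, environ=None):
    """Return SPARK_HOME when it is set and non-empty, else builtin_home.

    `builtin_home` is a hypothetical stand-in for the location of the
    Spark distribution bundled with the environment.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("SPARK_HOME", "").strip()
    return override if override else builtin_home
```

Passing the environment as a mapping keeps the lookup easy to exercise without touching the real process environment.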

Oct 30, 2021 - Pro 2.8 relies on the Windows registry to find the active conda environment. The registry key is HKEY_CURRENT_USER/SOFTWARE/ESRI/ArcGISPro/PythonCondaEnv. The value of this key is used to set the OS environment variable PYSPARK_PYTHON, which PySpark requires to work correctly in a Pro notebook.

As of this writing, the order to detect the active conda environment is as follows:

  • look for env var CONDA_DEFAULT_ENV.
  • look for %LOCALAPPDATA%/ESRI/conda/envs/proenv.txt, in case of an older Pro version.
  • look for HKEY_CURRENT_USER/SOFTWARE/ESRI/ArcGISPro/PythonCondaEnv.
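The lookup order above could be sketched like this. It is a hedged illustration rather than the module's implementation: the function names are hypothetical, and it assumes that `PythonCondaEnv` is a named value stored under the `SOFTWARE\ESRI\ArcGISPro` key.

```python
import os
from pathlib import Path


def read_proenv_file(environ):
    """Return the contents of proenv.txt (older Pro versions), or None."""
    local_app_data = environ.get("LOCALAPPDATA")
    if not local_app_data:
        return None
    path = Path(local_app_data) / "ESRI" / "conda" / "envs" / "proenv.txt"
    try:
        text = path.read_text().strip()
    except OSError:
        return None
    return text or None


def read_registry():
    """Return the PythonCondaEnv registry value, or None if unavailable."""
    try:
        import winreg  # Only available on Windows.
    except ImportError:
        return None
    try:
        # Assumption: PythonCondaEnv is a value under the ArcGISPro key.
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, r"SOFTWARE\ESRI\ArcGISPro") as key:
            value, _ = winreg.QueryValueEx(key, "PythonCondaEnv")
    except OSError:
        return None
    return value or None


def detect_conda_env(environ=None, file_reader=read_proenv_file, registry_reader=read_registry):
    """Try the env var first, then proenv.txt, then the registry."""
    environ = os.environ if environ is None else environ
    return (
        environ.get("CONDA_DEFAULT_ENV")
        or file_reader(environ)
        or registry_reader()
    )
```

The two readers are passed in as parameters so that each step can be replaced or tested independently of the machine it runs on.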

Oct 27, 2021 - Pro 2.8.3 no longer creates or relies on the file %LOCALAPPDATA%/ESRI/conda/envs/proenv.txt. It now depends on the env var CONDA_DEFAULT_ENV to determine the active conda env.

Sep 16, 2021 - Perform the following as a patch for Pro 2.8.3

cd c:\
git clone https://github.com/kontext-tech/winutils

Define a system environment variable HADOOP_HOME with the value C:\winutils\hadoop-3.3.0, and add %HADOOP_HOME%/bin to the system variable PATH.
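For a quick test without changing system settings, the same two variables can be set for the current Python process only. The sketch below is an assumption-laden alternative, not a replacement for the system-level setup: the helper name `with_hadoop_home` is hypothetical, and the change is lost when the process exits.

```python
import ntpath
import os


def with_hadoop_home(environ, hadoop_home, sep=";"):
    """Return a copy of environ with HADOOP_HOME set and its bin folder on PATH.

    Windows conventions are used on purpose: ntpath for joining and ';'
    as the PATH separator.
    """
    env = dict(environ)
    bin_dir = ntpath.join(hadoop_home, "bin")
    env["HADOOP_HOME"] = hadoop_home
    entries = [p for p in env.get("PATH", "").split(sep) if p]
    if bin_dir not in entries:
        entries.insert(0, bin_dir)  # Put winutils first so it is found early.
    env["PATH"] = sep.join(entries)
    return env


# Usage on Windows, before Spark is started in the same process:
# os.environ.update(with_hadoop_home(os.environ, r"C:\winutils\hadoop-3.3.0"))
```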

NOTE: This works in Pro 2.6 ONLY. There is a small "issue" with Pro 2.7 and pyarrow. The folks in Redlands have a fix that will be in 2.8 :-(

Create a new Pro Conda Environment.

Start a Python Command Prompt:

Note: You might need to add proxy settings to .condarc located in C:\Program Files\ArcGIS\Pro\bin\Python.

conda config --set proxy_servers.http http://username:password@host:port
conda config --set proxy_servers.https https://username:password@host:port

The commands above produce .condarc entries similar to the following:

ssl_verify: true
proxy_servers:
  http: http://domainname\username:password@host:port
  https: http://domainname\username:password@host:port
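If the username or password contains reserved characters (for example the backslash in domainname\username, or an @ in the password), percent-encoding them before building the proxy URL may be necessary. A minimal sketch, using a hypothetical helper named `build_proxy_url`:

```python
from urllib.parse import quote


def build_proxy_url(scheme, username, password, host, port):
    """Build a proxy URL with percent-encoded credentials."""
    user = quote(username, safe="")      # e.g. '\' becomes %5C
    secret = quote(password, safe="")    # e.g. '@' becomes %40
    return f"{scheme}://{user}:{secret}@{host}:{port}"


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    print(build_proxy_url("http", "domainname\\username", "p@ssword", "host", 8080))
```

The resulting string can then be passed to the conda config commands shown earlier.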

Create a new conda environment:

proswap arcgispro-py3
conda remove --yes --all --name spark_esri
conda create --yes --name spark_esri --clone arcgispro-py3
proswap spark_esri

Optional:

pip install fsspec==2021.8.1 boto3==1.18.35 s3fs==0.4.2 pyarrow==1.0.1
conda install --yes -c esri -c conda-forge -c defaults^
    "numba=0.53.*"^
    "pandas=1.2.*"^
    "untangle=1.1.*"^
    "pyodbc=4.0.*"^
    "gcsfs=0.7.*"
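To confirm which of the optional packages ended up in the environment, a short check like the one below can be run from the Python Command Prompt. This is a sketch with a hypothetical helper name, `installed_versions`; on interpreters older than Python 3.8 it assumes the `importlib_metadata` backport is installed.

```python
try:
    from importlib.metadata import PackageNotFoundError, version
except ImportError:  # Python < 3.8: fall back to the backport, if present.
    from importlib_metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    packages = ["fsspec", "boto3", "s3fs", "pyarrow", "numba", "pandas"]
    for name, ver in installed_versions(packages).items():
        print(f"{name}: {ver or 'not installed'}")
```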

Install the Esri Spark module.

Note: You might need to install Git for Windows.

git clone https://github.com/mraad/spark-esri.git
cd spark-esri
python setup.py install

MicroPathing Notebook

Note the use of the range slider on the map to filter the micropaths by a user-defined hour-of-day range.

The following are the resulting crossing points and gate statistics.

TODO

  • Unify the spark_esri and spark_dbconnect Python modules.
