
opendataanalytics / gaia

Gaia is a geospatial analysis library jointly developed by Kitware and Epidemico.

Languages: CMake 1.08%, Python 56.34%, Jupyter Notebook 42.30%, Dockerfile 0.28%
Topics: analytics, gdal, python, remotesensing, kitware, open-source, data-science, ipython, jupyter, dask

gaia's People

Contributors

aashish24, andrenguyen-bah, chuehlien, danlipsa, dorukozturk, dstoup, ebradyjobory, fx2323, geordgez, gitter-badger, jbeezley, johnkit, kotfic, lou-epidemico, manthey, matthewma7, scottwittenburg

gaia's Issues

Add geocoding to Gaia

@aashish24 please edit/comment as necessary.

Use geopy to add geocoding capability to Gaia

  • new GeocodeProcess class
  • use the geocoder specified in process argument (options include OpenStreetMap Nominatim, ESRI ArcGIS, Google Geocoding API (V3), Baidu Maps, Bing Maps API, Yahoo! PlaceFinder, Yandex, IGN France, GeoNames, NaviData, OpenMapQuest, What3Words, OpenCage, SmartyStreets, geocoder.us, and GeocodeFarm)
  • default geocoder if none specified will be: Google?
  • Allow reverse geocoding option (given coordinates, find nearest placename)
  • Output.data should be a geopandas dataframe
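A minimal sketch of how the proposed class could look. Everything here is an assumption for illustration: the `GeocodeProcess` interface, the `make_geocoder` helper, and the choice to return a GeoJSON FeatureCollection are not taken from Gaia. The geocoder is passed in as any object with geopy-style `geocode()` and `reverse()` methods, and geopy itself is imported lazily.

```python
class GeocodeProcess:
    """Hypothetical process: turns place names into GeoJSON point features."""

    def __init__(self, geocoder):
        # `geocoder` is any object exposing geopy-style geocode()/reverse().
        self.geocoder = geocoder

    def geocode(self, names):
        features = []
        for name in names:
            location = self.geocoder.geocode(name)
            if location is None:  # geopy returns None when nothing matches
                continue
            features.append({
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    # GeoJSON order is [longitude, latitude]
                    "coordinates": [location.longitude, location.latitude],
                },
                "properties": {"query": name, "address": location.address},
            })
        return {"type": "FeatureCollection", "features": features}

    def reverse(self, lat, lon):
        # Reverse geocoding: coordinates in, nearest address (or None) out.
        location = self.geocoder.reverse((lat, lon))
        return None if location is None else location.address


def make_geocoder(service="nominatim", **kwargs):
    # Hypothetical helper. geopy is imported lazily so the class above
    # can be exercised with a stand-in geocoder when geopy is absent.
    from geopy.geocoders import get_geocoder_for_service
    return get_geocoder_for_service(service)(**kwargs)
```

With geopy installed, `GeocodeProcess(make_geocoder("nominatim", user_agent="gaia-example"))` would select the geocoder by name. The returned features could then be turned into a geopandas dataframe with `geopandas.GeoDataFrame.from_features(result["features"])`.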

Add description of the project to the README file.

For example:

Gaia provides data processing, transformation, and analysis capabilities specifically targeted for spatial datasets. It is built on top of popular open source packages such as GDAL, PROJ4, NumPy, OpenClimateGIS, Shapely and Fiona. It can fetch data from multiple sources such as tile servers and databases.

Conda Package

Hello,

I am interested in making this available via Anaconda's conda-forge channel (see: https://github.com/conda-forge/staged-recipes). Anaconda makes it simpler for testing and deploying your code in multiple operating systems. Is this something you would be interested in?

dependency Fiona

Several places in the code base use Fiona, so why is it not listed in requirements.txt or setup.py?

Add geopandas to requirements.txt

  • add geopandas==0.1.1 to requirements.txt
  • add matplotlib==1.4.3 to requirements-dev.txt
  • remove separate install of geopandas and matplotlib from .travis.yml
  • add extras_require = { 'geopandas': ["geopandas>=0.1.1"] } to setup.py

Refactor request parsing

@aashish24 This is in regard to your suggestion yesterday to refactor how/where request parsing takes place. Could you provide some more details? I think you mentioned having the request JSON parsed by the process object itself? But the request parser in its current state is responsible for actually creating the process.

Here is an example of the current request parser being used in Girder:
[Screenshot, 2016-02-24 9:46:53 AM]
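One way to reconcile the two roles discussed above is sketched here, purely as an assumption: the parser only looks up the process class by name, and the class itself interprets the rest of the request. The request schema (`process`, `inputs`, `options`), the registry, and `BufferProcess` are hypothetical and not Gaia's actual API.

```python
import json

# Hypothetical registry mapping a process name to its class.
PROCESS_REGISTRY = {}


def register(name):
    def decorator(cls):
        PROCESS_REGISTRY[name] = cls
        return cls
    return decorator


@register("buffer")
class BufferProcess:
    def __init__(self, inputs, **options):
        self.inputs = inputs
        self.options = options

    @classmethod
    def from_request(cls, body):
        # The process object interprets its own part of the request.
        return cls(body.get("inputs", []), **body.get("options", {}))


def parse_request(raw):
    # The parser only picks the class; it does not know process details.
    body = json.loads(raw)
    name = body["process"]
    if name not in PROCESS_REGISTRY:
        raise ValueError("Unknown process: %s" % name)
    return PROCESS_REGISTRY[name].from_request(body)
```

For example, `parse_request('{"process": "buffer", "inputs": ["a.geojson"], "options": {"distance": 10}}')` would return a `BufferProcess` whose options hold the distance.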

Add pysal spatial dynamics processes

Hi @aashish24, do we want all the processes that are currently listed under spatial dynamics?
http://pysal.readthedocs.org/en/latest/library/spatial_dynamics/index.html

  • spatial_dynamics.directional – Directional LISA Analytics
  • spatial_dynamics.ergodic – Summary measures for ergodic Markov chains
  • spatial_dynamics.interaction – Space-time interaction tests
  • spatial_dynamics.markov – Markov based methods
  • spatial_dynamics.rank – Rank and spatial rank mobility measures

And do we still want spatially constrained clustering?
http://pysal.readthedocs.org/en/latest/library/region/maxp.html
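For context on what the Markov-based methods in the list above estimate, here is a plain-Python sketch of a first-order transition probability matrix built from sequences of class labels. It is an illustration only and does not use or mirror pysal's API; `transition_matrix` is a hypothetical name.

```python
def transition_matrix(sequences, n_classes):
    """Estimate P[i][j] = share of moves from class i that go to class j.

    `sequences` holds one list of integer class labels (0..n_classes-1)
    per region, ordered in time.
    """
    counts = [[0] * n_classes for _ in range(n_classes)]
    for seq in sequences:
        # Count each consecutive pair (state at t, state at t+1).
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observed transitions are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```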

Update documentation for Gaia

Modify/add to documentation as new functionality is added or existing functionality is changed.
Also fix the broken ReadTheDocs build caused by the absence of required Linux binaries (PR 49).

Different scenario causes cropping error

Cropping works for vector and raster data in the general case,
but some scenarios cause errors:

  • Raster data
    • Crop region is bigger than the target
    • Crop region is entirely outside the target area
    • Some RGB datasets (possibly special cases) lose color in the cropped result
  • Vector data
    • Crop region is entirely outside the target area

I uploaded some example data here for these cases.
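The geometric cases above could be detected before cropping with a bounding-box check. The sketch below assumes boxes given as `(minx, miny, maxx, maxy)` tuples in the same coordinate system; `classify_crop` is a hypothetical helper, not part of Gaia, and boxes that only touch along an edge are treated as not overlapping.

```python
def classify_crop(crop, target):
    """Describe how a crop bounding box relates to the target's extent."""
    cminx, cminy, cmaxx, cmaxy = crop
    tminx, tminy, tmaxx, tmaxy = target
    # No shared area at all: the crop would produce an empty result.
    if cmaxx <= tminx or cminx >= tmaxx or cmaxy <= tminy or cminy >= tmaxy:
        return "outside"
    # Crop region is at least as large as the target on every side.
    if cminx <= tminx and cminy <= tminy and cmaxx >= tmaxx and cmaxy >= tmaxy:
        return "contains_target"
    # The usual case: crop region lies fully inside the target.
    if cminx >= tminx and cminy >= tminy and cmaxx <= tmaxx and cmaxy <= tmaxy:
        return "within_target"
    return "partial_overlap"
```

A caller could raise a clear error for "outside" and clip the crop region to the target extent for "contains_target" or "partial_overlap" before handing it to the cropping routine.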

Geotrellis integration with Gaia

@dorukozturk's geotrellis-pipeline looks like a good example of how to approach this: creating JARs for Geotrellis apps/functions and calling them via subprocess in Python.

  • For the Landsat example, maybe we could modify it so that the MaskNearRedAndInfrared and CreateNDVI methods are available outside of a webserver, and operate on/output an entire geotiff rather than tiles? If that approach makes sense, then use it to implement other processes like generic raster math, subsetting, etc.
  • Spark integration: should Spark always be used with Geotrellis functions? Or used only for certain processes and/or only if an optional parameter is supplied?
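The subprocess approach, with Spark as an optional parameter, might look roughly like the sketch below. The JAR path, main class name, and argument layout are placeholders, and `build_command` and `run_jar` are hypothetical helpers; only the general `java -cp` and `spark-submit --class` invocation forms are assumed.

```python
import subprocess


def build_command(jar_path, main_class, args, use_spark=False):
    """Build the command line for a Geotrellis app packaged as a JAR."""
    if use_spark:
        # spark-submit --class <main-class> <application-jar> [app args]
        return ["spark-submit", "--class", main_class, jar_path] + list(args)
    return ["java", "-cp", jar_path, main_class] + list(args)


def run_jar(jar_path, main_class, args, use_spark=False):
    command = build_command(jar_path, main_class, args, use_spark)
    # check=True raises CalledProcessError on a non-zero exit code.
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout
```

Keeping command construction separate from execution makes the Spark/no-Spark decision easy to test without a JVM installed.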

Dead kernels reading certain raster files

Note that this could be an issue with my local machine/environment since the tests using the same datasets appear to be fine.

In gaia.geo.geo_inputs, when using the read function of a RasterFileIO object, large raster files kill my IPython Notebook kernel, e.g., globalprecip below (~300kb):

from gaia.geo.geo_inputs import RasterFileIO

globalprecip = RasterFileIO(uri='../../tests/data/globalprecip.tif')
gp_ar = globalprecip.read()  # running this line kills the kernel

I don't run into any problems with '../../tests/data/globalairtemp.tif' (~100kb) and smaller files.

Update and review the README.md

  • Currently the layout is not quite right (gitter chat shows up in the middle of the README).
  • It does clearly mention Kitware tools
  • The flow is not quite right because it is too detailed in a few places.

Use gippy for raster image processing

Instead of using GDAL directly, look into using gippy to process raster data because it can better handle memory management for processing large images and chaining of image processing tasks.

anaconda / conda-forge installation

Hi I was curious about installing Gaia and deps in a conda environment, and found a couple helpful things in this repo...

  • the conda_env.yml file at the root level seems very promising indeed
  • from this commit 92799f5
  • and presumably related issues #79 and #80
...but I don't see anything (that I think is) official or anything in the conda-forge channel.

What is the recommended guidance for a conda install as of Feb-2020?
The kitware-danesfield channel installed like a champ for me, but I figured I'd ask.

Restructure gaia

  • Provide object oriented API for data conversion
  • Provide high level custom analysis
  • Focus on big data analysis

Plugin framework for Gaia

Goal: Allow 3rd-party developers to add plugins to Gaia

Basic design:

  • add an empty plugins module to gaia, where additional plugin submodules can be placed.
  • each module can optionally include a requirements.txt file and config file.
    • Manually install the requirements if necessary (pip install -r requirements.txt)
  • Gaia's current config parser will recursively search through the plugins folder to add any configuration settings.
  • Examples to include a gaia plugin in a script:
    • import gaia.plugins.my_plugin
    • from gaia.plugins.my_plugin import MyGaiaPluginIO, MyPluginProcessor
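The recursive config search in the design above could be sketched as follows. This assumes INI-style config files read with the standard library `configparser`. The file name `gaia.cfg` and the function `load_plugin_configs` are placeholders for illustration, not a description of Gaia's existing parser.

```python
import configparser
import os


def load_plugin_configs(plugins_dir, filename="gaia.cfg"):
    """Merge every config file named `filename` found under plugins_dir."""
    found = []
    # os.walk descends into every plugin submodule directory.
    for root, _dirs, files in os.walk(plugins_dir):
        if filename in files:
            found.append(os.path.join(root, filename))
    parser = configparser.ConfigParser()
    # Sorted for a deterministic merge order; later files override earlier
    # ones when they define the same section and key.
    parser.read(sorted(found))
    return parser
```

Each plugin would then read its own section, for example `load_plugin_configs("gaia/plugins").get("my_plugin", "some_key")`, where the section and key names are again hypothetical.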

@aashish24 @dorukozturk please review.

Specify a dependencies list

The system dependencies are only specified in .travis.yml, which is probably fine for now. Travis currently uses Ubuntu 12.04 images, and so the only dependency needed is libgdal-dev.

matplotlib installed via pip on Ubuntu 14.04 requires libpng and freetype, though matplotlib is only required for testing currently.

Maybe just tracking this here in the issue is good enough for now.

Gaia integration with Minerva/Girder

Goal

Run Gaia processes from within Minerva

General requirements

  • Gaia needs to be able to read from/write to data stored in Girder. Possible approaches:
    • Minerva sends Gaia the girder item id (and authentication token?). Gaia then loads the data directly from the Girder backend using the REST API, and writes result back to same folder.
    • Minerva front-end sends Gaia the input data directly as geojson (for vector data, but what about raster data?). REST API still used to write output?
  • Gaia should run processes for Minerva using the same framework as used for other jobs/analyses. This likely means integration with girder-worker.
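For the first approach (Minerva passes an item id and token), the read side could be sketched with the standard library alone. This assumes the Girder REST API serves item downloads at `item/<id>/download` and accepts the token in a `Girder-Token` header; both should be checked against the Girder version in use. The helper names and the example URL are hypothetical.

```python
import urllib.request


def build_item_download_request(api_root, item_id, token):
    """Build (but do not send) a request for a Girder item's file data."""
    url = "%s/item/%s/download" % (api_root.rstrip("/"), item_id)
    return urllib.request.Request(url, headers={"Girder-Token": token})


def fetch_item(api_root, item_id, token):
    # Sends the request and returns the raw bytes of the response body.
    request = build_item_download_request(api_root, item_id, token)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

A call such as `fetch_item("http://localhost:8080/api/v1", item_id, token)` would return bytes that Gaia could write to a temporary file before running a process. Writing results back would need a corresponding upload call, which is not shown here.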

@jbeezley @kotfic @aashish24 @dorukozturk let me know if you have any thoughts/suggestions on this, thanks.

Geoserver/Geonode endpoints via Girder plugin

@aashish24 Current endpoints for a typical geonode installation:

Add pysal spatial weight functions to gaia

Many spatial analysis functions in pysal rely on spatial weights, so we're adding this first.

TODO:

  • add WeightProcess class to processes.py
  • add WeightFileIO class to inputs.py
  • test WeightProcess class
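As background for what a WeightProcess would produce, the sketch below builds row-standardized rook contiguity weights for a regular grid in plain Python. It is an illustration of the concept only: it does not call pysal, and `rook_lattice_weights` is a hypothetical name.

```python
def rook_lattice_weights(nrows, ncols):
    """Row-standardized rook weights for cells numbered row by row.

    Returns {cell_id: {neighbor_id: weight}}, where each cell's weights
    sum to 1 (or the dict is empty if the cell has no neighbors).
    """
    weights = {}
    for r in range(nrows):
        for c in range(ncols):
            cell = r * ncols + c
            neighbors = []
            if r > 0:
                neighbors.append(cell - ncols)  # cell above
            if r < nrows - 1:
                neighbors.append(cell + ncols)  # cell below
            if c > 0:
                neighbors.append(cell - 1)      # cell to the left
            if c < ncols - 1:
                neighbors.append(cell + 1)      # cell to the right
            weights[cell] = {n: 1.0 / len(neighbors) for n in neighbors}
    return weights
```

In a 3x3 grid the center cell (id 4) gets four neighbors with weight 0.25 each, while a corner cell gets two neighbors with weight 0.5 each.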
