npucino / sandpyper

Tools for beach volumetric and behavioral dynamics monitoring from multitemporal UAV-SfM datasets.

Home Page: https://npucino.github.io/sandpyper/

License: MIT License

Python 89.35% TeX 10.65%
uav-imagery coastal-dynamics remote-sensing satellite-data shoreline erosion beach

sandpyper's Introduction


Sandpyper

- Sandy beaches SfM-UAV analysis tools -

Sandpyper performs an organised and automated extraction of colour and elevation profiles from as many DSMs and orthophotos as you like. It is designed for cases where a considerable number of DSMs and orthophotos from many different locations and coordinate reference systems need to be processed. It then computes volumetric and behavioural analyses of sandy beachfaces, speeding up an otherwise long and unwieldy job. It has some specialised functions to help deal with the common limitations of working with Unoccupied Aerial Vehicles (UAVs) and Structure from Motion (SfM) in beach environments, which are:

  1. Swash zone: the motion of waves washing in and out of the swash zone prevents SfM algorithms from reliably modelling elevation. This zone is commonly discarded.
  2. Vegetation: both dune vegetation and beach wrack (macroalgae, woody debris) should be removed or filtered, as anything that is not sand would compromise sediment volumetric computation and behavioural analysis.
  3. File size: a beach a few km long surveyed with a DJI Phantom 4-Advanced at 100 meters altitude creates roughly 10 GB (uncompressed) of data, which can be cumbersome for some GIS to handle.

From user-defined cross-shore transects, Sandpyper helps with:

  1. cleaning profiles from unwanted non-sand points
  2. computing period-specific limits of detection to obtain reliable estimates of changes
  3. detecting statistically significant clusters of beach change (also referred to as hotspots/coldspots)
  4. computing multiscale volumetric analysis
  5. modeling multiscale Beachface Cluster Dynamics indices
  6. visualising beach changes, limits of detection, transects and BCDs with a variety of in-built plotting methods
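
As a rough illustration of steps 2 and 4, the sketch below differences two elevation profiles sampled at the same points, sets to zero any change smaller than a limit of detection, and integrates what is left along the transect. The function name, the sample values and the simple thresholding rule are assumptions made for illustration only; this is not Sandpyper's API, and the package's own LoD computation may differ.

```python
def net_change_along_transect(z_pre, z_post, spacing_m, lod_m=0.0):
    """Return (dh_filtered, area_change_m2) for one transect.

    z_pre, z_post: elevations (m) sampled at the same points on two dates.
    spacing_m: distance between consecutive points along the transect (m).
    lod_m: limit of detection (m); smaller absolute changes are set to zero.
    """
    if len(z_pre) != len(z_post):
        raise ValueError("profiles must have the same number of points")
    # Elevation change at each point (post minus pre)
    dh = [post - pre for pre, post in zip(z_pre, z_post)]
    # Treat changes within the limit of detection as "no detectable change"
    dh_filtered = [d if abs(d) > lod_m else 0.0 for d in dh]
    # Trapezoidal integration of dh along the transect gives m^2,
    # i.e. volume change per metre of alongshore beach width.
    area = sum((a + b) / 2.0 * spacing_m
               for a, b in zip(dh_filtered, dh_filtered[1:]))
    return dh_filtered, area


# Hypothetical elevations (m) for one transect on two survey dates
z_first = [2.0, 1.5, 1.0, 0.5]
z_second = [2.02, 1.2, 0.7, 0.5]
dh, area = net_change_along_transect(z_first, z_second, spacing_m=1.0, lod_m=0.05)
print(dh, area)
```

Running the same call with lod_m=0.0 keeps the small change at the first point, which is the kind of difference an LoD filter is meant to suppress.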

Additionally, Sandpyper has some useful functions that can come in handy, such as:

  1. automatic transects creation from a vector line
  2. spatial grid creation of specified tile size along a line
  3. tiles extraction from spatial grid
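
To give an idea of what automatic transect creation from a line involves, here is a minimal pure-Python sketch that places perpendicular transects at a fixed spacing along a polyline. The function name and parameters are hypothetical and coordinates are assumed to be in a projected CRS (metres); Sandpyper's own function works on vector files and may behave differently.

```python
import math


def make_transects(baseline, step_m, half_length_m):
    """Create transects perpendicular to a polyline.

    baseline: list of (x, y) vertices in a projected CRS (metres).
    step_m: alongshore spacing between transects.
    half_length_m: distance each transect extends either side of the baseline.
    Returns a list of ((x0, y0), (x1, y1)) segments.
    """
    transects = []
    next_at = 0.0   # chainage of the next transect to place
    walked = 0.0    # chainage at the start of the current segment
    for (xa, ya), (xb, yb) in zip(baseline, baseline[1:]):
        seg_len = math.hypot(xb - xa, yb - ya)
        if seg_len == 0:
            continue  # skip repeated vertices
        ux, uy = (xb - xa) / seg_len, (yb - ya) / seg_len  # unit vector along segment
        nx, ny = -uy, ux                                   # unit normal to the segment
        while next_at <= walked + seg_len:
            d = next_at - walked
            cx, cy = xa + ux * d, ya + uy * d              # point on the baseline
            transects.append((
                (cx - nx * half_length_m, cy - ny * half_length_m),
                (cx + nx * half_length_m, cy + ny * half_length_m),
            ))
            next_at += step_m
        walked += seg_len
    return transects


# A straight 100 m baseline with a transect every 20 m, each 60 m long
line = [(0.0, 0.0), (100.0, 0.0)]
transects = make_transects(line, step_m=20.0, half_length_m=30.0)
print(len(transects))
```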

Sandpyper is very easy to use.



The Sandpyper processing pipeline is composed of three main components:

  1. Raw data extraction
  2. Data correction
  3. Sediment dynamics analysis

To achieve this in an organised and coherent way, Sandpyper provides two core objects, the ProfileSet and the ProfileDynamics classes.
The ProfileSet class sets up the global monitoring parameters, extracts the change data and Limits of Detection, implements iterative silhouette analysis with inflexion point search and facilitates point cleaning by using user-provided watermasks, shoremasks and class dictionaries.
The ProfileDynamics class is the core processor for beach dynamics. It computes multitemporal and multiscale elevation change, performs Hotspot/Coldspot analysis at the location level, discretises the data into classes of magnitude of change, models multiscale behavioural dynamics and provides many different plotting options.
After instantiation, both classes will gradually store more and more beach change information in their attributes.
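
The idea behind cleaning with class dictionaries can be sketched in a few lines: each survey lists which cluster labels count as sand, and points carrying any other label are dropped. Everything here (the function, the key format "location_date", the field names and the label values) is a hypothetical illustration, not the ProfileSet implementation.

```python
def keep_sand_points(points, sand_dict):
    """Keep only points whose cluster label is listed as sand for their survey."""
    kept = []
    for point in points:
        # Hypothetical key format: "<location>_<date>"
        survey = f"{point['location']}_{point['raw_date']}"
        if point["label_k"] in sand_dict.get(survey, ()):
            kept.append(point)
    return kept


# Illustrative class dictionary: labels treated as sand in one survey
sand_dict = {"leo_20180713": [1, 3, 4]}

points = [
    {"location": "leo", "raw_date": 20180713, "label_k": 1, "z": 1.2},
    {"location": "leo", "raw_date": 20180713, "label_k": 2, "z": 0.1},
    {"location": "leo", "raw_date": 20180713, "label_k": 4, "z": 0.9},
]
sand_points = keep_sand_points(points, sand_dict)
print([p["label_k"] for p in sand_points])
```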

Follow the Jupyter Notebook tutorials to understand how it works!

This code supported the analysis and publication of the article "Citizen science for monitoring seasonal-scale beach erosion and behaviour with aerial drones", published in the open-access Nature journal Scientific Reports.


Explore the docs » Report Bug » Request Feature

Table of Contents

  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Publications
  9. Acknowledgements

About The Project

Background

Sandpyper was originally developed to facilitate the analysis of a large dataset from more than 300 Unoccupied Aerial Vehicle (UAV) surveys performed by Citizen Scientists in Victoria, Australia. This is the Eureka Award-winning, world-first beach monitoring program powered by volunteers, who autonomously fly UAVs on 15 sensitive sites (erosional hotspots) across the Victorian coast, every 6 weeks for 3 years, since 2018. This project is part of a broader marine mapping program called The Victorian Coastal Monitoring Program (VCMP), funded by the Victorian Department of Environment, Land, Water and Planning and co-funded by Deakin University and The University of Melbourne.

Each survey creates Digital Surface Models (DSMs) and orthophotos of considerable size (5-10 GB uncompressed), which can be troublesome for some GIS to render, let alone to use in raster-based computations.

Modules

  • sandpyper: main module where the ProfileSet and ProfileDynamics classes and methods are defined.
  • common: where all the functions are stored.

Getting Started

Sandpyper is currently tested on Windows, macOS and Ubuntu with Python 3.8 and 3.9. To get a local copy up and running, follow these steps.

Prerequisites

  • Install Conda on your local machine. We need it to create the sandpyper_env virtual environment.

  • Then, if you do not have them already, install the Visual Studio C++ build tools. You can download them here.

  • If you don't have it already, add the conda-forge channel to your Anaconda config file by typing this in your Anaconda Prompt terminal (base environment):

    conda config --add channels conda-forge
  • Now, still in the (base) environment, create a new environment called sandpyper_env using python=3.9 and install the required packages by typing:

    conda create --name sandpyper_env python=3.9 geopandas=0.8.2 matplotlib=3.3.4 numpy=1.20.1 pandas=1.2.2 tqdm=4.56.2 pysal=2.1 rasterio=1.2.0 richdem=0.3.4 scikit-image=0.18.1 scikit-learn=0.24.1 scipy=1.6.0 seaborn=0.11.1 pooch=1.4.0 fuzzywuzzy
  • If the rasterio package cannot be installed due to GDAL binding issues, follow the instructions on the rasterio installation webpage.

  • If you want to test the package using the provided notebooks, download the test data (test_data.rar) HERE

Installation

  1. conda activate sandpyper_env
  2. pip install sandpyper
  3. Install Jupyter Notebooks:
    conda install jupyter
  4. Once you open a Jupyter Notebook with the sandpyper_env environment, import sandpyper to test that it works.
    import sandpyper
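
If the import fails, it can help to check which packages the active environment can actually see. The snippet below is a small, generic check using only the standard library; the list of module names is just an example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the top-level module can be found in the active environment."""
    return importlib.util.find_spec(module_name) is not None


# Example module names; adjust the list to whatever you want to check
for name in ("sandpyper", "rasterio", "geopandas"):
    status = "found" if is_installed(name) else "MISSING"
    print(f"{name}: {status}")
```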

Usage

To see Sandpyper in action, follow the Jupyter Notebooks provided here.
Download the test data file (test_data.rar) HERE.
For the API reference, see the Documentation.

Roadmap

  1. Update CRS definition to new CRS object standard in order to upgrade Geopandas version.
  2. Relax all requirements to make it future-proof and available in Anaconda.org.
  3. Add raster support for DEMs of Difference (DoDs) and LoDs.
  4. Add shoreline analysis from space.
  5. Add [leafmap](https://github.com/giswqs/leafmap) for better (interactive) plotting.
  6. Add automatic check for overlapping label correction polygons.

See the open issues for a list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Nicolas Pucino: @NicolasPucino - [email protected]
Project Link: https://github.com/npucino/sandpyper

Publications

Acknowledgements

sandpyper's Issues

Use unsupervised classification to detect sand pixels

From working through the example notebooks, it seems the most labor intensive step is to detect which points are sand. Currently, sandpyper uses an unsupervised clustering algorithm to group pixels, but a human operator still needs to review the grouping and assign the clusters to sand. From my understanding, this would need to be done for each new dem/ortho file.

A potential improvement would be to directly detect which pixels in the DEM correspond to sand. https://github.com/kvos/CoastSat has an approach based on satellite imagery which works well. I don't think this can be directly transferred to sandpyper though, as CoastSat relies on multispectral bands (R, G, B, NIR, SWIR1) while sandpyper would only be used on aerial imagery. Nevertheless, if a classifier can be trained to detect sand solely from the aerial imagery, it might result in a less labour-intensive processing pipeline.

@npucino, I'm sure you've thought of how best to classify sand points - did you have any thoughts about the CoastSat approach? Note this is a suggestion just for future improvement and outside the scope of the JOSS review.

[Security] Workflow docs.yml is using vulnerable action s-weigand/setup-conda

The workflow docs.yml references the action s-weigand/setup-conda using the reference v1. However, this reference is missing the commit a30654e576ab9e21a25825bf7a5d5f2a9b95b202, which may contain a fix for a vulnerability.
The vulnerability fix missing from this version of the action could be related to:
(1) a CVE fix
(2) an upgrade of a vulnerable dependency
(3) a fix for a secret leak, among others.
Please consider updating the reference to the action.

[JOSS REVIEW] Cannot run P.cleanit()

Comments are for openjournals/joss-reviews#3666 (comment).

Following 2 - Profiles extraction, unsupervised sand labelling and cleaning.ipynb, I get a "MergeError: Merge keys are not unique in right dataset; not a one-to-one merge" error when running P.cleanit.

    P.cleanit(l_dicts=l_dicts,
              watermasks_path=watermasks_path,
              shoremasks_path=shoremasks_path,
              label_corrections_path=label_corrections_path)

Looks like the validation when merging the dataframe is failing, potentially because of multiple matches in the dataframe?

sandpyper/sandpyper/common.py

Lines 2268 to 2269 in ce542c6

classed_df_finetuned=to_clean_classified.merge(right=to_update_finetune.loc[:,['point_id','finetuned_label']], # Left Join
how='left', validate='one_to_one')
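
One way to test the "multiple matches" hypothesis is to look for merge keys that occur more than once in the right-hand table, since that is exactly what a one-to-one validation rejects. The sketch below does this with plain Python on made-up rows; the column names follow the snippet above, while the data and the helper name are hypothetical. On a real DataFrame, pandas' DataFrame.duplicated(subset=...) gives the same information.

```python
from collections import Counter


def duplicated_keys(rows, key):
    """Return {key_value: count} for key values that appear more than once."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


# Made-up stand-in for the right-hand table of the merge
to_update_finetune = [
    {"point_id": "a1", "finetuned_label": "sand"},
    {"point_id": "a2", "finetuned_label": "water"},
    {"point_id": "a2", "finetuned_label": "sand"},  # same point matched twice
]

dupes = duplicated_keys(to_update_finetune, "point_id")
print(dupes)
```

A non-empty result would suggest that some points fall inside more than one correction polygon, which would be consistent with the error message.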

I saved the to_clean_classified and to_update_finetune dataframes before the error is thrown:

[JOSS REVIEW] Example notebook minor suggestions

Hi @npucino,

Great work in creating these example notebooks, it was pretty easy for me to work through all three notebooks and understand how everything in the package works.

My comments below are mainly minor suggestions about providing some additional explanation for anyone using these notebooks. Comments are for openjournals/joss-reviews#3666 (comment). I've created two other issues for some trickier problems (#7, #8).

1 - Introduction and data preparation.ipynb

  1. Can you define the term "LoD" before you use it here? I had to scroll down a bit before I got the definition.

    "* __ProfileSet__ class: manages the elevation and colour data extraction and LoD analysis\n",

  2. When talking about the demo data, I think it should be made clearer that these are the inputs that sandpyper needs (and not the output that sandpyper generates). It would also be helpful to briefly describe how each input is typically created, i.e. orthos and DEMs are stitched together using photogrammetry software (Pix4D), while transect and mask files are manually created in GIS software (QGIS, ArcGIS). You've provided examples later in the notebook; it'd be good to just add a summary sentence or two here as well.

    "The __demo_data.rar__ archive includes all the data you need to get you started and understand how Sandpyper works. This include:\n",

  3. We need to change a lot of path variables, depending on where we've saved the test_data. I suggest defining the parent path first and constructing the other paths based on this. Something like:

from pathlib import Path

# Parent folder holding the test data (raw string, so backslashes are not read as escapes)
test_data_folder = Path(r'C:\Users\Chris\Desktop\sandpyper\examples\test_data')

# Child paths are relative (no leading separator), so they are appended to the parent
ortho_path = test_data_folder / "orthos_1m" / "leo_20180920_ortho_resampled_1m.tif"
watermasks_path = test_data_folder / "clean" / "watermasks.gpkg"
shoremasks_path = test_data_folder / "clean" / "shoremasks.gpkg"
label_corr_path = test_data_folder / "clean" / "label_corrections.gpkg"
transect_path = test_data_folder / "transects" / "leo_transects.gpkg"
transect_lod_path = test_data_folder / "lod_transects" / "leo_lod_transects.gpkg"
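
One detail worth checking with this approach is how pathlib treats a child segment that starts with a separator: on Windows it is taken as rooted, so it replaces the parent's folders instead of being appended to them. The snippet below shows the difference using PureWindowsPath, which applies Windows rules on any operating system; the folder names are placeholders.

```python
from pathlib import PureWindowsPath

parent = PureWindowsPath(r"C:\data\test_data")

# A child starting with a backslash is rooted: only the drive of `parent` survives
rooted = PureWindowsPath(parent, r"\clean\watermasks.gpkg")

# Relative children are appended to the parent as intended
relative = parent / "clean" / "watermasks.gpkg"

print(rooted)
print(relative)
```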

"ortho_path=r\"C:\\my_packages\\sandpyper\\examples\\test_data\\orthos_1m\\leo_20180920_ortho_resampled_1m.tif\"\n",
"watermasks_path=r\"C:\\my_packages\\sandpyper\\examples\\test_data\\clean\\watermasks.gpkg\"\n",
"shoremasks_path=r\"C:\\my_packages\\sandpyper\\examples\\test_data\\clean\\shoremasks.gpkg\"\n",
"label_corr_path=r\"C:\\my_packages\\sandpyper\\examples\\test_data\\clean\\label_corrections.gpkg\"\n",
"transect_path=r\"C:\\my_packages\\sandpyper\\examples\\test_data\\transects\\leo_transects.gpkg\"\n",
"transect_lod_path=r\"C:\\my_packages\\sandpyper\\examples\\test_data\\lod_transects\\leo_lod_transects.gpkg\"\n",

  1. Can you double check that the leo features in the label corrections file (examples/test_data/clean/label_corrections.gpkg) are in the correct location? It looks like a CRS mistake when I try to load it in QGIS. The mar features are in the correct location though.

    "<font size=\"5\"><center> <b>Label correction file</b></center></font>\n",

  2. I think an additional sentence explaining why points are being classified here would be good. Also update the reference to example notebook 2.

    "Label correction files (geopackages or shapefiles) are digitised over points which have cluster labels (assigned by KMeans algorithm) which we are not totally happy with. The attribute __target_label_k__ specifies which label k will be affected by the correction, leaving untouched all other points falling within the polygon but having different label k. This is useful to fine-tune the point classification, as it is covered in the notebook __AAAAAAA__. If you want to apply the correction to all the points, regardless of the label k, just assign 999 to this field.\n",

2 - Profiles extraction, unsupervised sand labelling and cleaning.ipynb

  1. Again, easier to use the pathlib.Path approach here to define the parent folder rather than changing each variable.

    "dirNameDSM=r'C:\\my_packages\\sandpyper\\examples\\test_data\\dsm_1m'\n",
    "dirNameOrtho=r'C:\\my_packages\\sandpyper\\examples\\test_data\\orthos_1m'\n",
    "dirNameTrans=r'C:\\my_packages\\sandpyper\\examples\\test_data\\transects'\n",

  2. Add the command to export the dataframe to csv: P.profiles.to_csv('profiles.csv')

3 - Profile dynamics.ipynb

(I couldn't link to line numbers in this file - perhaps the notebook is too big for github?)

  1. Section: "Plot: LoD normality check": The normality check plots show that the distributions fail the normality checks. Can you talk about what (if any) implications this has?

  2. Section: "Plot: multi-scale MECS and volumetrics": The plots for D.plot_transect_mecs(location='mar',tr_id=10) and D.plot_transect_mecs(location='mar', lod=D.lod_df, tr_id=10) seem fairly similar (though I see some variation in the shaded area). Could you add a couple of sentences discussing the difference between using LoD and not?

Compatibility with Ubuntu

Package tests started to fail in Ubuntu

Up until commit 74b54e0, package testing worked for Windows, Mac and Ubuntu, in Python 3.8 and 3.9.

After that, tests on Ubuntu stopped working, which might be related to changes to the Ubuntu virtual environments.
https://github.com/actions/virtual-environments/labels/Announcement

However, by relaxing the requirements in the requirements.txt file as below:

Original requirements

geopandas==0.8.2
matplotlib==3.3.4
numpy==1.20.1
pandas==1.2.2
pysal==2.1
rasterio==1.2.0
richdem==0.3.4
scikit-image==0.18.1
scikit-learn==0.24.1
scipy==1.6.0
seaborn==0.11.1
tqdm==4.56.2
pooch==1.4.0
fuzzywuzzy

Relaxed requirements

geopandas==0.8.2
matplotlib
numpy
pandas
pysal==2.1
rasterio==1.2.0
richdem
scikit-image
scikit-learn
scipy
seaborn
tqdm
pooch==1.4.0
fuzzywuzzy

tests fail on all operating systems, which makes me think it is a package-platform issue.

I need to update the code to use the latest packages and test compatibility with Ubuntu again.
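
When comparing pinned and relaxed requirements, it can be useful to print what is actually installed in each test environment. The following is a minimal standard-library sketch (Python 3.8+); it only understands plain "name" and "name==version" lines, and the helper names are hypothetical.

```python
from importlib import metadata


def parse_requirement(line):
    """Split 'name==version' into (name, version); version is None if unpinned."""
    name, sep, version = line.strip().partition("==")
    return name.strip(), (version.strip() if sep else None)


def check_requirements(lines):
    """Yield (name, pinned, installed) for each requirement line."""
    for line in lines:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        name, pinned = parse_requirement(line)
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        yield name, pinned, installed


requirements = ["geopandas==0.8.2", "numpy", "rasterio==1.2.0"]
for name, pinned, installed in check_requirements(requirements):
    print(f"{name}: pinned={pinned} installed={installed}")
```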

[JOSS REVIEW] Notebook #2 incorrect label_k values?

Comments are for openjournals/joss-reviews#3666 (comment).

Following 2 - Profiles extraction, unsupervised sand labelling and cleaning.ipynb I tried plotting the given label_k values for leo_20180606, with the profile dataframe I saved but they don't look quite right (see plot below). Can you confirm the values given in water_dict, no_sand_dict etc in the notebook are correct? Or is the purpose of the label_correction.gpkg to fix this? I'm also wondering if there is some random state that results in different label numbers if you rerun the kmeans clustering?
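
On the random-state question: cluster numbers produced by a clustering run are arbitrary, so two runs can find the same groups under different label numbers. With scikit-learn, passing a fixed random_state to KMeans is the usual way to make runs repeatable. Independently of that, labels can be made comparable by renumbering clusters according to a property of each cluster. The sketch below does this in pure Python, using a made-up per-point brightness value; it is a generic illustration and not how sandpyper assigns label_k.

```python
def canonical_labels(labels, values):
    """Renumber cluster labels by ascending mean of `values` within each cluster."""
    sums, counts = {}, {}
    for label, value in zip(labels, values):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    # Order the original labels by their cluster mean
    ordered = sorted(sums, key=lambda label: sums[label] / counts[label])
    mapping = {old: new for new, old in enumerate(ordered)}
    return [mapping[label] for label in labels]


# Two hypothetical runs that found the same groups under different numbers
run_a = [0, 0, 1, 1, 2]
run_b = [2, 2, 0, 0, 1]
brightness = [10.0, 12.0, 50.0, 52.0, 90.0]

print(canonical_labels(run_a, brightness) == canonical_labels(run_b, brightness))
```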

"In the St. Leonards survey of the 13th July 2018, the label_k 1,3 and 4 are sand, while the label_k 2 and 6 are water.\n",
"In Marengo, the 20th September 2018, no label_k represents sand while 1,2,3 and 4 are water.\n",
"\n",
"Here below are reported the label dictionaries of the demo data."

[JOSS REVIEW] Install instructions

Installation instructions should be improved - as currently written, the instructions are not an optimal way to ensure success across different platforms and for relatively novice users who don't necessarily know the pitfalls with using pip and conda together in this way ...

and, I cannot get them to work, because the pip repository (why?) is packaged with dependencies that break the conda environment. In fact, the installation is a mess for a number of reasons:

  1. the instructions don't mention what version of python to use, which is crucial. You should recommend a version that you know to work across multiple platforms. I am initially testing on Windows, so my python version will naturally be ahead of my Linux computer

  2. installing some dependencies prior to installing the conda environment is a VERY bad idea because they'll install into the base environment, so won't be accessible to the sandpyper environment, and also would very likely break the base environment

  3. You should consider providing a conda yml file

  4. Further, you are being very restrictive with these versions (that are not compatible with my conda install and OS, for example)

conda install geopandas=0.8.2 matplotlib=3.3.4 numpy=1.20.1 pandas=1.2.2 tqdm=4.56.2 pysal=2.1 rasterio=1.2.0 richdem=0.3.4 scikit-image=0.18.1 scikit-learn=0.24.1 scipy=1.6.0 seaborn=0.11.1 tqdm=4.56.2 pooch=1.4.0 fuzzywuzzy

and that is a bad way to 'future-proof' your installation (sorry, but you WILL be getting issues about this). Further, that's not a good way to use conda - conda is designed to tell YOU what versions YOU need, and the requirements.txt file would be the usual place to put really specific versions (for a pip install)

  1. There is no need to install richdem and Visual Studio tools (that might require admin privileges). Conda will install the VS build tools for you if you DON'T provide the specific version numbers. And richdem is installed by conda, so it is confusing to be told to install it separately. You can therefore remove that step.

  2. the jupyter package should be installed with the rest, because of the very complex dependencies that might break conda with a posthoc installation. Also the package is called jupyter, not jupyter notebook (that is the command you run afterwards)

  3. the pip packages sandpyper ships with versions of numpy, pandas, rasterio and matplotlib that either conflict with the conda environment, or should simply be installed with the conda environment. The pip installation also raises GDAL dependency errors. That should be in the conda environment!

Conda and pip have very different purposes. I STRONGLY recommend you make a conda environment with all of the dependencies contained within. It's not at all clear to me why the package itself has to be a pip installation, but if you insist on that, it should ship with no additional dependencies. At which point, there's no point making it a pip installation - you see my point? It's very confused.

I am therefore recommending the following approach, creating an environment with a specific python version

conda create --name sandpyper_env python=3.7

then installing ALL the dependencies inside it.

conda install geopandas matplotlib numpy pandas tqdm pysal rasterio richdem scikit-image scikit-learn scipy seaborn tqdm pooch fuzzywuzzy jupyter gdal

That approach a) was the only thing that worked for me, b) is superior for the reasons I describe above, and c) circumvents the need to install richdem and visual studio tools

Therefore the entire installation could simply be the following

conda config --add channels conda-forge

then

conda create --name sandpyper_env python=3.7

then

conda activate sandpyper_env

conda install geopandas matplotlib numpy pandas tqdm pysal rasterio richdem scikit-image scikit-learn scipy seaborn tqdm pooch fuzzywuzzy gdal

and finally

pip install sandpyper

(however, that results in GDAL dependency errors because pip/pypi is not appropriate here)

so, use pip/git

pip install git+https://github.com/npucino/sandpyper.git

that also gives GDAL errors

So, I can't install on Windows. I can't review the code until I can install.

I strongly recommend testing installation instructions on multiple machines before posting them

Minor manuscript suggestions

Hi @npucino, please see below for my comments on the manuscript. They are minor and hopefully my suggestions improve the clarity of the paper. If anything is unclear, just let me know.

Comments as part of openjournals/joss-reviews#3666 .

Introduction

  1. Line 12: "Intro": Suggest changing this to "Introduction" for a bit more formality.
  2. Line 13-14: "it is increasingly": Phrasing of the first sentence is a bit awkward. Suggest something like: "Coastal zones host 40% of the world population with continued expected growth to be focused in least developed countries".
  3. Line 19: "risk to erosion": Can add references to coastal monitoring programs, e.g. Turner16, Ruessink19, Ludka19
  4. Line 20: "Unmanned Aerial Vehicles": The README defines UAVs as "Unoccupied Aerial Vehicles". I suggest changing 'Unmanned' to 'Unoccupied' to be consistent.
  5. Line 21: "emerging as the best platform": I think "best" is a bit too definitive, as each platform has its pros and cons. Suggest changing to "emerging as an ideal platform" or something similar.
  6. Line 23: "at the mesoscale, a spatiotemporal resolution": Should this be "at the mesoscale, and at a spatial temporal resolution"?
  7. Line 25: "researchers already use UAV-SfM to monitor beach dynamics": Can you include a couple of example references?
  8. Line 26-27: "UAV-SfM technology is mature and reliable enough": I think this wording is too strong and you could say instead "More recently, UAV-SfM technology is being increasingly used for wider-scale..."
  9. Line 28: "monitoring program mobilies more than": Should this be "monitoring program has mobilized more than"
  10. Line 29-32: Suggest splitting the last sentence of the paragraph into two, the word "which" is used twice.

Statement of Need

  1. Line 39: QGIS should be all caps. Can also consider adding a citation.
  2. Line 42: "multitemporal DSMs is usally approached": Should be "multitemporal DSMs are usually approached"
  3. Line 42: "dem of difference method": Should be "DEM of difference method"
  4. Line 44: "from time to time": Should be "between time intervals"
  5. Line 44: "two pre and post raster": Remove "two" as it could be a little ambiguous - are the two pre rasters and two post rasters?
  6. Line 53: Suggest the following replacement in the last sentence of the paragraph: "....beach wracks (macroalgae, woody debris), which requires time consuming manual processing to remove or filter to avoid biasing sediment volumetric computations."
  7. Line 76: Define the BCDs acronym here, and perhaps a sentence or two saying what they are and why they could be useful? I think you already have some text in your jupyter notebooks you could use here.
