sfu-bigdata / range-driver

Acoustic telemetry detection data analysis

Home Page: https://sfu-bigdata.github.io/range-driver

Dockerfile 0.28% Python 99.67% Shell 0.04%

acoustic-telemetry data-science

range-driver's People

ocean-tracking-network softwaremonk

range-driver's Issues

Reduce Workflow Data Download Time

Look into making our GitHub Actions documentation workflow more efficient. Currently, almost the entire run time (10-45 minutes) is spent downloading data files from SFU Vault. It would be great if we could speed this up. A few potential solutions (I haven't explored these in depth yet):

Cache the data download using the actions/cache@v2 action.
We could switch the pipeline completely. Instead of running the notebooks here, we could run the notebooks on push to the range-driver-tutorials repository and upload them to a file storage location (or a separate branch??). That way, the build process would be relatively quick here. And the data would only be downloaded when the notebooks have changed, not each time the main code base changes.
Look into whether there are benefits to using Git LFS instead of SFU Vault.
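
Whichever option is chosen, the underlying idea is to skip a download when a valid local copy already exists. A minimal sketch of that check, independent of the CI system, is shown below. The names `fetch_if_missing` and `fetch`, and the idea of keeping an expected checksum per file, are assumptions for illustration only; they are not part of range-driver or SFU Vault.

```python
import hashlib
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_if_missing(target: Path, expected_sha256: str,
                     fetch: Callable[[Path], None]) -> bool:
    """Download `target` only when no valid cached copy exists.

    `fetch` is a hypothetical callable that writes the file to `target`
    (for example, a wrapper around an HTTP download).
    Returns True if a download happened, False if the cache was reused.
    """
    if target.exists() and sha256_of(target) == expected_sha256:
        return False  # cached copy is intact, nothing to download
    target.parent.mkdir(parents=True, exist_ok=True)
    fetch(target)
    if sha256_of(target) != expected_sha256:
        raise ValueError(f"checksum mismatch for {target}")
    return True
```

In a CI job, the directory holding `target` would be the path restored by the cache step, so the slow download only runs on a cache miss or when the expected checksum changes.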

Additionally, we are currently caching an Anaconda environment. We could move away from Anaconda completely (particularly since kadlu no longer requires it) or find a better way to cache the environment.

Sphinx Template

Switch over to the Meridian Read The Docs theme. When we do this, we will also need to update the template to reflect SFU Branding guidelines (check email from Michael on this).

The Meridian theme's GitLab repo has been made public.

High/Low Detection Rate Comparisons

Add plots/reports that allow users to classify detection rates into "high" and "low" rates (either manually or automatically) and then compare the differences in the environmental variables for these two conditions.

We could use a hard threshold, where all detection rates are classified into one of the two buckets. Alternatively, we could leave out detection rates that exist on the boundary between the two groups, effectively creating three buckets (high, med, low) but only comparing 2 of the buckets (high & low).
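
Both bucketing schemes can be expressed with one pair of cut-offs: setting them equal gives the hard threshold, and separating them leaves a boundary band that is dropped from the comparison. The sketch below is illustrative only; the function names, the record layout, and the `wind_speed` variable are assumptions, not existing range-driver code.

```python
from statistics import mean
from typing import Dict, List, Optional, Sequence


def classify_rate(rate: float, low_max: float, high_min: float) -> Optional[str]:
    """Label a detection rate as "low" or "high".

    Rates in [low_max, high_min) fall in the boundary band and return None.
    With low_max == high_min the band is empty (hard threshold).
    """
    if low_max > high_min:
        raise ValueError("low_max must not exceed high_min")
    if rate >= high_min:
        return "high"
    if rate < low_max:
        return "low"
    return None


def compare_conditions(records: List[dict], variables: Sequence[str],
                       low_max: float, high_min: float) -> Dict[str, Dict[str, float]]:
    """Mean of each environmental variable within the low and high groups."""
    groups: Dict[str, List[dict]] = {"low": [], "high": []}
    for record in records:
        label = classify_rate(record["detection_rate"], low_max, high_min)
        if label is not None:
            groups[label].append(record)
    return {
        label: ({var: mean(r[var] for r in rows) for var in variables} if rows else {})
        for label, rows in groups.items()
    }
```

Calling `compare_conditions(records, ["wind_speed"], 0.5, 0.5)` places every record in one of the two groups, while `compare_conditions(records, ["wind_speed"], 0.3, 0.7)` ignores rates between 0.3 and 0.7.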

Tree-Based Prediction of Detection Rates

Add a tutorial that walks through creating a tree-based prediction model for detection rates and using it to obtain feature importance.
We will need to create the tutorial in the range-driver-tutorials repository and add it to the Sphinx documentation in this repository.

I don't think this will require any additions to the code base, just the tutorials and documentation.

Grouping/Ungrouping of Detection Rates for Plotting

Currently, each of our analytics plots has a fixed grouping. For example, the detection rate curves use receiver/transmitter groups, whereas heat maps combine all detection rates before plotting.

We want to add the option for detection rates to be grouped at varying levels for these analyses. For example, users should be able to combine analysis according to distance (near vs far), power (low vs high), receiver/transmitter groups, etc.
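
One possible shape for this option is a single grouping helper that takes the grouping keys as a parameter, so the same plot code can work at any level. The sketch below is a hypothetical illustration: the function name and the record fields (`receiver`, `distance_class`, `power`) are assumptions, not the current range-driver data model.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def group_detection_rates(records: List[dict],
                          keys: Sequence[str]) -> Dict[Tuple, float]:
    """Average detection rate per group.

    `keys` names the fields to group by. An empty sequence pools every
    record into a single group with the key ().
    """
    buckets: Dict[Tuple, List[float]] = defaultdict(list)
    for record in records:
        group = tuple(record[key] for key in keys)
        buckets[group].append(record["detection_rate"])
    return {group: mean(rates) for group, rates in buckets.items()}
```

With this shape, `keys=["receiver"]` would reproduce a per-receiver view, `keys=["distance_class"]` a near vs far comparison, and `keys=[]` the fully pooled view that the heat maps use today.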

Multiple Bounds

Implement support for multiple boundary specification when working with multiple data sources. This will affect the structure of the configuration files, since we will need a way to link a data source to a specific bounds object.

An example of why we need this is our NSOG case study, where two of our data sources (era5 and nemo) exist at very different levels of granularity. Using a single boundary would result in either far too much nemo data (an unnecessary and infeasible amount to download) or far too little era5 data (not enough to interpolate).
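
One way the configuration could link sources to bounds is to define named bounds once and let each data source refer to one by name. The sketch below is only a proposal-style illustration: the `bounds` and `sources` layout, the `default` fallback, and the coordinate values are all assumptions, not the existing range-driver configuration format or the real NSOG bounds.

```python
from typing import Any, Dict

# Hypothetical configuration: named bounds, referenced per data source.
# Coordinates are placeholders, not the actual case-study bounds.
EXAMPLE_CONFIG: Dict[str, Any] = {
    "bounds": {
        "default": {"lat": (49.0, 49.5), "lon": (-124.0, -123.0)},
        "wide": {"lat": (48.0, 50.5), "lon": (-125.0, -122.0)},
    },
    "sources": {
        "nemo": {},                  # fine-grained source: uses "default"
        "era5": {"bounds": "wide"},  # coarse source: needs a larger area
    },
}


def resolve_bounds(config: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Map each data source name to the bounds object it should use.

    A source without a "bounds" entry falls back to the bounds named
    "default"; an unknown name raises KeyError.
    """
    bounds = config["bounds"]
    resolved = {}
    for name, source in config["sources"].items():
        key = source.get("bounds", "default")
        if key not in bounds:
            raise KeyError(f"source {name!r} references unknown bounds {key!r}")
        resolved[name] = bounds[key]
    return resolved
```

Keeping a `default` entry would let existing single-boundary configurations continue to work unchanged, since sources that name no bounds resolve to it.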
