sfu-bigdata / range-driver

Acoustic telemetry detection data analysis

Home Page: https://sfu-bigdata.github.io/range-driver

Dockerfile 0.28% Python 99.67% Shell 0.04%
acoustic-telemetry data-science

range-driver's People

Contributors

git-steb, jillianderson8, sampasha

range-driver's Issues

Reduce Workflow Data Download Time

Look into making our GitHub Actions documentation workflow more efficient. Currently, almost the entire run time (10-45 minutes) is spent downloading data files from SFU Vault. It would be great if we could speed this up. A few potential solutions (I haven't explored these in depth yet):

  • Cache the data download using the actions/cache@v2 action.
  • We could switch the pipeline completely. Instead of running the notebooks here, we could run the notebooks on push to the range-driver-tutorials repository and upload them to a file storage location (or a separate branch??). That way, the build process would be relatively quick here. And the data would only be downloaded when the notebooks have changed, not each time the main code base changes.
  • Look into whether there are benefits to using Git LFS instead of SFU Vault.

Additionally, we are currently caching an anaconda environment. We could move away from anaconda completely (particularly since kadlu no longer requires it) or find a better way to cache the environment.

Sphinx Template

Switch over to the Meridian Read The Docs theme. When we do this, we will also need to update the template to reflect SFU Branding guidelines (check email from Michael on this).

The Meridian theme's GitLab repo has been made public.

High/Low Detection Rate Comparisons

Add plots/reports that allow users to classify detection rates into "high" and "low" rates (either manually or automatically) and then compare the differences in the environmental variables for these two conditions.

We could use a hard threshold, where all detection rates are classified into one of the two buckets. Alternatively, we could leave out detection rates that exist on the boundary between the two groups, effectively creating three buckets (high, med, low) but only comparing 2 of the buckets (high & low).
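
The two bucketing schemes above can be sketched in a few lines. This is only an illustration under stated assumptions: `classify_rates`, `compare_env`, the record layout, the `wind_speed` variable, and the cut-off values are all hypothetical and are not part of the range-driver code base.

```python
from statistics import mean


def classify_rates(rates, low_cut, high_cut=None):
    """Label each detection rate as 'high', 'med', or 'low'.

    If high_cut is omitted, low_cut acts as a single hard threshold and
    no rate is labelled 'med'.
    """
    if high_cut is None:
        high_cut = low_cut
    if low_cut > high_cut:
        raise ValueError("low_cut must not exceed high_cut")
    labels = []
    for rate in rates:
        if rate >= high_cut:
            labels.append("high")
        elif rate < low_cut:
            labels.append("low")
        else:
            labels.append("med")  # boundary band, left out of comparisons
    return labels


def compare_env(records, labels, variable):
    """Mean of one environmental variable for the 'high' and 'low' groups."""
    result = {}
    for group in ("high", "low"):
        values = [rec[variable] for rec, lab in zip(records, labels) if lab == group]
        result[group] = mean(values) if values else None
    return result


# Hypothetical records: one detection rate plus one environmental variable.
records = [
    {"detection_rate": 0.9, "wind_speed": 2.0},
    {"detection_rate": 0.8, "wind_speed": 4.0},
    {"detection_rate": 0.5, "wind_speed": 6.0},
    {"detection_rate": 0.2, "wind_speed": 9.0},
    {"detection_rate": 0.1, "wind_speed": 11.0},
]
rates = [rec["detection_rate"] for rec in records]

hard = classify_rates(rates, 0.5)         # two buckets
banded = classify_rates(rates, 0.3, 0.7)  # three buckets, 'med' ignored later

print(compare_env(records, hard, "wind_speed"))
print(compare_env(records, banded, "wind_speed"))
```

An automatic variant could derive the cut-offs from quantiles of the observed rates instead of taking them as arguments.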

Tree-Based Prediction of Detection Rates

Add a tutorial that walks through building a tree-based prediction model for detection rates and using it to obtain feature importances.
We will need to create the tutorial in the range-driver-tutorials repository and add it to the Sphinx documentation in this repository.

I don't think this will require any additions to the code base, just the tutorials and documentation.
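
As a rough idea of what such a tutorial might cover, here is a minimal sketch using scikit-learn's `RandomForestRegressor` on synthetic data. The choice of a random forest, the feature names, and the synthetic relationship between the variables are assumptions made for illustration; a real tutorial would use the detection rates and environmental variables produced by range-driver.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (hypothetical): detection rate depends strongly
# on wind speed, weakly on tide height, and not at all on water temperature.
rng = np.random.default_rng(0)
n = 400
wind_speed = rng.uniform(0, 15, n)
water_temp = rng.uniform(5, 15, n)
tide_height = rng.uniform(-2, 2, n)
noise = rng.normal(0, 0.02, n)
detection_rate = np.clip(1.0 - 0.06 * wind_speed + 0.02 * tide_height + noise, 0, 1)

feature_names = ["wind_speed", "water_temp", "tide_height"]
X = np.column_stack([wind_speed, water_temp, tide_height])

# Fit a tree ensemble and read off its impurity-based feature importances.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, detection_rate)

importances = dict(zip(feature_names, model.feature_importances_))
ranked = sorted(importances, key=importances.get, reverse=True)
for name in ranked:
    print(f"{name}: {importances[name]:.3f}")
```

Impurity-based importances can be misleading when features are correlated, so the tutorial may also want to show permutation importance on held-out data.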

Grouping/Ungrouping of Detection Rates for Plotting

Currently, each of our analytics plots uses a fixed grouping. For example, the detection rate curves use receiver/transmitter groups, whereas heat maps combine all detection rates before plotting.

We want to add the option for detection rates to be grouped at varying levels for these analyses. For example, users should be able to group an analysis by distance (near vs far), power (low vs high), receiver/transmitter groups, etc.
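
One way to make the grouping level configurable is to pass in a list of key functions, so the same aggregation serves every plot. The sketch below is hypothetical throughout: `group_detection_rates`, the record fields, and the 300 m near/far cut-off are illustrative assumptions, not existing range-driver names.

```python
from collections import defaultdict
from statistics import mean


def group_detection_rates(records, key_funcs):
    """Average detection rates within groups defined by key_funcs.

    Each function in key_funcs maps a record to one component of the
    group key. An empty list puts every record in a single group.
    """
    groups = defaultdict(list)
    for rec in records:
        key = tuple(func(rec) for func in key_funcs)
        groups[key].append(rec["detection_rate"])
    return {key: mean(values) for key, values in groups.items()}


def distance_bin(rec, cutoff_m=300.0):
    """Hypothetical near/far split at an assumed cut-off distance."""
    return "near" if rec["distance_m"] < cutoff_m else "far"


def power_level(rec):
    return rec["power"]


def receiver_transmitter(rec):
    return (rec["receiver"], rec["transmitter"])


# Hypothetical records.
records = [
    {"receiver": "R1", "transmitter": "T1", "distance_m": 100.0, "power": "low", "detection_rate": 0.75},
    {"receiver": "R1", "transmitter": "T2", "distance_m": 500.0, "power": "high", "detection_rate": 0.5},
    {"receiver": "R2", "transmitter": "T1", "distance_m": 150.0, "power": "high", "detection_rate": 1.0},
    {"receiver": "R2", "transmitter": "T2", "distance_m": 600.0, "power": "low", "detection_rate": 0.25},
]

by_distance = group_detection_rates(records, [distance_bin])
by_distance_power = group_detection_rates(records, [distance_bin, power_level])
by_pair = group_detection_rates(records, [receiver_transmitter])
combined = group_detection_rates(records, [])  # heat-map style: everything pooled
```

With this shape, the current fixed behaviours become two special cases: `[receiver_transmitter]` for the detection rate curves and `[]` for the heat maps.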

Multiple Bounds

Implement support for multiple boundary specification when working with multiple data sources. This will affect the structure of the configuration files, since we will need a way to link a data source to a specific bounds object.

An example of why we need this is our NSOG case study. In this case study, two of our data sources (era5 & nemo) exist at very different levels of granularity. Using a single boundary would therefore give us either far too much nemo data (an unnecessary and infeasible amount to download) or far too little era5 data (not enough to interpolate).
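
To make the configuration change concrete, here is one possible shape: bounds are declared once under their own names, and each data source refers to one of them. The layout, the key names, the `Bounds` class, and the coordinates are all hypothetical; they are not range-driver's actual configuration format or the real NSOG study extents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    """A latitude/longitude bounding box in decimal degrees."""
    south: float
    north: float
    west: float
    east: float


def resolve_bounds(config):
    """Return {source name: Bounds} by following each source's bounds reference."""
    named = {name: Bounds(**spec) for name, spec in config["bounds"].items()}
    resolved = {}
    for source, spec in config["sources"].items():
        ref = spec["bounds"]
        if ref not in named:
            raise KeyError(f"source {source!r} refers to unknown bounds {ref!r}")
        resolved[source] = named[ref]
    return resolved


# Hypothetical configuration: a wide box for the coarse source and a
# tight box for the fine-grained one.
config = {
    "bounds": {
        "wide": {"south": 48.0, "north": 50.0, "west": -125.0, "east": -122.0},
        "tight": {"south": 49.0, "north": 49.5, "west": -124.0, "east": -123.0},
    },
    "sources": {
        "era5": {"bounds": "wide"},
        "nemo": {"bounds": "tight"},
    },
}

source_bounds = resolve_bounds(config)
print(source_bounds["era5"])
print(source_bounds["nemo"])
```

Referencing bounds by name keeps a single-boundary configuration simple (every source points at the same entry) while still allowing one box per source.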
