
paloukari / orcadetector

19.0 19.0 10.0 76.12 MB

A VGGish-based DNN trained on the Watkins Marine Mammal Sound Database, with transfer learning from Audioset, to detect multiple marine mammal species.

License: MIT License

Python 0.17% Jupyter Notebook 99.83% Shell 0.01%
audioset dnn docker-container gpu tx2 vggish whoi

orcadetector's People

Contributors

mwinton, paloukari, ram-iyer


orcadetector's Issues

6/25 project intro for class

  • title
  • what does audio look like
  • mel spectrograms
  • dataset of audio samples
  • VGGish model
  • system architecture
  • real world: hydrophone / simulation: proxy w/ triggered samples

Capture noise data

Record noise data in random segments over the course of a day or two.
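One way to pick the random segments is sketched below. It only chooses non-overlapping start times inside a recording window; the actual recording step is left out. The function name `random_segments` and all of the numbers in the usage note are hypothetical, chosen for illustration.

```python
import random


def random_segments(total_seconds, n_segments, segment_seconds, seed=None):
    """Pick start times (in seconds) for non-overlapping recording segments.

    Hypothetical helper: it only schedules the segments, it does not record.
    """
    rng = random.Random(seed)
    # Time left over once every segment has been given its full length.
    slack = total_seconds - n_segments * segment_seconds
    if slack < 0:
        raise ValueError("segments do not fit in the recording window")
    # Draw sorted offsets within the slack, then shift each one by the total
    # length of the segments scheduled before it. This guarantees no overlap.
    offsets = sorted(rng.uniform(0, slack) for _ in range(n_segments))
    return [offset + i * segment_seconds for i, offset in enumerate(offsets)]
```

For example, `random_segments(24 * 3600, 20, 60, seed=0)` would schedule twenty 60-second segments across one day.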

Download the data

@paloukari @mwinton @ram-iyer
Hello,
It seems that the data are no longer available:
C:\Users\quentin.hamard>aws s3 cp s3://w251-orca-detector-data/data.tar.gz ./
fatal error: An error occurred (404) when calling the HeadObject operation: Key "data.tar.gz" does not exist
C:\Users\quentin.hamard>aws s3 cp s3://w251-orca-detector-data/vggish_weights.tar.gz ./
fatal error: An error occurred (404) when calling the HeadObject operation: Key "vggish_weights.tar.gz" does not exist
C:\Users\quentin.hamard>aws s3 cp s3://w251-orca-detector-data/orca_weights_616776.hdf5 ~/OrcaDetector/results/orca_weights_latest.hdf5
fatal error: An error occurred (404) when calling the HeadObject operation: Key "orca_weights_616776.hdf5" does not exist

I would like to reuse the CNN you trained as a feature extractor.

live feed proxy

We set up a proxy for the live stream. The TX2 connects to it to read audio input.

Later we plug in audio sample overlays.

Add a brief /vggish/README.md

Add a brief README.md file in the ./orca_detector/vggish/ directory crediting the original Google project that the code in that directory came from.

Add support for running test

We shouldn't be regularly running against our test set, but eventually will need to add support for running test with a trained model.

Need to improve performance of the Keras generators

Right now, too much computation is done real-time as the generator tries to load an individual sample (and it all appears to happen on the CPU). We probably need to pregenerate numpy arrays and save to hdf5 files which can be loaded w/o additional processing.
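A minimal sketch of the pregenerate-then-load idea follows. To keep it dependency-light it stores the features in a NumPy `.npy` file and reads it memory-mapped, as a stand-in for the hdf5 files proposed above; an h5py dataset supports the same slice-based reads. The names `build_cache`, `batch_generator`, and `extract_features` are hypothetical and not taken from the repository.

```python
import numpy as np


def build_cache(samples, extract_features, cache_path):
    """Run the expensive feature extraction once and store the result on disk.

    cache_path should end in ".npy" (np.save appends it otherwise).
    """
    features = np.stack([extract_features(sample) for sample in samples])
    np.save(cache_path, features)
    return features.shape


def batch_generator(cache_path, labels, batch_size):
    """Yield (features, labels) batches forever, as Keras generators do."""
    # Memory-map the file so only the requested slices are read from disk.
    features = np.load(cache_path, mmap_mode="r")
    while True:
        for start in range(0, len(features), batch_size):
            stop = start + batch_size
            yield np.asarray(features[start:stop]), labels[start:stop]
```

With this layout the generator does no per-sample computation at training time; it only slices precomputed arrays.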

Set up train/val/test split of our downloaded data

Unfortunately, I don't think we can ask Keras to just do its own train/val split when we use model.fit_generator(). We will have to implement our own validation generator (as the initial codebase indicates), but that involves us doing an initial split of our dataset.

I'd suggest creating a directory structure that has /data/train, /data/val, and /data/test directories at the top level, with subdirectories for the various species classes.

Probably a 70/20/10 stratified split? We can use sklearn.model_selection.train_test_split().
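The 70/20/10 stratified split could look roughly like the sketch below. It is a dependency-free version of the idea, not the repository's code; calling `sklearn.model_selection.train_test_split()` twice with its `stratify` argument would be the library-based alternative. The function name `stratified_split` and the `(path, label)` input format are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(items, fractions=(0.7, 0.2, 0.1), seed=0):
    """Split (path, label) pairs into train/val/test, per label.

    Each label is shuffled and divided on its own, so every split keeps
    roughly the same class proportions as the full dataset.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for path, label in items:
        by_label[label].append(path)

    splits = {"train": [], "val": [], "test": []}
    for label, paths in sorted(by_label.items()):
        rng.shuffle(paths)
        n_train = round(len(paths) * fractions[0])
        n_val = round(len(paths) * fractions[1])
        splits["train"] += [(p, label) for p in paths[:n_train]]
        splits["val"] += [(p, label) for p in paths[n_train:n_train + n_val]]
        # Whatever remains goes to test, so no sample is dropped.
        splits["test"] += [(p, label) for p in paths[n_train + n_val:]]
    return splits
```

The resulting lists map directly onto the proposed /data/train, /data/val, and /data/test directories, with one subdirectory per label.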

But before we do this, we need to have decided which of the species will get the "Other" label.

Decide which species to explicitly classify

We need to decide which species to explicitly classify: the N species with the most samples, where I'm thinking N is ~3-4. Then we should classify everything else as "Other" (a.k.a. random sea noises from animals we don't care about).
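As a sketch of that rule, the helper below keeps the N most frequent labels and maps everything else to "Other". The function name `relabel_top_n` is hypothetical, and ties between equally frequent species are broken by first appearance, which is how `Counter.most_common` orders them.

```python
from collections import Counter


def relabel_top_n(labels, n, other="Other"):
    """Keep the n most frequent labels; map every other label to `other`."""
    counts = Counter(labels)
    keep = {name for name, _ in counts.most_common(n)}
    return [label if label in keep else other for label in labels]
```

Running `Counter(relabel_top_n(labels, 4))` afterwards gives the per-class sample counts, which is a quick check on how imbalanced the "Other" class ends up.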

Update to support shorter (2 sec?) audio clips

Right now, everything's working with 5 sec clips, but I ran into matrix dimensionality mismatch errors in the model when trying to drop to 2 sec. But this is work that's probably worth doing, as it would give us more training examples.
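One likely source of the mismatch is that the number of spectrogram frames depends on clip length, so any layer sized for the 5 sec input no longer fits a 2 sec input. The sketch below computes the frame count under stated assumptions: the 16000 Hz sample rate from mel_params.py, plus a 25 ms window and 10 ms hop, which are the public VGGish defaults as I recall them and may differ from this repository's settings.

```python
SAMPLE_RATE = 16000  # SAMPLE_RATE value in mel_params.py


def num_frames(clip_seconds, rate=SAMPLE_RATE, window_s=0.025, hop_s=0.010):
    """Number of full STFT frames in a clip (window/hop values are assumed)."""
    n_samples = int(round(clip_seconds * rate))
    window = int(round(window_s * rate))
    hop = int(round(hop_s * rate))
    if n_samples < window:
        return 0
    return 1 + (n_samples - window) // hop


print(num_frames(5))  # 498 frames for a 5 sec clip
print(num_frames(2))  # 198 frames for a 2 sec clip
```

Under these assumptions the time axis shrinks from 498 to 198 frames, so the model's input shape (and anything flattened from it) would need to be derived from the clip length rather than hard-coded.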

Record loss and accuracy after each training epoch; generate plots

By plugging in to the Keras callback framework, we can record train/val loss and accuracy after each training epoch, and then generate plots after each training run.
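For reference, Keras already returns a History object from training whose `history` attribute is a dict mapping each metric name to a list of per-epoch values. The sketch below persists a dict of that shape to CSV so plots can be generated after the run; the function name `write_history_csv` is hypothetical, and the exact metric keys (for example `acc` versus `accuracy`) depend on the Keras version.

```python
import csv


def write_history_csv(history, path):
    """Write per-epoch metrics to CSV.

    `history` maps metric name -> list of per-epoch values, the same shape
    as the `history` attribute of a Keras History object.
    """
    names = sorted(history)
    n_epochs = len(history[names[0]])
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["epoch"] + names)
        for epoch in range(n_epochs):
            writer.writerow([epoch + 1] + [history[m][epoch] for m in names])
    return n_epochs
```

Keras also ships a built-in `CSVLogger` callback that writes a similar file during training, which may be enough if no custom processing is needed.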

NOTE: we can take this code directly from Ram's and my 266 project and plug it in here. It doesn't need new development.

Additional EDA plots

  1. Apply the resampling logic from our code to see what the audio waveforms look like after resampling. (The current value of the SAMPLE_RATE constant in mel_params.py is 16000.)

  2. Generate a mel spectrogram of the resampled audio and plot that. That way we will be able to visualize the image in the same format that the model will train on. I think that will help us pick some species to classify which show some visual distinction.
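The two steps above can be sketched with NumPy alone. This is an illustration, not the repository's resampling or VGGish feature code: the resampler is plain linear interpolation with no anti-aliasing filter, and the window, hop, mel-band count, frequency range, and log offset are assumptions modeled on the public VGGish defaults. Only the 16000 Hz sample rate comes from mel_params.py.

```python
import numpy as np

SAMPLE_RATE = 16000  # SAMPLE_RATE value in mel_params.py


def resample_linear(wave, orig_rate, target_rate=SAMPLE_RATE):
    """Crude resampling by linear interpolation (no anti-aliasing)."""
    n_out = int(round(len(wave) * target_rate / orig_rate))
    t_out = np.arange(n_out) / target_rate
    t_in = np.arange(len(wave)) / orig_rate
    return np.interp(t_out, t_in, wave)


def hz_to_mel(hz):
    return 1127.0 * np.log1p(np.asarray(hz, dtype=float) / 700.0)


def mel_filterbank(n_bins, n_mels, rate, fmin, fmax):
    """Triangular mel filters, shape (n_bins, n_mels)."""
    bin_mels = hz_to_mel(np.linspace(0.0, rate / 2.0, n_bins))
    edges = np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2)
    bank = np.zeros((n_bins, n_mels))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (bin_mels - lo) / (mid - lo)
        falling = (hi - bin_mels) / (hi - mid)
        bank[:, i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def log_mel_spectrogram(wave, rate=SAMPLE_RATE, window_s=0.025, hop_s=0.010,
                        n_mels=64, fmin=125.0, fmax=7500.0):
    """Log-mel spectrogram, shape (n_frames, n_mels). Parameters are assumed."""
    window = int(round(rate * window_s))
    hop = int(round(rate * hop_s))
    n_frames = 1 + (len(wave) - window) // hop
    # Build an index matrix so each row selects one overlapping frame.
    idx = np.arange(window)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wave[idx] * np.hanning(window)
    n_fft = 2 ** int(np.ceil(np.log2(window)))
    magnitude = np.abs(np.fft.rfft(frames, n_fft))
    bank = mel_filterbank(magnitude.shape[1], n_mels, rate, fmin, fmax)
    return np.log(magnitude @ bank + 0.01)
```

The returned array can be shown as an image, for example with matplotlib's `plt.imshow(spec.T, origin="lower", aspect="auto")`, which puts time on the horizontal axis and mel band on the vertical axis.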

Set up MLFlow logging server for recording results from experimental runs

I can set this up (I've done it several times). It makes it easy to keep track of training runs -- we can push parameters, metrics, and artifacts (e.g. trained weights, loss plots, etc...) to the logging server. It makes it much easier to keep track of experimental runs and retrieve data or assets associated with them if/when we need it later.
