207_final_project

Team name:

BirdSong Sonatas

Team members:

Rachel Gao

Amina Alavi

Hamsini Sankaran

Andrew Loeber

Dataset:

https://www.kaggle.com/competitions/birdclef-2023

Objective:

Bird species classifier using audio recordings and machine learning algorithms.

Foreword:

The original BirdCLEF-2023 competition contained more than 260 bird species in the training dataset. Because of limited computing power, we selected 10 species for the baseline presentation and then narrowed the selection to 3 species for our final model. The files saved in this repo contain only the work done for the 3-species models presented in the final presentation.

The labels for the competition's test dataset are hidden, and our models cover only 3 species, so we split the provided training dataset into our own train and test sets for model building.

3 species selected: barswa, comsan, eaywag1

NOTE: ALL WORK WAS DONE IN A SHARED GOOGLE DRIVE. THE RESULTS CAN BE SEEN IN THE NOTEBOOKS, BUT THE DIRECTORY PATHS NEED TO BE CHANGED IF YOU WANT TO RERUN THEM. FILES LARGER THAN 25MB ARE NOT AVAILABLE ON GITHUB; PLEASE REACH OUT TO US IF YOU WOULD LIKE ACCESS TO THOSE LARGE FILES.
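One way to make the path change a single-line edit when rerunning a notebook is to build every path from one base directory. This is only a sketch: the environment variable name `BIRDSONG_BASE_DIR`, the default Drive location, and the helper `project_path` are hypothetical and are not defined anywhere in this repo.

```python
import os
from pathlib import Path

# Hypothetical environment variable and default location; neither is
# defined by this repo. Point this at wherever your copy of the data lives.
BASE_DIR = Path(
    os.environ.get("BIRDSONG_BASE_DIR", "/content/drive/MyDrive/207_final_project")
)


def project_path(*parts):
    """Build a path under BASE_DIR so only one line changes per machine."""
    return BASE_DIR.joinpath(*parts)


# Example: locate the metadata file inside the 0.RAW_Data directory.
metadata_path = project_path("0.RAW_Data", "train_metadata.csv")
print(metadata_path)
```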

Directories:

0.RAW_Data

All data as provided by the Kaggle competition for the 3 species selected:

  • train_metadata.csv
  • eBird_Taxonomy_v2021.csv
  • Original audio files for each species can be downloaded from Kaggle
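As a rough illustration of how the metadata can be narrowed to the 3 selected species, the sketch below filters rows by species code. It assumes the label column in train_metadata.csv is named `primary_label`; the function name and the sample rows are made up for the example, and the team's notebooks may do this differently.

```python
import csv
import io

# The three species codes listed in this README.
SELECTED_SPECIES = {"barswa", "comsan", "eaywag1"}


def filter_metadata(csv_file, label_column="primary_label"):
    """Return the metadata rows whose label is one of the selected species.

    `label_column` is an assumption about the CSV header, not something
    this README states.
    """
    reader = csv.DictReader(csv_file)
    return [row for row in reader if row[label_column] in SELECTED_SPECIES]


# Tiny in-memory stand-in for train_metadata.csv (made-up rows).
sample = io.StringIO(
    "primary_label,filename\n"
    "barswa,barswa/XC0001.ogg\n"
    "abethr1,abethr1/XC0002.ogg\n"
    "comsan,comsan/XC0003.ogg\n"
)
rows = filter_metadata(sample)
print([row["filename"] for row in rows])
```

With the real file, the same function could be called on `open("train_metadata.csv", newline="")` instead of the in-memory sample.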

1.Train_Test_Split

The training data for the 3 species was split into train (70%) and test (30%) sets. All EDA and model building used only the training set; the test set was held out for inference at the end.

  • Train_Test Dataset Construction.ipynb
    • Note: The train/test split was originally done for 10 species; we later down-selected 3 of those 10 species for our final presentation
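A 70/30 split along these lines can be sketched with the standard library alone. Splitting within each species (stratifying) is an assumption made here so that every species appears in both sets; this README does not say whether the original notebook stratified, and the function name, seed, and toy labels are illustrative only.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split item indices into train and test lists, species by species."""
    # Group the position of every recording under its species label.
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy example: 10 recordings per species.
labels = ["barswa"] * 10 + ["comsan"] * 10 + ["eaywag1"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```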

2.Preprocessing

  • np_object_extraction: extracts the train and test audio np objects returned by librosa.load() and saves them to cut downstream processing time
    • Extract_Train_Audio_Object_Librosa.ipynb
    • Extract_Test_Audio_Object_Librosa.ipynb
    • The audio np objects are too large to fit in the repo; the two notebooks can be run in Colab to extract them
  • metadata: train & test metadata for downstream tasks
  • train_preprocessing_&_EDA: preprocess training data and perform EDA
    • Prepare Train_Val 8-sec Frames.ipynb: used for ViT's training and validation data preparation
  • test_preprocessing_&_EDA: preprocess test data and perform minimal EDA
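The 8-second framing step mentioned above can be pictured as computing start and end sample indices for each frame and then slicing the loaded audio array with them. The sketch below is an assumption-heavy illustration: the 4-second hop is inferred from the "8sec_4sec_overlap" notebook name, the 32 kHz sample rate is hypothetical, and dropping a short trailing remainder is a choice made for this example rather than the notebooks' documented behavior.

```python
def frame_bounds(n_samples, sample_rate, frame_sec=8.0, hop_sec=4.0):
    """Return (start, end) sample indices of fixed-length frames.

    Only full-length frames are returned; a trailing remainder shorter
    than one frame is dropped in this sketch.
    """
    frame_len = int(frame_sec * sample_rate)
    hop_len = int(hop_sec * sample_rate)
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame_sec and hop_sec must be positive")

    bounds = []
    start = 0
    while start + frame_len <= n_samples:
        bounds.append((start, start + frame_len))
        start += hop_len
    return bounds


# A 20-second clip at a hypothetical 32 kHz sample rate.
sr = 32_000
bounds = frame_bounds(20 * sr, sr)
print(bounds)
```

Given an audio array `y` from librosa.load(), each frame would then be `y[start:end]` for a pair in `bounds`.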

3.Notebooks

Notebooks from each member of the team

4.Inference

Models used for inference & inference results

Inference results presented in final presentation:

  • CNN1D model source: 3.Notebooks/RG/5.CNN1D/5b.8sec_4sec_overlap_with_embedding_continents_CNN1D.ipynb
  • GRU-RNN model source: 3.Notebooks/HS/RNN/GRU/1.b.RNNGRU.ipynb
  • Vision Transformer source: 4.Inference/Vision_Transformer/ViT_8s_stft_inference.ipynb

5.Presentation

  • Baseline
  • Final
