This project classifies animal sounds using deep learning. It combines audio preprocessing with a fine-tuned version of the hubert-base-ls960 model to assign each clip to an animal category. Possible applications range from ecological monitoring to educational software.
The training data is a subset of the ESC-50 dataset, filtered to include only animal sound categories such as dog, cat, and rooster. Restricting the dataset this way keeps the classifier focused on animal sounds.
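The filtering step can be sketched as follows. This is a minimal illustration, not the code in data_preprocessing.py: it assumes the metadata layout of ESC-50's meta/esc50.csv (a `category` column per clip) and assumes the subset is the ten classes in ESC-50's "Animals" group. The exact categories this project keeps are not listed here, and the function names and most sample rows below are made up for the example.

```python
import csv
import io

# Assumption: the ten classes in ESC-50's "Animals" group.
# The project's actual category list may differ.
ANIMAL_CATEGORIES = {
    "dog", "rooster", "pig", "cow", "frog",
    "cat", "hen", "insects", "sheep", "crow",
}


def filter_animal_rows(csv_file):
    """Return the metadata rows whose 'category' is an animal class."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if row["category"] in ANIMAL_CATEGORIES]


def build_label_map(rows):
    """Map each remaining category name to a contiguous integer id."""
    categories = sorted({row["category"] for row in rows})
    return {name: idx for idx, name in enumerate(categories)}


# Tiny in-memory stand-in for meta/esc50.csv (illustrative rows only).
SAMPLE_META = """filename,fold,target,category,esc10,src_file,take
1-100032-A-0.wav,1,0,dog,True,100032,A
1-100038-A-14.wav,1,14,chirping_birds,False,100038,A
1-100210-A-36.wav,1,36,vacuum_cleaner,False,100210,A
1-34094-A-5.wav,1,5,cat,False,34094,A
"""

rows = filter_animal_rows(io.StringIO(SAMPLE_META))
label2id = build_label_map(rows)
print([row["filename"] for row in rows])
print(label2id)
```

Re-indexing the labels after filtering matters because the original ESC-50 `target` values span all 50 classes, while a classification head for the subset needs ids that run from 0 to the number of kept categories minus one.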
data_preprocessing.py
: Preprocesses the audio data from the ESC-50 dataset. It keeps only the animal sound categories and prepares those clips for training.
train_model.py
: Trains the classification model on the preprocessed data.
README.md
: Provides an overview of the project, installation instructions, and how to run the scripts.
requirements.txt
: Lists the Python packages required to run the project.
To set up this project, follow these steps:
- Clone the repository:
  git clone https://github.com/rawbeen248/audio_classification_finetuning
- Navigate to the project directory (its name matches the repository name in the clone URL):
  cd audio_classification_finetuning
- Install the required packages:
  pip install -r requirements.txt
First, run the data preprocessing script to prepare your dataset:
python data_preprocessing.py
Then, you can train the model by running:
python train_model.py
The fine-tuned model is available on Hugging Face and can be accessed through the following link: Animal Sound Classification
You can load this model directly from the Hugging Face Model Hub for audio classification tasks involving the animal sounds it was trained on.
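Loading the model from the Hub might look like the sketch below, which uses the `audio-classification` pipeline from the transformers library. The model id is not given in this text, so `model_id` is left as a parameter to be filled in from the link above; `classify_clip` and `top_prediction` are hypothetical helper names, and the scores in the example are hand-written rather than real model output.

```python
def classify_clip(audio_path, model_id):
    """Run an audio-classification model from the Hub on one audio file.

    model_id is the Hub repository id of the fine-tuned model; it is not
    stated in this README, so the caller has to supply it.
    """
    # Imported lazily so top_prediction() works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    # The pipeline returns a list of {"label": ..., "score": ...} dicts.
    return classifier(audio_path)


def top_prediction(predictions):
    """Return the highest-scoring entry from the pipeline's output list."""
    return max(predictions, key=lambda p: p["score"])


# Hand-written scores in the pipeline's output format (not real results).
example = [
    {"label": "dog", "score": 0.91},
    {"label": "cat", "score": 0.06},
    {"label": "rooster", "score": 0.03},
]
best = top_prediction(example)
print(best["label"])
```

With the real model id filled in, `top_prediction(classify_clip("clip.wav", model_id))` would return the most likely label for a local file, where "clip.wav" is a placeholder path. Decoding a file path this way requires ffmpeg to be available on the system.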