omarun / aer-cnn-keras

This project forked from shahnawax/aer-cnn-keras

Convolutional Neural Network based implementation of Audio Event Recognition in KERAS

AER-CNN-KERAS

This project is a proof-of-concept implementation of Audio Event Recognition (AER) with a Convolutional Neural Network (CNN) in KERAS. Keep in mind that this is the author's first version and, as it stands, not the best model available for this purpose. Work on the model is in progress, and refinements will be pushed to the repository.

Tools Required

Python 3.6 was used during development, and the following libraries are required to run the code provided in the notebook:

  • keras 2.x.
  • numpy
  • librosa

Database used

The ESC-50 dataset is a public labeled set of 2000 environmental recordings (50 classes, 40 clips per class, approximately 5 seconds per clip) suitable for environmental sound classification tasks.

See ESC: Dataset for Environmental Sound Classification - paper replication data for the full paper with a more thorough analysis.

The available sound classes, arranged alphabetically, are given below:

Experiments

Preprocessing

First, we renamed the files in each class directory to the numbers 1 to 40. We then read every file and computed its dB-scale mel spectrogram with n_mels = 128; all other parameters of librosa.feature.melspectrogram are left at their defaults. Because the files differ in length, we keep 300 frames per file so that the preprocessed data has the same size for every file.
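The preprocessing step can be sketched as follows. This is a minimal illustration, not the contents of preprocess_data.py: the helper names fix_frames and log_mel are hypothetical, the default sample rate of librosa.load is assumed, and padding short clips with the spectrogram's minimum value is an assumption, since the text only says that 300 frames are kept.

```python
import numpy as np

N_MELS = 128    # number of mel bands, as stated in the text
N_FRAMES = 300  # fixed number of time frames, as stated in the text


def fix_frames(spec, n_frames=N_FRAMES):
    """Crop or pad a (n_mels, t) array along time to exactly n_frames columns."""
    t = spec.shape[1]
    if t >= n_frames:
        return spec[:, :n_frames]
    # Assumption: pad short clips with the quietest value in the spectrogram.
    return np.pad(spec, ((0, 0), (0, n_frames - t)),
                  mode="constant", constant_values=spec.min())


def log_mel(path):
    """Load one audio file and return its dB-scale mel spectrogram, shape (128, 300)."""
    import librosa  # imported here so fix_frames can be used without librosa

    y, sr = librosa.load(path)  # assumption: librosa's default sample rate
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=N_MELS)
    return fix_frames(librosa.power_to_db(mel))
```

Calling log_mel on each renamed file (for example a hypothetical path such as Samples/Dog/1.wav) would give one equally sized array per clip, which can then be stacked into the network input.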

Experimentation and data segmentation

We trained the model on all 50 classes. The full dataset is shuffled to mix the classes and break up any ordering patterns, then divided into two subsets: 80% for training and 20% for testing. The training portion is further divided at random into approximately 80% for training and the rest for validation. This leaves 400 instances for testing (8 files per class), approximately 1280 instances for training, and 320 instances for validation.
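A minimal sketch of such a split, working on indices rather than on the data itself. The function name split_indices and the fixed seed are assumptions for illustration and are not taken from train_network.py. Note that a plain shuffle like this one yields 8 test files per class only on average; an exact per-class count would need a stratified split.

```python
import numpy as np


def split_indices(n_items=2000, test_frac=0.2, val_frac=0.2, seed=0):
    """Shuffle item indices, then split them into train, validation and test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_items)

    # First cut: hold out the test set from the shuffled data.
    n_test = int(round(n_items * test_frac))
    test, rest = order[:n_test], order[n_test:]

    # Second cut: take the validation set out of the remaining training data.
    n_val = int(round(len(rest) * val_frac))
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))  # 1280 320 400
```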

Results

We tested the model on all the classes and obtained an overall average accuracy of 52%. The model performs differently on different classes, so we categorized the classes into three groups: Very Good Performance (accuracy above 75%), Medium Performance (accuracy between 60% and 75%, inclusive) and Bad Performance (accuracy of 50% or below). The per-class accuracy is reported below.

Very Good Performance

  • Siren : 100.0 %
  • DoorKnock : 100.0 %
  • Clapping : 100.0 %
  • Helicopter : 100.0 %
  • Rain : 87.5 %
  • Rooster : 87.5 %
  • ClockAlarm : 87.5 %
  • CanOpening : 87.5 %
  • PouringWater : 87.5 %
  • HandSaw : 87.5 %

On these classes the accuracy of the model is 92.5% on average.

Medium Performance

  • VacuumCleaner : 75.0 %
  • Dog : 75.0 %
  • Train : 62.5 %
  • CarHorn : 62.5 %
  • Crow : 62.5 %
  • Engine : 62.5 %
  • BrushingTeeth : 62.5 %
  • Frog : 62.5 %
  • Cow : 62.5 %
  • KeyboardTyping : 62.5 %
  • Insects : 62.5 %
  • SeaWaves : 62.5 %
  • ChurchBells : 62.5 %
  • Sheep : 62.5 %

Average performance for these classes is 64.29%.

Bad Performance

  • Crickets : 50.0 %
  • GlassBreaking : 50.0 %
  • Coughing : 50.0 %
  • Pig : 50.0 %
  • Thunderstorm : 50.0 %
  • CracklingFire : 50.0 %
  • ToiletFlush : 50.0 %
  • WaterDrops : 37.5 %
  • Crying baby : 37.5 %
  • Fireworks : 37.5 %
  • Hen : 37.5 %
  • Cat : 37.5 %
  • DrinkingSipping : 37.5 %
  • Laughing : 25.0 %
  • Chainsaw : 25.0 %
  • Breathing : 25.0 %
  • Sneezing : 25.0 %
  • WashingMachine : 25.0 %
  • Snoring : 12.5 %
  • ClockTick : 12.5 %
  • DoorWoodCreaks : 12.5 %
  • ChirpingBirds : 12.5 %
  • MouseClick : 12.5 %
  • Footsteps : 12.5 %
  • Wind : 0.0 %
  • Airplane : 0.0 %

Average performance for these classes is 29.81%.
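The group averages and the overall accuracy can be checked from the per-class figures listed above. The sketch below only redoes that arithmetic; the class counts per accuracy value are tallied from the three lists, and the simple mean over classes equals the overall accuracy because every class contributes the same number of test files (8).

```python
# accuracy (%) -> number of classes with that accuracy, tallied from the lists
groups = {
    "very_good": {100.0: 4, 87.5: 6},
    "medium": {75.0: 2, 62.5: 12},
    "bad": {50.0: 7, 37.5: 6, 25.0: 5, 12.5: 6, 0.0: 2},
}


def mean_accuracy(counts):
    """Mean accuracy over classes, given {accuracy: number_of_classes}."""
    total = sum(acc * n for acc, n in counts.items())
    return total / sum(counts.values())


group_means = {name: mean_accuracy(c) for name, c in groups.items()}

# Merge the three groups to get the mean over all 50 classes.
overall_counts = {}
for counts in groups.values():
    for acc, n in counts.items():
        overall_counts[acc] = overall_counts.get(acc, 0) + n
overall = mean_accuracy(overall_counts)

for name, value in group_means.items():
    print(f"{name}: {value:.2f}%")
print(f"overall: {overall:.2f}%")
```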

Running instructions

Follow these steps to use the code.

  • Download the dataset and unzip it into the Samples directory.
  • Keep only the 50 subdirectories for the different events and delete all other files in Samples.
  • Run rename.py to rename the files in each subdirectory to 1.wav through 40.wav.
  • Run preprocess_data.py to preprocess the data. This generates the files and directories in the Preproc subdirectory.
  • Finally, run train_network.py. It loads the preprocessed data from the PreProc directory and creates the X_train, Y_train, X_validation, Y_validation, X_test and Y_test variables. It then trains the network and saves X_test and Y_test, the trained model model.h5 and the class labels Class_names.npy.
  • Run evaluate_network.py to evaluate the pretrained network and print the performance for each class.
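The per-class figures printed in the last step can be computed along the following lines. This is a sketch under assumptions, not the contents of evaluate_network.py: the function per_class_accuracy is hypothetical, and it expects integer class indices (for example the argmax of the model output and of one-hot labels such as Y_test).

```python
import numpy as np


def per_class_accuracy(y_true, y_pred, class_names):
    """Return {class name: accuracy in percent} for every class present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    result = {}
    for idx, name in enumerate(class_names):
        mask = y_true == idx  # test items whose true label is this class
        if mask.any():
            result[name] = 100.0 * float(np.mean(y_pred[mask] == idx))
    return result


# Toy example with two classes and four test items.
toy = per_class_accuracy([0, 0, 1, 1], [0, 1, 1, 1], ["Dog", "Rain"])
for name, acc in toy.items():
    print(f"{name} : {acc:.1f} %")
```

With 8 test files per class, each accuracy is a multiple of 12.5%, which matches the values in the result lists.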
