sound-cnn

A convolutional neural network that classifies sounds. It works by converting raw audio into spectrograms. Each spectrogram is a "picture" of the sound, which the CNN learns to classify in the same way that conventional image-recognition models classify images.
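To make the "picture of the sound" idea concrete, here is a minimal sketch of turning raw samples into a magnitude spectrogram with a short-time Fourier transform. It is not the code this project uses: the function names, the frame size of 256, the hop of 128, and the Hann window are all assumptions chosen for illustration, and it uses only the standard library so it runs without extra dependencies (a real pipeline would typically use an FFT routine instead of this direct DFT).

```python
import cmath
import math


def hann_window(size):
    """Return a symmetric Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1)) for n in range(size)]


def spectrogram(samples, frame_size=256, hop=128):
    """Magnitude spectrogram: one row per time frame, one column per frequency bin."""
    window = hann_window(frame_size)
    n_bins = frame_size // 2 + 1  # keep non-negative frequencies only
    # Precompute the complex exponentials used by the DFT.
    twiddle = [cmath.exp(-2j * math.pi * k / frame_size) for k in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        row = []
        for k in range(n_bins):
            acc = sum(frame[n] * twiddle[(k * n) % frame_size] for n in range(frame_size))
            row.append(abs(acc))
        frames.append(row)
    return frames


# Synthetic input (an assumption for the demo): 0.25 s of a 1000 Hz sine at 8000 Hz.
sample_rate = 8000
frame_size = 256
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(2000)]
spec = spectrogram(tone, frame_size=frame_size, hop=128)

# The brightest column of the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / frame_size
print(len(spec), len(spec[0]), peak_hz)
```

Each row of `spec` is one time slice and each column one frequency band, so the nested list can be treated as a grayscale image and fed to an image-style classifier.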

For more info on how this was accomplished, and how it compares to other methods, read the following Medium post.

Setup

$ pip install -r requirements.txt

Training

The model can be trained with the following arguments:

$ python train.py 'bpm' 'sampling rate' 'audio path' 'iterations' 'batch size'

bpm depends on the sound files being classified.

sampling rate is most often set to 44100.

audio path is the directory where the audio files are located. The program treats each file in the directory as a separate sound class. For example, if the directory contains two files, file1.wav and file2.wav, the CNN will learn to distinguish two classes.

iterations should scale with the difficulty of the classification task. A value between 1000 and 5000 may be ideal for most situations.

batch size is most often set to a value between 100 and 200.
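The one-file-per-class convention for audio path can be sketched as follows. This is only an illustration of the idea, not the logic in train.py: the helper name `discover_classes`, the alphabetical ordering of labels, and the filter on the .wav extension are assumptions.

```python
import tempfile
from pathlib import Path


def discover_classes(audio_dir):
    """Map each .wav file in audio_dir to an integer class label.

    Files are sorted by name so the labelling is reproducible (an assumption;
    the real script may order files differently).
    """
    files = sorted(p for p in Path(audio_dir).iterdir() if p.suffix.lower() == ".wav")
    return {p.name: label for label, p in enumerate(files)}


# Usage with a throwaway directory standing in for 'audio path'.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("file2.wav", "file1.wav", "notes.txt"):
        (Path(tmp) / name).touch()
    labels = discover_classes(tmp)

print(labels)  # {'file1.wav': 0, 'file2.wav': 1}
```

Two audio files yield two classes; the non-audio file is ignored under the extension filter assumed here.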

Test

This tests the trained model with the following arguments:

$ python test.py 'bpm' 'sampling rate' 'audio path'

This requires the same number of class files as were used in training. For example, if you trained on two classes, class1.wav and class2.wav, the test directory must also contain two class files.
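A quick pre-flight check for this requirement might look like the sketch below. It is not part of the project; the helper names `count_wav_files` and `check_same_class_count` are hypothetical, and counting only .wav files is an assumption.

```python
import tempfile
from pathlib import Path


def count_wav_files(audio_dir):
    """Count the .wav files directly inside audio_dir."""
    return sum(1 for p in Path(audio_dir).iterdir() if p.suffix.lower() == ".wav")


def check_same_class_count(train_dir, test_dir):
    """Return the shared class count, or raise if the two directories differ."""
    n_train = count_wav_files(train_dir)
    n_test = count_wav_files(test_dir)
    if n_train != n_test:
        raise ValueError(
            f"training has {n_train} class files but test has {n_test}"
        )
    return n_train


# Usage with throwaway directories standing in for audio/train/ and audio/test/.
with tempfile.TemporaryDirectory() as tmp:
    train_dir, test_dir = Path(tmp) / "train", Path(tmp) / "test"
    for directory in (train_dir, test_dir):
        directory.mkdir()
        for name in ("class1.wav", "class2.wav"):
            (directory / name).touch()
    n_classes = check_same_class_count(train_dir, test_dir)

    # Removing one test file should trigger the mismatch error.
    (test_dir / "class2.wav").unlink()
    try:
        check_same_class_count(train_dir, test_dir)
        mismatch_detected = False
    except ValueError:
        mismatch_detected = True
```

Running a check like this before test.py surfaces a mismatched directory early instead of partway through evaluation.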

Prediction

This predicts labels for an unlabeled dataset using the trained model, with the following arguments:

$ python test.py 'bpm' 'sampling rate' 'audio path' 'the number of classes'

the number of classes must match the number of classes used in training.

This performs prediction only, so no labels are required and the files do not need to be split by class.

Example

audio/train/
  class1.wav
  class2.wav
audio/test/
  class1.wav
  class2.wav
audio/prediction/
  class1-class2.wav

$ python train.py 240 44100 audio/train/ 1000 150

$ python test.py 240 44100 audio/test/

$ python test.py 240 44100 audio/prediction/ 2

Contributors

awjuliani, ghsdh3409
