alessandrogiannetti / esn-audiorecognition

This repository is for the final project of the Neural Network course (Master of Engineering in Computer Science - Sapienza University of Rome).

Python 100.00%
neural-network esn audio-recognition

ESN-AudioRecognition

Neural Network Project

Audio classification has been a growing field lately, as better and cheaper sensors become widely available. The number of applications for this technology is already large: from event monitoring to speech recognition, practical uses keep growing [1]. To continue exploring machine learning techniques for environmental sound classification, our contribution focuses on implementing a Recurrent Neural Network (RNN) [2] to classify active machinery on construction sites.

In detail, this paper describes our first approach to implementing an Echo State Network, using a large set of audio data collected on-site, to establish whether this technology can be effective in recognizing different kinds of machinery sounds.
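To make the idea of an Echo State Network concrete, here is a minimal sketch written in plain NumPy. It is not the project's code: the class name `TinyESN`, every hyperparameter value, and the choice to summarize each sequence by its mean reservoir state are assumptions made for illustration only. It shows the two defining traits of an ESN: the input and recurrent weights are random and fixed, and only a linear readout is trained, here with ridge regression.

```python
import numpy as np


class TinyESN:
    """Illustrative Echo State Network classifier (hypothetical, not the project's code)."""

    def __init__(self, n_inputs, n_reservoir=100, spectral_radius=0.9,
                 leak_rate=0.3, ridge=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random input weights; column 0 multiplies a constant bias input.
        self.w_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_inputs + 1))
        # Fixed random recurrent weights, rescaled so that the largest
        # absolute eigenvalue equals spectral_radius.
        w = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
        w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
        self.w = w
        self.leak_rate = leak_rate
        self.ridge = ridge
        self.w_out = None

    def _mean_state(self, sequence):
        """Drive the reservoir with one (time, features) array; return the mean state."""
        x = np.zeros(self.w.shape[0])
        states = []
        for u in sequence:
            pre = self.w_in @ np.concatenate(([1.0], u)) + self.w @ x
            # Leaky-integrator update of the reservoir state.
            x = (1 - self.leak_rate) * x + self.leak_rate * np.tanh(pre)
            states.append(x)
        return np.mean(states, axis=0)

    def _features(self, sequences):
        s = np.array([self._mean_state(seq) for seq in sequences])
        # Prepend a constant column so the readout has a bias term.
        return np.hstack([np.ones((len(s), 1)), s])

    def fit(self, sequences, labels, n_classes):
        s = self._features(sequences)
        y = np.eye(n_classes)[labels]  # one-hot targets
        # Ridge regression: the readout is the only trained part.
        gram = s.T @ s + self.ridge * np.eye(s.shape[1])
        self.w_out = np.linalg.solve(gram, s.T @ y)
        return self

    def predict(self, sequences):
        return np.argmax(self._features(sequences) @ self.w_out, axis=1)
```

Under these assumptions, each training example is a 2-D array of shape (time steps, features), for instance the frames of a spectrogram, and `fit` takes a list of such arrays together with integer class labels.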

Echo State Networks (ESNs) are simple to implement and are readily available in many online libraries. The aim of the project is to test whether ESNs are useful for audio classification. For this purpose we developed an application that classifies different types of construction vehicles and tools with an Echo State Network. The proposed approach splits the audio data into fragments and converts them into spectrogram representations, which are then classified into the different types of machinery. This approach showed great potential for environmental sound classification (ESC), achieving significant accuracy.
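The fragment-and-spectrogram step described above could look roughly like the following. This is a sketch under stated assumptions, not the project's preprocessing: it uses plain NumPy rather than an audio library, and the function names, the 0.5 s fragment length, and the FFT and hop sizes are all hypothetical values chosen for illustration.

```python
import numpy as np


def split_into_fragments(signal, sample_rate, fragment_seconds=0.5):
    """Cut a 1-D signal into equal, non-overlapping fragments.

    Samples left over at the end that do not fill a whole fragment are dropped.
    The default fragment length is an assumed value, not taken from the project.
    """
    size = int(sample_rate * fragment_seconds)
    n = len(signal) // size
    return signal[: n * size].reshape(n, size)


def log_spectrogram(fragment, n_fft=256, hop=128):
    """Log-magnitude short-time Fourier transform of one fragment.

    Returns an array of shape (frames, n_fft // 2 + 1). The fragment is
    assumed to contain at least n_fft samples.
    """
    window = np.hanning(n_fft)
    starts = range(0, len(fragment) - n_fft + 1, hop)
    # Slice overlapping frames and taper each one with a Hann window.
    frames = np.stack([fragment[s:s + n_fft] * window for s in starts])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # log1p compresses the dynamic range and stays finite at zero magnitude.
    return np.log1p(magnitude)
```

Each fragment then becomes a (frames, frequency bins) array, which can be fed to a sequence classifier one frame per time step.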

  • M. Lukosevicius and H. Jaeger, “Overview of reservoir recipes”, School of Engineering and Science, Jacobs University, Technical Report No. 11, 2007.
  • Danilo P. Mandic, Jonathon A. Chambers, “Recurrent Neural Networks for Prediction: Learning Algorithms, Architectures and Stability”, August 2001.
  • Mantas Lukosevicius, “A Practical Guide to Applying Echo State Networks”, Jacobs University Bremen.
  • Yong Xu, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley, “Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network”, 2018 IEEE International Conference on Acoustics.
  • Brian McFee, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, “librosa: Audio and Music Signal Analysis in Python”, Proc. of the 14th Python in Science Conf. (SCIPY 2015).
  • Alessandro Maccagno, Andrea Mastropietro, Umberto Mazziotta, Michele Scarpiniti, Yong-Cheol Lee, and Aurelio Uncini, “A CNN Approach for Audio Classification in Construction Sites”, Sapienza University of Rome, https://github.com/AndMastro/WreckingNet
  • EasyESN, https://github.com/kalekiu/easyesn
  • E. Alpaydin, Introduction to Machine Learning, MIT Press, 3rd Ed., 2014.
  • D. Ruta, B. Gabrys, Information Fusion, Elsevier, 2005.

Contributors

alessandrogiannetti, edoardobini95, marcoferraro94
