mzouros / vehicle_tracker

Given an audio sample, predict how many vehicles have passed during its duration

License: MIT License

Project: Vehicle Tracker

This repo contains a single Jupyter Notebook (Python) for the Machine Learning in Multimedia Data course of the MSc Program in AI 2020-2022, organized by NCSR Demokritos and the University of Piraeus. The purpose of this study is to collect audio signals at various fixed locations near roads, extract useful information from these signals, and train several Machine Learning and Deep Learning algorithms to predict, given a new audio signal, how many vehicles passed during the signal's duration.

Instructors: Mr. Theodoros Giannakopoulos - tygiannak
                   Mr. Maglogiannis Ilias - imaglo

Notebook Sections

Imports

Data Collection

  • More than 200 audio signals were collected near various types of roads (one lane, two lanes, boulevard, highway, crossroads, etc.), but only 172 met the requirements and were used for this study
  • Recorded vehicles could be of any type (motorcycles, cars, vans, trucks, buses, etc.) and size, as long as they have an engine (e.g. bicycles are not counted)
  • The recordings were made with our smartphones' microphones, and each lasted just over 30 seconds
  • Each sample was named after its numeric label (the number of vehicles that passed by) plus an incremental index identifying the recording (e.g. 21_recording137)
  • Recording Example (original, transformed, augmented): https://github.com/mzouros/vehicle_tracker/tree/main/RecordingExample

Data Preparation

  • Data preparation was done in Audacity and included:
    • Audio trimming to exactly 30 seconds
    • Noise reduction where possible (e.g. reducing the volume of birds singing or people talking)
    • Stereo to Mono transformation
    • Conversion to .wav
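
The steps above were done by hand in Audacity. As a rough programmatic equivalent of the trimming and stereo-to-mono steps only, here is a minimal NumPy sketch. The function name `prepare_clip`, the `(n_samples, n_channels)` layout and the zero-padding of short clips are assumptions made for illustration, not part of the notebook; noise reduction and file conversion are not covered.

```python
import numpy as np


def prepare_clip(samples, sample_rate, duration_s=30.0):
    """Downmix to mono and cut (or zero-pad) to exactly duration_s seconds.

    Hypothetical helper: the authors performed these steps in Audacity.
    """
    samples = np.asarray(samples, dtype=np.float64)
    if samples.ndim == 2:
        # Assumed layout: (n_samples, n_channels). Average the channels.
        samples = samples.mean(axis=1)
    target_len = int(round(duration_s * sample_rate))
    if len(samples) >= target_len:
        # Keep only the first duration_s seconds.
        return samples[:target_len]
    # Assumption: clips shorter than the target are padded with silence.
    return np.pad(samples, (0, target_len - len(samples)))
```

With a 31-second stereo clip sampled at 100 Hz, `prepare_clip(clip, 100)` returns a one-dimensional array of 3000 samples.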

Algorithmic Approaches

  • Machine Learning:
    • SVC
  • Deep Learning:
    • CNN
    • LSTM
  • In all three approaches, we extracted the vehicle counts from the audio file names and grouped them into 8 class labels, each covering a range of approximately 5 vehicles. The labels therefore range from 1-5 vehicles (label 0) to 35+ vehicles (label 7).
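
The label scheme described above can be sketched as follows. The helper names `count_from_name` and `count_to_label` are hypothetical, and the exact bin edges are an assumption: the text only says each label covers approximately 5 vehicles, so this sketch uses 1-5, 6-10, and so on, with the last label left open-ended. The notebook's real boundaries may differ.

```python
import re


def count_from_name(file_name):
    """Return the leading vehicle count from a name like '21_recording137'."""
    match = re.match(r"(\d+)_", file_name)
    if match is None:
        raise ValueError(f"no leading count in {file_name!r}")
    return int(match.group(1))


def count_to_label(count, bin_width=5, n_labels=8):
    """Map a vehicle count to a class label (assumed bins of bin_width).

    1-5 -> 0, 6-10 -> 1, ..., and every count past the last bin edge
    falls into the final, open-ended label.
    """
    if count < 1:
        # Assumption: the labels described in the text start at 1 vehicle.
        raise ValueError("count must be at least 1")
    return min((count - 1) // bin_width, n_labels - 1)
```

For the example file name from the Data Collection section, `count_to_label(count_from_name("21_recording137"))` gives label 4 under these assumed bins.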

Feature Extraction and Data Augmentation

  • Feature extraction and augmentation differed for each of our three approaches:
    • Mel Spectrograms for the CNN model. Augmentation via masking bands of the Mel Spectrograms along both axes (mel frequency, time)
    • MFCCs for the LSTM model. Augmentation via new audio signal generation, using various sound augmentation techniques (White Noise, Time and Pitch Shifting)
    • Time (Root Mean Square Energy, Zero Crossing Rate) and Frequency (Spectral Centroid, Spectral Bandwidth) Domain Features for the SVC algorithm. Augmentation via new audio signal generation, using Pitch Shifting (up, down)
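
To make the four SVC features concrete, here is a minimal NumPy sketch that computes them once over a whole signal. The function name `basic_features` is hypothetical, and computing one value per clip (rather than per frame, then aggregating) is an assumption; the notebook's actual extraction code and framing are not shown in this README.

```python
import numpy as np


def basic_features(signal, sample_rate):
    """Return [RMS energy, zero crossing rate, spectral centroid, bandwidth].

    Hypothetical whole-signal version of the features named in the text.
    """
    signal = np.asarray(signal, dtype=np.float64)

    # Time domain: root mean square energy.
    rms = np.sqrt(np.mean(signal ** 2))

    # Time domain: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(signal)
    zcr = np.mean(signs[1:] != signs[:-1])

    # Frequency domain: magnitude spectrum and its bin frequencies in Hz.
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = magnitude.sum()
    if total == 0:
        centroid = bandwidth = 0.0
    else:
        # Centroid: magnitude-weighted mean frequency.
        centroid = np.sum(freqs * magnitude) / total
        # Bandwidth: magnitude-weighted standard deviation around the centroid.
        bandwidth = np.sqrt(np.sum((freqs - centroid) ** 2 * magnitude) / total)

    return np.array([rms, zcr, centroid, bandwidth])
```

For a pure 100 Hz sine wave the centroid lands at about 100 Hz and the bandwidth is close to zero, which is a quick sanity check before running it on real recordings.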

Original VS White Noise Added VS Time Shifted VS Pitch Shifted Soundwave

[Images: original, white-noise-added, time-shifted and pitch-shifted soundwaves]
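
Two of the waveform augmentations compared above are simple enough to sketch in plain NumPy. The function names, the default `noise_factor` and the choice of a circular shift are assumptions made for illustration, not the notebook's settings. Pitch shifting is left out because it needs resampling or a phase vocoder; audio libraries such as librosa provide it.

```python
import numpy as np


def add_white_noise(signal, noise_factor=0.005, rng=None):
    """Return the signal with scaled Gaussian white noise added."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    noise = rng.standard_normal(len(signal))
    return signal + noise_factor * noise


def time_shift(signal, shift_samples):
    """Circularly shift the signal; samples pushed past the end wrap around."""
    return np.roll(np.asarray(signal), shift_samples)
```

Each call produces a new signal that keeps the label of the recording it was derived from, which is how augmentation enlarges a small dataset.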

Mel VS Masked Mel Spectrogram

[Images: Mel spectrogram and masked Mel spectrogram]
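
Masking along both axes can be sketched as below. This is an illustration under assumptions: the function name `mask_spectrogram`, the single mask per axis, the mask widths and the fill value of 0 are all hypothetical choices, and the notebook may mask differently.

```python
import numpy as np


def mask_spectrogram(spec, freq_width, time_width, fill_value=0.0, rng=None):
    """Return a copy of spec with one frequency band and one time band masked.

    spec is assumed to have shape (n_mels, n_frames).
    """
    rng = np.random.default_rng() if rng is None else rng
    masked = np.array(spec, dtype=np.float64, copy=True)
    n_mels, n_frames = masked.shape

    # Pick random start positions so each mask fits inside the spectrogram.
    f0 = rng.integers(0, n_mels - freq_width + 1)
    t0 = rng.integers(0, n_frames - time_width + 1)

    masked[f0:f0 + freq_width, :] = fill_value  # horizontal band (mel axis)
    masked[:, t0:t0 + time_width] = fill_value  # vertical band (time axis)
    return masked
```

The input array is left untouched, so the original and the masked spectrogram can both be kept as training samples.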

Implementation and Results

  • Tried different kernels, regularization parameters (C) and kernel coefficients (gamma) for our SVC algorithm
  • Tried different architectures, model sizes (layers, nodes), batch sizes, kernel and stride sizes (CNN) and numbers of epochs for our NN models
  • Tried most of the common overfitting-avoidance techniques for our Deep Learning algorithms (kernel regularization, batch normalization, dropout, early stopping)
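
A search over kernels, C and gamma like the one described for the SVC can be sketched with scikit-learn's `GridSearchCV`. The function name `tune_svc`, the parameter values, the 3-fold cross-validation and the `StandardScaler` step are all assumptions chosen for illustration; the README does not state which values or search procedure the notebook used.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def tune_svc(features, labels, cv=3):
    """Cross-validated grid search over SVC kernel, C and gamma.

    Hypothetical helper; the grid below is illustrative only.
    """
    # Assumption: features are standardized before the SVC.
    pipeline = make_pipeline(StandardScaler(), SVC())
    param_grid = {
        "svc__kernel": ["linear", "rbf"],
        "svc__C": [0.1, 1, 10],
        "svc__gamma": ["scale", 0.01, 0.1],
    }
    search = GridSearchCV(pipeline, param_grid, cv=cv)
    search.fit(features, labels)
    return search
```

After fitting, `search.best_params_` holds the winning combination and `search.best_score_` its mean cross-validated accuracy.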

CNN Model

[Image: CNN model]

CNN Accuracy & Loss (30 epochs)

[Images: CNN accuracy and loss over 30 epochs]

LSTM Model

[Image: LSTM model]

LSTM Accuracy & Loss (35 epochs)

[Images: LSTM accuracy and loss over 35 epochs]

SVC Model Complexity

[Image: SVC model complexity]

SVC Confusion Matrix

[Image: SVC confusion matrix]

SVC Predictions

[Image: SVC predictions]

Discussion

  • The results suggest that our two NN models do not perform as well as our SVC algorithm
  • The LSTM model seems to perform much better than the CNN during training, but still struggles at prediction time
  • Our SVC algorithm seems to give the best prediction results
  • Prediction seems much harder for samples in which many vehicles pass by
  • In our experience, small and noisy datasets are better approached with traditional machine learning techniques and algorithms

Future Work

The study can be extended in a number of different ways:

  • Recordings could be made with better devices that offer noise reduction, and on more types of roads
  • The problem could be approached as a regression task
  • It could be extended from vehicle detection to vehicle detection and classification
  • A combination of acoustic and image/video data could yield far better prediction results

Authors

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgments

Contributors

mzouros, evangeliabaou
