mlecardonnel / drumsheet

A project for transcribing drum audio into drum sheets.

License: Apache License 2.0

DrumSheet

The goal of the DrumSheet project is to build a tool that produces a drum score sheet from a given drum audio recording. The tool classifies the types of percussion by applying image recognition to spectrograms of the audio.

Dataset

To create the classification models I used MDB Drums, the annotated subset of the MedleyDB dataset [1].

It consists of drum annotations and audio files for 23 tracks. For each track, the drum-only, full-mix, and original multi-track WAV files are included. Two annotation files are provided for each track. I chose to work with the first annotation file, which groups the 7994 onsets into 6 classes based on drum instrument.

The audio and annotation files are published under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Audio segmentation process

The drum audio WAV files are segmented by detecting the percussion strokes. Threshold values on the signal amplitude determine where the strokes occur. Using a defined window size, the signal segments are transformed into spectrograms. The spectrograms give an image representation of the strokes, on which we can perform image recognition. Below is the spectrogram of a kick drum stroke:

[Image: spectrogram of a kick drum (KD) stroke]
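
The thresholding step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the `min_gap` parameter, and the reading of "window size" as the length of each cut-out segment are all assumptions. The spectrogram computation itself is left out.

```python
def detect_onsets(signal, threshold, min_gap):
    """Return the sample indices where |signal| reaches the threshold.

    An onset is only accepted if at least `min_gap` samples have passed
    since the previously accepted onset, so a stroke shorter than
    `min_gap` is counted once. (`min_gap` is a hypothetical parameter.)
    """
    onsets = []
    last = -min_gap  # allows an onset at index 0
    for i, sample in enumerate(signal):
        if abs(sample) >= threshold and i - last >= min_gap:
            onsets.append(i)
            last = i
    return onsets


def cut_segments(signal, onsets, window_size):
    """Cut one fixed-length segment per onset, zero-padded at the end."""
    segments = []
    for start in onsets:
        segment = list(signal[start:start + window_size])
        segment += [0.0] * (window_size - len(segment))
        segments.append(segment)
    return segments


# Toy signal with two short bursts standing in for two strokes.
toy = [0.0] * 10 + [0.9, 0.5, 0.2] + [0.0] * 10 + [-0.8, 0.3] + [0.0] * 3
onsets = detect_onsets(toy, threshold=0.4, min_gap=5)
segments = cut_segments(toy, onsets, window_size=4)
```

Each segment would then be converted to a spectrogram before classification.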

Classification method

Because several strokes can occur at the same time on different parts of the drum kit, I chose to create one model for each of the following parts: cymbal, hi-hat, kick drum, and snare drum. Each model is a binary classifier that indicates whether or not the drum part is struck in a given spectrogram. Below are the performance results on the validation dataset:

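| Model       | Well detected percussions | Wrongly detected percussions |
|-------------|---------------------------|------------------------------|
| CY_60epochs | 69.7%                     | 2.4%                         |
| HH_ResNet   | 93.4%                     | 2.3%                         |
| KD_ResNet   | 94.8%                     | 0.5%                         |
| SD_ResNet   | 93.4%                     | 2.6%                         |
| AVERAGE     | 87.8%                     | 1.95%                        |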

I used the ResNet50V2 model for the hi-hat, kick drum, and snare drum classifiers because it performed well, with more than 93% of percussions well detected for each of them. However, I kept my simple convolutional neural network for the cymbal, because ResNet increased the percentage of wrongly detected percussions even though it also increased the percentage of well detected percussions. The cymbal results are worse because there are fewer cymbal strokes in the dataset.
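
The README does not define the two percentages exactly. One plausible reading, used in the sketch below, is an assumption: "well detected" is the share of truly struck segments that the model flags as struck, and "wrongly detected" is the share of non-struck segments that the model flags as struck. The function name `detection_rates` is hypothetical.

```python
def detection_rates(y_true, y_pred):
    """Return (well_detected_pct, wrongly_detected_pct) for one drum part.

    y_true and y_pred are sequences of 0/1 labels, one per spectrogram.
    ASSUMPTION: "well detected" = hits / struck segments,
                "wrongly detected" = false alarms / non-struck segments.
    """
    struck = sum(1 for t in y_true if t == 1)
    silent = len(y_true) - struck
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_alarms = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    well = 100.0 * hits / struck if struck else 0.0
    wrong = 100.0 * false_alarms / silent if silent else 0.0
    return well, wrong


# Toy labels: 4 struck segments (3 found), 6 non-struck (1 false alarm).
well, wrong = detection_rates(
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
)
```

The AVERAGE figures reported for the validation set match the unweighted mean of the four models: 351.3 / 4 ≈ 87.8 and 7.8 / 4 = 1.95.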

Drum audio transcription

Given a drum audio WAV file, the tool segments the signal where it detects strokes and then transforms the segments into spectrograms. Each spectrogram goes through all the models to predict which drum parts are struck. The tool then returns the times of the strokes, each with a binary encoding indicating which types of percussion are struck (several parts can be active at once). Below is an example for the 80sRock track from the MedleyDB dataset:

[Image: transcription output for the 80sRock track]
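
Combining the per-part binary classifiers into one row per stroke could look like the sketch below. Everything here is illustrative: `transcribe`, the `classifiers` mapping, and the 0.5 decision threshold are assumptions, and the classifiers are stand-in callables rather than the trained models.

```python
# Part order follows the model names used in this README.
DRUM_PARTS = ("CY", "HH", "KD", "SD")


def transcribe(onset_times, spectrograms, classifiers, threshold=0.5):
    """Return a list of (time, [cy, hh, kd, sd]) rows, one per stroke.

    `classifiers` maps each part name to a callable that takes a
    spectrogram and returns the probability that the part is struck.
    (Hypothetical interface; the real models may be called differently.)
    """
    rows = []
    for time, spec in zip(onset_times, spectrograms):
        encoding = [
            int(classifiers[part](spec) >= threshold) for part in DRUM_PARTS
        ]
        rows.append((time, encoding))
    return rows


# Stand-in "spectrograms" that already carry a probability per part,
# and stand-in classifiers that simply read that probability back.
fake_specs = [
    {"CY": 0.1, "HH": 0.9, "KD": 0.8, "SD": 0.2},
    {"CY": 0.0, "HH": 0.7, "KD": 0.1, "SD": 0.95},
]
fake_classifiers = {p: (lambda s, p=p: s[p]) for p in DRUM_PARTS}
rows = transcribe([0.0, 0.5], fake_specs, fake_classifiers)
```

Because each part has its own classifier, a row can contain several ones, for example a kick drum and a hi-hat struck together.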

Score sheet generation

I use LilyPond to generate the drum score sheets. First I create .ly files, then I convert them with the LilyPond application to get the PDF score sheets. Below is the score sheet for the 80sRock track from the MedleyDB dataset:

[Image: score sheet for the 80sRock track]
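
A minimal sketch of producing the contents of a .ly file from transcribed strokes is shown below. It assumes the strokes have already been snapped to a grid of eighth-note slots, which the README does not describe, and the helper names, the part-to-note mapping, and the `\version` string are illustrative choices rather than the project's own.

```python
# Mapping from this README's part names to LilyPond drum note names.
# ASSUMPTION: every cymbal stroke is written as a crash cymbal (cymc).
LY_NAMES = {"CY": "cymc", "HH": "hh", "KD": "bd", "SD": "sn"}
PART_ORDER = ("CY", "HH", "KD", "SD")


def slot_to_ly(parts, duration="8"):
    """Convert the set of parts struck in one slot to a LilyPond token."""
    names = [LY_NAMES[p] for p in PART_ORDER if p in parts]
    if not names:
        return "r" + duration  # nothing struck: a rest
    if len(names) == 1:
        return names[0] + duration
    return "<" + " ".join(names) + ">" + duration  # simultaneous strokes


def to_lilypond(slots):
    """Return the text of a .ly file for a list of per-slot part sets."""
    notes = " ".join(slot_to_ly(s) for s in slots)
    return (
        '\\version "2.24.0"\n'
        "\\score {\n"
        "  \\new DrumStaff {\n"
        "    \\drummode { " + notes + " }\n"
        "  }\n"
        "  \\layout { }\n"
        "}\n"
    )


# Four eighth-note slots: kick + hi-hat, hi-hat, snare + hi-hat, rest.
ly_text = to_lilypond([{"KD", "HH"}, {"HH"}, {"SD", "HH"}, set()])
```

Saving `ly_text` to a file such as `score.ly` and running the `lilypond` program on it should then yield the PDF.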

Future improvements

Improve the classification models for better performance.

References

[1] C. Southall, C. Wu, A. Lerch, J. Hockman, MDB Drums - An Annotated Subset of MedleyDB for Automatic Drum Transcription, Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR), 2017.
