linhung0319 / ismir2018-revisiting-svd

This project forked from kyungyunlee/ismir2018-revisiting-svd

Revisiting Singing Voice Detection : a Quantitative Review and the Future Outlook

This repo contains the code for the paper "Revisiting Singing Voice Detection: a Quantitative Review and the Future Outlook" by Kyungyun Lee, Keunwoo Choi, and Juhan Nam, presented at the 19th International Society for Music Information Retrieval Conference (ISMIR) 2018. [pdf, blog post]

Requirements

  • Dependencies are listed in requirements.txt.

Public Dataset

  • Jamendo, with the same labels and train/valid/test split as described on the dataset website.
  • MedleyDB
    We used the 61 songs that contain vocals, listed in medleydb_vocal_songs.txt.
    Note : MedleyDB does not provide vocal annotations, so we generated labels from the provided instrument activation annotations.
    Download the songs, set the path to them, and run python medley_voice_label.py to generate labels for the 61 songs.
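
The idea of deriving vocal labels from instrument activation annotations can be sketched as follows. This is a minimal illustration, not the logic of medley_voice_label.py: the row layout (a "time" field plus one confidence value per stem), the stem names, the 0.5 threshold, and the frame duration are all assumptions made for the example.

```python
def vocal_frame_labels(activations, vocal_stems, threshold=0.5):
    """Label each frame 1 if any vocal stem is active, else 0.

    activations: list of dicts, each with "time" and one confidence per stem
                 (assumed layout, for illustration only)
    vocal_stems: names of the stems that carry singing voice (hypothetical)
    threshold:   confidence at or above which a stem counts as active (assumed)
    """
    labels = []
    for row in activations:
        active = any(row[stem] >= threshold for stem in vocal_stems)
        labels.append((row["time"], 1 if active else 0))
    return labels


def to_segments(labels, frame_duration):
    """Merge consecutive frames with equal labels into (start, end, label)."""
    if not labels:
        return []
    segments = []
    start, current = labels[0]
    for time, label in labels[1:]:
        if label != current:
            # Close the running segment where the label changes.
            segments.append((start, time, current))
            start, current = time, label
    # The last segment ends one frame after the final time stamp.
    segments.append((start, labels[-1][0] + frame_duration, current))
    return segments


rows = [
    {"time": 0.0, "S01": 0.1, "S02": 0.0},
    {"time": 0.5, "S01": 0.9, "S02": 0.0},
    {"time": 1.0, "S01": 0.8, "S02": 0.0},
    {"time": 1.5, "S01": 0.2, "S02": 0.7},
]
frame_labels = vocal_frame_labels(rows, vocal_stems=["S01"])
segments = to_segments(frame_labels, frame_duration=0.5)
```

Here only the hypothetical stem S01 is treated as vocal, so activity in S02 does not mark a frame as singing.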

Dataset for stress testing (section 5)

To generate the stress-test datasets, run:

  • python vibrato_data_gen.py for the vibrato test in section 5.1.
  • python snr_data_gen.py for the SNR test in section 5.2. (Requires editing the path to the MedleyDB songs that contain vocals.)
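
For readers unfamiliar with SNR-controlled mixing, here is a minimal sketch of one common way to build a mix at a target SNR. It is not taken from snr_data_gen.py. It assumes the vocal track is treated as the "signal" and the accompaniment as the "noise", that both are equal-length lists of samples, and all function names are hypothetical.

```python
import math


def power(signal):
    """Mean squared amplitude of a list of samples."""
    return sum(x * x for x in signal) / len(signal)


def measured_snr_db(vocal, accompaniment):
    """Vocal-to-accompaniment power ratio in decibels."""
    return 10 * math.log10(power(vocal) / power(accompaniment))


def vocal_gain_for_snr(vocal, accompaniment, target_snr_db):
    """Gain to apply to the vocal so the mix reaches target_snr_db."""
    p_vocal, p_acc = power(vocal), power(accompaniment)
    if p_vocal == 0 or p_acc == 0:
        raise ValueError("both tracks must contain non-silent audio")
    # Solve 10*log10(gain^2 * p_vocal / p_acc) = target_snr_db for gain.
    return math.sqrt(p_acc / p_vocal * 10 ** (target_snr_db / 10))


def mix_at_snr(vocal, accompaniment, target_snr_db):
    """Scale the vocal to the target SNR and add the accompaniment."""
    gain = vocal_gain_for_snr(vocal, accompaniment, target_snr_db)
    return [gain * v + a for v, a in zip(vocal, accompaniment)]


vocal = [0.5, -0.5, 0.5, -0.5]
accompaniment = [0.1, 0.1, -0.1, -0.1]
gain = vocal_gain_for_snr(vocal, accompaniment, target_snr_db=-6)
scaled_vocal = [gain * v for v in vocal]
mix = mix_at_snr(vocal, accompaniment, target_snr_db=-6)
```

Only the vocal is rescaled, so the accompaniment level stays fixed across SNR conditions. A real script would also need to guard against clipping after mixing.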

Reproduction of singing voice detection models (section 3)

There are 3 reproduced models in the following folders :

  • lehner_randomforest [1]
  • schluter_cnn [2]
  • leglaive_lstm [3]
    Note : Set the dataset paths in the config file within each model folder.

Command-line arguments are :

  • --model_name : the name you give the model at training time; the trained model is saved under this name in the ./weights/ folder.
  • --dataset : one of {"jamendo", "vibrato", "snr"}. A new dataset can be added by modifying load_data.py (RWC pop might be added).
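
As an illustration of how these two arguments could be parsed, here is a small argparse sketch. It is an assumption about the interface and is not copied from the repo's scripts: the help strings, the required flag, and the "jamendo" default are all made up for the example.

```python
import argparse

# Dataset names listed in this README.
DATASETS = ("jamendo", "vibrato", "snr")


def build_parser():
    """Build a parser for the two arguments described above."""
    parser = argparse.ArgumentParser(
        description="Singing voice detection (illustrative sketch)")
    parser.add_argument(
        "--model_name", type=str, required=True,
        help="name used when saving to or loading from ./weights/")
    parser.add_argument(
        "--dataset", choices=DATASETS, default="jamendo",  # default is assumed
        help="dataset to preprocess or evaluate on")
    return parser


args = build_parser().parse_args(
    ["--model_name", "mynewmodel", "--dataset", "snr"])
```

With choices set, argparse rejects an unknown dataset name with a usage error instead of failing later while loading data.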

In each model folder, run the audio preprocessing script before training or testing a model:

  • python audio_processor.py --dataset "jamendo" in the CNN and RNN model folders, with --dataset one of {"jamendo", "vibrato", "snr"}
  • python vocal_var.py --dataset "jamendo" in the randomforest model folder, with --dataset one of {"jamendo", "vibrato", "snr"}
    Note : For the random forest, vocal_var.py computes the vocal variance and concatenates it with the features extracted by the MATLAB code provided by the authors of [1]. The file itself only provides functions for computing the vocal variance, so you can either extend it to compute the other features or obtain the MATLAB code ;)
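
To make the "compute a variance feature, then concatenate" step concrete, here is a rough sketch. It assumes per-frame feature vectors (for example MFCCs) are already available, and that the variance is taken per coefficient over a window of neighboring frames. The window length, the function names, and the exact definition are placeholders; the implementation in vocal_var.py and the definition in [1] may differ.

```python
def sliding_variance(frames, window=5):
    """Per-coefficient variance over a centered window of frames.

    frames: list of equal-length feature vectors, one per frame
    window: number of frames in the window (assumed value)
    """
    half = window // 2
    n_frames = len(frames)
    result = []
    for i in range(n_frames):
        # Windows are truncated at the start and end of the track.
        chunk = frames[max(0, i - half):min(n_frames, i + half + 1)]
        row = []
        for d in range(len(frames[0])):
            values = [frame[d] for frame in chunk]
            mean = sum(values) / len(values)
            row.append(sum((v - mean) ** 2 for v in values) / len(values))
        result.append(row)
    return result


def concat_features(first, second):
    """Concatenate two per-frame feature lists frame by frame."""
    return [a + b for a, b in zip(first, second)]


mfcc_like = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]   # toy input, not real MFCCs
other_features = [[9.0], [9.0], [9.0]]             # stand-in for MATLAB output
variance = sliding_variance(mfcc_like, window=3)
combined = concat_features(variance, other_features)
```

Each combined row holds the variance values followed by the externally computed features, which is the shape a frame-wise classifier would take as input.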

To train a model, run the following in its model folder:

  • python main.py --model_name "mynewmodel"

To test the pretrained models (provided in the ./weights/ folder), run the following in each model folder:

  • python test.py --model_name "mynewmodel" --dataset "jamendo"

References

  • [1] Bernhard Lehner, Gerhard Widmer, and Reinhard Sonnleitner. "On the reduction of false positives in singing voice detection." pdf
  • [2] Jan Schlueter and Thomas Grill. "Exploring data augmentation for improved singing voice detection with neural networks." pdf
  • [3] Simon Leglaive, Romain Hennequin, and Roland Badeau. "Singing voice detection with deep recurrent neural network." pdf

TO DO (2018.06)

  • Upload the notebook file for model analysis and Audacity-compatible label generation.
