This repository accompanies Jonathan Valyou's honors thesis, "Nonnegative Matrix Factorization for Music - Tuning the NMF Algorithm with Regularization", which is available in the Emory University Honors Thesis Archives: https://etd.library.emory.edu/concern/etds/np193b40x?locale=en. Much of the code was adapted from the NMF Toolbox repository: https://www.audiolabs-erlangen.de/resources/MIR/NMFtoolbox/#Python. The repository explores how Nonnegative Matrix Factorization (NMF) can be used for source separation in music, and it introduces Regularized NMF code to help denoise audio and promote sparsity.
git clone https://github.com/B3jonathanv/NMF_Music_Git.git
If you use this repository or any information from the thesis "Nonnegative Matrix Factorization for Music - Tuning the NMF Algorithm with Regularization", please include the following citation:
Valyou, Jonathan. Nonnegative Matrix Factorization for Music - Tuning the NMF Algorithm with Regularization. Emory Theses and Dissertations Repository. 3 May 2022.
The spectrogram X is in the bottom right corner of the diagram, with axes of time (seconds) and frequency (Hz). The frequency data for each source, W, is shown in the bottom left of the diagram, and the temporal data for each source, H, is shown in the top right. This is the source separation performed on the eight notes of the C major scale to demonstrate how NMF can separate note pitches played on the same instrument. The NMFD algorithm was run with 8 sources for 8 distinct pitches, 300 iterations, 8 template frames, randomly initialized W, and uniformly initialized H. Below is a convergence analysis of this source separation example.
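The role of the template frames parameter can be illustrated with the convolutive model that NMFD uses to approximate the spectrogram: each source has a short sequence of spectral frames, and the approximation sums the products of each template frame with a time-shifted copy of H. The following is a minimal sketch of that forward model only, not the repository's code. The array layout (frequency bins × sources × template frames), the function names, and the spectrogram size are assumptions made for illustration.

```python
import numpy as np

def shift_right(H, t):
    """Shift the columns of H right by t frames, zero-filling on the left."""
    if t == 0:
        return H.copy()
    shifted = np.zeros_like(H)
    shifted[:, t:] = H[:, :-t]
    return shifted

def nmfd_model(W, H):
    """Convolutive approximation: sum over t of W[:, :, t] @ shift_right(H, t)."""
    num_bins, _, num_template_frames = W.shape
    approx = np.zeros((num_bins, H.shape[1]))
    for t in range(num_template_frames):
        approx += W[:, :, t] @ shift_right(H, t)
    return approx

rng = np.random.default_rng(0)
num_bins, num_frames = 64, 100            # hypothetical spectrogram size
num_sources, num_template_frames = 8, 8   # values from the example above

W = rng.random((num_bins, num_sources, num_template_frames))  # randomly initialized W
H = np.ones((num_sources, num_frames))                        # uniformly initialized H
X_approx = nmfd_model(W, H)
```

With a single template frame the model reduces to the ordinary NMF product of W and H, which is why NMFD can be read as a generalization of NMF.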
This is the source separation performed on an audio recording of three percussion instruments to demonstrate how NMF can separate instruments with distinct frequency ranges. The NMFD algorithm was run with 3 sources for 3 distinct instruments, 30 iterations, 8 template frames, randomly initialized W, and uniformly initialized H. Each color represents one of the three percussion instruments: red represents the kick drum, green represents the snare drum, and blue represents the ride cymbal.
This is the source separation performed on a recording of Ein feste Burg ist unser Gott to demonstrate how NMF handles source separation for more complex, polyphonic musical arrangements. The NMFD algorithm was run with 8 sources for 8 distinct pitches, 200 iterations, 8 template frames, randomly initialized W, and uniformly initialized H.
This is the source separation performed on a short recording of a person coughing over a sustained symphony note to demonstrate how NMF can separate distinct non-uniform, non-Gaussian noise from an audio recording. The NMF algorithm with no regularization parameter was run with 2 sources (the music and the noise), 200 iterations, randomly initialized W, and uniformly initialized H. Blue corresponds to the coughing noise and red corresponds to the orchestra.
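The unregularized setup described above (2 sources, 200 iterations, random W, uniform H) can be sketched with the standard multiplicative updates for a squared-error cost. This is a minimal illustration rather than the repository's implementation: the choice of cost function, the function name `nmf`, and the synthetic input standing in for a magnitude spectrogram are all assumptions.

```python
import numpy as np

def nmf(X, num_sources, num_iter=200, eps=1e-12, seed=0):
    """Plain NMF via multiplicative updates for the cost ||X - W @ H||_F^2."""
    rng = np.random.default_rng(seed)
    num_bins, num_frames = X.shape
    W = rng.random((num_bins, num_sources)) + eps   # randomly initialized W
    H = np.ones((num_sources, num_frames))          # uniformly initialized H
    cost = []
    for _ in range(num_iter):
        # Multiplicative updates keep W and H nonnegative at every step.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        cost.append(np.linalg.norm(X - W @ H) ** 2)
    return W, H, cost

# Hypothetical stand-in for a magnitude spectrogram with two underlying sources.
rng = np.random.default_rng(1)
X = rng.random((64, 2)) @ rng.random((2, 120))

W, H, cost = nmf(X, num_sources=2, num_iter=200)
```

The returned `cost` list holds the reconstruction error after each iteration, so plotting it gives a simple convergence curve for a run.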
The example below demonstrates how Regularized NMF can aid source separation of an audio file that has been perturbed by random Gaussian noise.
run Reg_Script.py
The NMF algorithm with no regularization parameter was utilized with input parameters of 8 sources for 8 distinct pitches, 50 iterations, randomly initialized W, and uniformly initialized H.
The Regularized NMF algorithm with regularization term γ∥H∥_1 was run with a regularization parameter of γ = 5×10^−6, 8 sources, 50 iterations, randomly initialized W, and uniformly initialized H.
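One common way to add a penalty of the form γ∥H∥_1 to NMF is to place γ in the denominator of the multiplicative update for H, which shrinks the activations and promotes sparsity. The sketch below assumes the cost 0.5·∥X − WH∥_F² + γ∥H∥_1. It is not taken from Reg_Script.py, and the function name `nmf_l1`, the synthetic data, and the noise level are hypothetical.

```python
import numpy as np

def nmf_l1(X, num_sources, gamma, num_iter=50, eps=1e-12, seed=0):
    """NMF with an L1 penalty on H, for 0.5 * ||X - W @ H||_F^2 + gamma * ||H||_1."""
    rng = np.random.default_rng(seed)
    num_bins, num_frames = X.shape
    W = rng.random((num_bins, num_sources)) + eps   # randomly initialized W
    H = np.ones((num_sources, num_frames))          # uniformly initialized H
    for _ in range(num_iter):
        # The penalty's gradient (gamma) joins the denominator of the H update.
        H *= (W.T @ X) / (W.T @ W @ H + gamma + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical noisy input: a rank-8 nonnegative matrix plus Gaussian noise,
# clipped at zero so it remains a valid nonnegative "spectrogram".
rng = np.random.default_rng(2)
X_clean = rng.random((64, 8)) @ rng.random((8, 120))
X_noisy = np.maximum(X_clean + 0.05 * rng.standard_normal(X_clean.shape), 0.0)

W_plain, H_plain = nmf_l1(X_noisy, num_sources=8, gamma=0.0, num_iter=50)
W_reg, H_reg = nmf_l1(X_noisy, num_sources=8, gamma=5e-6, num_iter=50)
```

Setting `gamma=0.0` recovers the unregularized updates, which makes side-by-side comparisons straightforward. Because W and H can trade scale against each other, implementations often normalize the columns of W so that the penalty on H cannot be sidestepped by rescaling; this sketch omits that step for brevity.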
Patricio Lopez-Serrano, Christian Dittmar, Yigitcan Ozer, and Meinard Muller. NMF Toolbox, 2019.
Patricio Lopez-Serrano, Christian Dittmar, Yigitcan Ozer, and Meinard Muller. NMF Toolbox: Music processing applications of nonnegative matrix factorization. In Proceedings of the International Conference on Digital Audio Effects (DAFx), Birmingham, UK, September 2019.
Paris Smaragdis and J. Brown. Non-negative matrix factor deconvolution; extraction of multiple sound sources from monophonic inputs. volume 3195, 09 2004.
Daniel D. Lee and H. Sebastian Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755):788–791, 1999.
Stanford University Department of Music. Sound examples.