Bag of Frames(KMeans/GMM) for Acoustic Scene Classfication
Python3
MFCC features, which are too compressed to learn compared to VGGish embeddings. So weak classifier is more suitable. Dataset: http://extrasensory.ucsd.edu/
The bag-of-frames approach to audio pattern recognition: A sufficient model for urban soundscapes but not for polyphonic music The Journal of the Acoustical Society of America 122, 881 (2007); https://doi.org/10.1121/1.2750160