sota-music-tagging-models's People

Contributors

andreasjansson, andrebola, minzwon, vinods7

sota-music-tagging-models's Issues

Problems training on jamendo - RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0

Hi, thanks for this awesome project! I'm trying to train it with jamendo-moodtheme tags but I'm getting an error.

I'm running it on a Google Colab VM with a CUDA-enabled GPU.

I downloaded the mel-spectrograms from the jamendo repository, specifying the melspecs data type and the autotagging_moodtheme dataset. Then in this project I just replaced the TAGS variables in the code with this and the tsv files with the moodtheme ones from here.

Everything looked fine, but for some reason I'm receiving the attached error after running the training code.

The mel spectrograms have 92 bands and different lengths. Could that be causing the problem?

Let me know if anyone knows what might be the problem :)

Thanks in advance!

# My code
%tensorflow_version 1.x
%cd /content/sota-music-tagging-models/src/
!python -u main.py --data_path /content/data --dataset jamendo-mood

My error message

Namespace(batch_size=16, data_path='/content/data', dataset='jamendo-mood', log_step=20, lr=0.0001, model_load_path='.', model_save_path='./../models', model_type='hcnn', n_epochs=200, num_workers=0, use_tensorboard=1)
Traceback (most recent call last):
  File "main.py", line 61, in <module>
    main(config)
  File "main.py", line 39, in main
    solver.train()
  File "/content/sota-music-tagging-models/src/solver.py", line 172, in train
    for x, y in self.data_loader:
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 346, in __next__
    data = self.dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 80, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 80, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 65, in default_collate
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 56, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 20546 and 9168 in dimension 2 at /pytorch/aten/src/TH/generic/THTensor.cpp:689
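
A possible workaround, consistent with the guess in the post: the traceback ends in torch.stack inside default_collate, which needs every item in a batch to have the same shape, and the message reports lengths of 20546 and 9168 frames. The sketch below crops or zero-pads each spectrogram to a fixed number of frames before batching. It uses NumPy only; the function name fix_length, the band count of 96 and the target of 256 frames are illustrative assumptions, not values taken from this repository.

```python
import numpy as np


def fix_length(spec, n_frames, rng=None):
    """Return a (n_bands, n_frames) array: random crop if too long, zero-pad if too short."""
    if rng is None:
        rng = np.random.default_rng()
    n_bands, length = spec.shape
    if length >= n_frames:
        # Pick a random window of n_frames columns.
        start = int(rng.integers(0, length - n_frames + 1))
        return spec[:, start:start + n_frames]
    # Shorter than the target: pad with zeros on the right.
    padded = np.zeros((n_bands, n_frames), dtype=spec.dtype)
    padded[:, :length] = spec
    return padded


# Two dummy spectrograms with the lengths from the error message.
# 96 bands and 256 frames are placeholder values (assumptions).
long_spec = np.ones((96, 20546), dtype=np.float32)
short_spec = np.ones((96, 9168), dtype=np.float32)
batch = np.stack([fix_length(s, 256) for s in (long_spec, short_spec)])
print(batch.shape)
```

Calling something like fix_length at the end of the dataset's __getitem__ would let the default collate function stack the items without changes elsewhere.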

Bug running fcn with mtat

result = self.forward(*input, **kwargs)
RuntimeError: builtins: link error: Invalid value
The above operation failed in interpreter, with the following stack trace:

The above operation failed in interpreter, with the following stack trace:

Any idea what the problem is?

I have a question

Your model includes the mel spectrogram transform, so the transform also runs inside the model's forward pass.

I have one question.

Is there any difference between these two setups: (1) precomputing the mel spectrograms, saving them as npy files, and feeding those to the model, versus (2) computing the mel spectrogram inside the model at run time, as your code does?
And what about the harmonic transform?

My roc_auc is higher than reported

I trained harmonicnn on MTAT, which I downloaded using https://github.com/Spijkervet/CLMR/blob/master/clmr/datasets/magnatagatune.py
I trained 4 times; the evaluation outputs are:
loss: 0.1400
roc_auc: 0.9151
pr_auc: 0.4636

loss: 0.1399
roc_auc: 0.9157
pr_auc: 0.4666

loss: 0.1405
roc_auc: 0.9155
pr_auc: 0.4617

loss: 0.1403
roc_auc: 0.9153
pr_auc: 0.4643

loss: 0.1402
roc_auc: 0.9148
pr_auc: 0.4641

The roc_auc is much higher than the reported 0.9126.
Did I miss something?
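
To put a number on the gap, the roc_auc values listed above can be summarized with the standard library. This is only arithmetic on the figures in the post; it says nothing about why the scores differ.

```python
from statistics import mean, stdev

# roc_auc values copied from the evaluation outputs above.
roc_auc_runs = [0.9151, 0.9157, 0.9155, 0.9153, 0.9148]
reported = 0.9126  # value quoted in the post as the reported score

avg = mean(roc_auc_runs)
spread = stdev(roc_auc_runs)  # sample standard deviation across runs
gap = avg - reported

print(f"mean={avg:.4f} stdev={spread:.4f} gap={gap:+.4f}")
```

The gap is several times larger than the run-to-run spread, so it is unlikely to be random seed noise alone.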

Data splitting of MTAT dataset

Hi, thank you for your great work and for building benchmark results for all the representative auto-tagging models. I have a further question on the data splitting of the MTAT dataset.

In the SMC paper, you mentioned that you did not discard the tracks with no associated labels (which might lower performance). However, both the split npy files in this repo and the split files in the repo you referred to in the SMC paper discard those tracks. Could I kindly ask whether the results are based on the cleaned version of the dataset, which discards those tracks?

For your reference, the original version should have 18706 tracks for training, 1825 for validation, and 5329 for testing (25860 in total). The clean version should have 15247 for training, 1529 for validation, and 4332 for testing (21108 in total).

How to get the audio files of MSD?

I tried to find a way to download audio files of MSD on millionsongdataset.com but failed. How do I get the audio files of MSD, if I may ask?

Dataset paths

You should either mention that this project requires a very specific structure of dataset directories, and what it is, or treat YOUR_DATA_PATH as a top level dir for a given dataset.

Only one class present in y_true. ROC AUC score is not defined in that case.

Hi,
The model runs, but an error occurs when calculating roc-auc. It says that there is only one label for y_true, and I don't know what the problem is.
I ran the fcn and musicnn models with the jamendo dataset (autotagging-moodtheme), and both gave the same error.
The error message is as below.
(I configured the same environment according to the requirements.)
Could you help me to solve this problem?


Traceback (most recent call last):
  File "main.py", line 59, in <module>
    main(config)
  File "main.py", line 37, in main
    solver.train()
  File "/nas2/epark/sota-music-tagging-models/training/solver.py", line 182, in train
    best_metric = self.validation(best_metric, epoch)
  File "/nas2/epark/sota-music-tagging-models/training/solver.py", line 258, in validation
    roc_auc, pr_auc, loss = self.get_validation_score(epoch)
  File "/nas2/epark/sota-music-tagging-models/training/solver.py", line 316, in get_validation_score
    roc_auc, pr_auc = self.get_auc(est_array, gt_array)
  File "/nas2/epark/sota-music-tagging-models/training/solver.py", line 244, in get_auc
    roc_aucs = metrics.roc_auc_score(gt_array, est_array, average=None)
  File "/home/epark/anaconda3/envs/music/lib/python3.7/site-packages/sklearn/metrics/_ranking.py", line 375, in roc_auc_score
    sample_weight=sample_weight)
  File "/home/epark/anaconda3/envs/music/lib/python3.7/site-packages/sklearn/metrics/_base.py", line 120, in _average_binary_score
    sample_weight=score_weight)
  File "/home/epark/anaconda3/envs/music/lib/python3.7/site-packages/sklearn/metrics/_ranking.py", line 221, in _binary_roc_auc_score
    raise ValueError("Only one class present in y_true. ROC AUC score "
ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.
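
The error is raised per tag: with average=None, ROC AUC is computed column by column, and it is undefined for any tag whose ground-truth column is all zeros or all ones in the evaluated split. A minimal sketch for finding such tags, using NumPy only; tags_with_both_classes is a hypothetical helper, and the toy gt_array below stands in for the real label matrix.

```python
import numpy as np


def tags_with_both_classes(gt_array):
    """Boolean mask over tags (columns) that have at least one 0 and one 1."""
    gt = np.asarray(gt_array)
    positives = gt.sum(axis=0)
    return (positives > 0) & (positives < gt.shape[0])


# Toy ground truth: 3 tracks x 3 tags.
# Tag 0 is mixed, tag 1 is never active, tag 2 is always active.
gt_array = np.array([
    [1, 0, 1],
    [0, 0, 1],
    [1, 0, 1],
])
mask = tags_with_both_classes(gt_array)
print(mask)

# The mask can then restrict the call seen in the traceback, e.g.
# metrics.roc_auc_score(gt_array[:, mask], est_array[:, mask], average=None)
```

If many tags are dropped by the mask, that usually points to a mismatch between the tag list and the label files rather than to the metric itself.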

Clarification on dataset format(s)

I have two goals:

  • run inference from some of the pre-trained models on my own dataset
  • train a new model on my own dataset(s)

In both cases I am having trouble due to the dataset formats; it seems that the scripts require a very specific format of dataset which is not really detailed in the readme. If you could provide any clarification on how our datasets should be formatted, this would be greatly appreciated! Thanks in advance.

Error in loading the model during the training process

Hi, I'm trying to retrain the ShortChunkCNN model on the MagnaTagATune dataset. Training errors out at the 80th epoch, when self.load is called from opt_schedule:

RuntimeError: Error(s) in loading state_dict for ShortChunkCNN:
        size mismatch for spec.mel_scale.fb: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([257, 128]).

Can you tell me why this is happening? Also, can you tell me why the following snippet of code is required in self.load?

if 'spec.mel_scale.fb' in S.keys():
    S['spec.mel_scale.fb'] = torch.tensor([])
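
The error message shows that the checkpoint holds an empty spec.mel_scale.fb while the current model holds a [257, 128] buffer, which matches what the quoted snippet writes into the state dict. One possible workaround, offered as an assumption and not as the authors' fix, is to drop entries whose shapes disagree and load the rest non-strictly. The sketch below uses NumPy arrays as stand-ins for tensors so it runs without PyTorch; drop_shape_mismatches and the key dense.weight are hypothetical.

```python
import numpy as np


def drop_shape_mismatches(checkpoint, model_state):
    """Split checkpoint entries into those safe to load and those with a shape conflict."""
    kept, dropped = {}, []
    for name, value in checkpoint.items():
        target = model_state.get(name)
        if target is not None and tuple(value.shape) != tuple(target.shape):
            dropped.append(name)  # e.g. an empty buffer saved for the mel filterbank
        else:
            kept[name] = value
    return kept, dropped


# Stand-ins for S (the loaded checkpoint) and model.state_dict().
checkpoint = {
    "spec.mel_scale.fb": np.zeros((0,)),
    "dense.weight": np.zeros((50, 512)),  # hypothetical layer name
}
model_state = {
    "spec.mel_scale.fb": np.zeros((257, 128)),
    "dense.weight": np.zeros((50, 512)),
}
kept, dropped = drop_shape_mismatches(checkpoint, model_state)
print(dropped)

# With PyTorch, the filtered dict would be loaded with
# model.load_state_dict(kept, strict=False), leaving the model's own buffer in place.
```

Since the mel filterbank is a fixed buffer and not a learned weight, keeping the model's own copy should not change the trained parameters.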

Thanks for the nice paper, and neat code!

Training with a small dataset

Thanks for your work. Following your suggestion, I trained musicnn, because I only have 1K training examples. I fine-tuned the last ten layers of musicnn, starting from the pretrained model you provided, which was trained on the MSD dataset. The results are:
[Screenshot 2021-11-02, 4:33:14 PM]
[Screenshot 2021-11-02, 4:33:18 PM]
Does that look right? Thanks in advance.
