I used the kaldi/egs/dihard_2018/v2 recipe for front-end processing, and get the MF

How much performance can data augmentation improve? (pytorch_xvectors issue, closed)

manojpamk commented on July 22, 2024

How much performance can data augmentation improve?

from pytorch_xvectors.

Comments (8)

BLack-yzf commented on July 22, 2024

@manojpamk

manojpamk commented on July 22, 2024

If I understood your procedure correctly, you have prepared the training data using Kaldi's dihard recipe, and trained xvectors using this repo, right?

The Kaldi repo reports 26.30% DER using supervised calibration (https://github.com/kaldi-asr/kaldi/blob/master/egs/dihard_2018/v2/run.sh), while PyTorch xvectors returned similar numbers using spectral clustering (https://github.com/manojpamk/pytorch_xvectors/blob/master/README.md). Note that the dihard recipe uses the VoxCeleb corpora for xvector training.

Now as to why the non-augmented model returned similar DER, I am not sure. It is likely that the clean data is already large enough that augmentation does not show significant improvements in this task.

Manoj

BLack-yzf commented on July 22, 2024

Thanks for your reply.
Sorry to bother you. I have seen that you achieved better results, so I want to consult you on a few questions.
I didn't use your repo to train the x-vector model. I had reproduced the x-vector model earlier; its structure is the same as yours, but the training steps are a bit different. I didn't run the 'prepare egs' procedure. Instead, I trained the x-vector model directly on the MFCC features extracted from the VoxCeleb corpora, whereas you train on the 'egs'. I think this may be what causes the difference in performance.

Another question: I have seen some results in 'diarize.sh' (https://github.com/manojpamk/pytorch_xvectors/blob/master/egs/diarize.sh). The results on DIHARD2-dev using PLDA are worse than the Kaldi baseline. Is there a problem in computing the PLDA scores?

Yuan

manojpamk commented on July 22, 2024

Preparing features in the egs format mainly assists training, by ensuring that samples within a batch have the same duration (i.e., number of frames). Further, the samples in egs files are sub-segments of the utterances themselves, so you can think of the procedure as generating multiple equal-duration examples from the same utterance. Note that both Kaldi and this repo perform CMVN and remove non-speech frames before egs file preparation.
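
To illustrate the idea of cutting equal-duration examples out of variable-length utterances, here is a minimal sketch. It is not the egs code from Kaldi or from this repo; the function name `sample_chunks`, the chunk length of 200 frames, and the toy features are all assumptions made for the example, and the features are taken to have already been through CMVN and non-speech removal.

```python
import random


def sample_chunks(features, chunk_len, num_chunks, rng):
    """Draw fixed-length chunks of consecutive frames from one utterance.

    `features` is a list of frames (each frame a list of floats).
    Utterances shorter than `chunk_len` yield no chunks.
    """
    if len(features) < chunk_len:
        return []
    chunks = []
    for _ in range(num_chunks):
        # randint is inclusive on both ends, so the last valid start is allowed
        start = rng.randint(0, len(features) - chunk_len)
        chunks.append(features[start:start + chunk_len])
    return chunks


# Toy "utterances" with different numbers of frames (hypothetical data).
utterances = {
    "utt1": [[float(t)] * 3 for t in range(500)],
    "utt2": [[float(t)] * 3 for t in range(320)],
    "utt3": [[float(t)] * 3 for t in range(150)],  # shorter than chunk_len
}

rng = random.Random(0)
batch = []
for utt_id, feats in utterances.items():
    batch.extend(sample_chunks(feats, chunk_len=200, num_chunks=4, rng=rng))

# Every example in the batch now has the same number of frames,
# so the batch can be stacked into a single tensor for training.
print(len(batch), {len(chunk) for chunk in batch})
```

Training directly on whole-utterance MFCCs skips this step, so each batch either needs padding or has to be restricted to one utterance at a time.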

All things said, I don't think 27% DER is too bad.

I believe the higher PLDA numbers are due to the AHC threshold not being optimized - I currently set it to 0.
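
To show why the stopping threshold matters, here is a small sketch of average-linkage AHC on a pairwise score matrix. It is a simplified stand-in, not the clustering code used by Kaldi or by `diarize.sh`; the function `ahc` and the score values are invented for the example.

```python
def ahc(scores, threshold=0.0):
    """Average-linkage AHC on a symmetric similarity matrix.

    Repeatedly merges the most similar pair of clusters and stops once
    no pair scores above `threshold`. Returns one label per segment.
    """
    clusters = [[i] for i in range(len(scores))]

    def avg_score(a, b):
        return sum(scores[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best, best_pair = None, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_score(clusters[x], clusters[y])
                if best is None or s > best:
                    best, best_pair = s, (x, y)
        if best <= threshold:
            break  # remaining clusters are too dissimilar to merge
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    labels = [0] * len(scores)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy PLDA-like scores for 4 segments (made-up values):
# segments 0 and 1 match strongly, segments 2 and 3 match only weakly.
scores = [
    [9.0, 5.0, -4.0, -6.0],
    [5.0, 9.0, -3.0, -5.0],
    [-4.0, -3.0, 9.0, 1.0],
    [-6.0, -5.0, 1.0, 9.0],
]

labels_at_0 = ahc(scores, threshold=0.0)
labels_at_2 = ahc(scores, threshold=2.0)
print(labels_at_0)  # [0, 0, 1, 1]
print(labels_at_2)  # [0, 0, 1, 2]
```

The same scores give two speakers at a threshold of 0 and three at a threshold of 2, which is why the threshold is usually tuned on a development set rather than fixed.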

BLack-yzf commented on July 22, 2024

Thanks a lot. I will add the 'egs' step to my experiment.
I hope I can ask you more questions later.
Thanks again!

BLack-yzf commented on July 22, 2024

Hi Manoj,
Sorry to bother you. I don't know how to run the evaluation on the AMI dataset. Is there a recipe for it?
Thanks.

BLack-yzf commented on July 22, 2024

@manojpamk

manojpamk commented on July 22, 2024

Hi Yuan,

Do you already have the AMI corpus downloaded?

For audio, check out the Kaldi recipe (https://github.com/kaldi-asr/kaldi/blob/master/egs/ami/s5/run_ihm.sh).
I don't know if the RTTMs are available, but they can be created from the segments and utt2spk files prepared by the Kaldi recipe.
To evaluate diarization, use this script (https://github.com/manojpamk/pytorch_xvectors/blob/master/egs/diarize.sh) after setting the wavDir and rttmDir variables appropriately.
To determine the train-dev-eval session splits, check out this paper: https://arxiv.org/pdf/1902.03190.pdf
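
As a rough sketch of the RTTM step, the snippet below turns Kaldi-style `segments` lines (`<utt-id> <rec-id> <start> <end>`) and `utt2spk` lines (`<utt-id> <spk-id>`) into RTTM `SPEAKER` lines. The function name `segments_to_rttm` and the IDs in the example are hypothetical, and whether the files produced by the AMI recipe need extra handling (for example, per-microphone recording IDs) is left open.

```python
def segments_to_rttm(segments_lines, utt2spk_lines):
    """Build RTTM SPEAKER lines from Kaldi segments and utt2spk lines."""
    utt2spk = dict(line.split()[:2] for line in utt2spk_lines if line.strip())
    rttm = []
    for line in segments_lines:
        if not line.strip():
            continue
        utt_id, rec_id, start, end = line.split()[:4]
        start, end = float(start), float(end)
        # RTTM fields: type, file, channel, onset, duration,
        # <NA>, <NA>, speaker, <NA>, <NA>
        rttm.append(
            "SPEAKER {} 1 {:.3f} {:.3f} <NA> <NA> {} <NA> <NA>".format(
                rec_id, start, end - start, utt2spk[utt_id]
            )
        )
    return rttm


# Hypothetical example data; real IDs come from the prepared data directory.
segments = [
    "spkA-meeting1-0001 meeting1 0.50 3.25",
    "spkB-meeting1-0001 meeting1 3.10 7.00",
]
utt2spk = [
    "spkA-meeting1-0001 spkA",
    "spkB-meeting1-0001 spkB",
]

rttm = segments_to_rttm(segments, utt2spk)
for line in rttm:
    print(line)
```

With real data, the two arguments can be open file handles for the `segments` and `utt2spk` files, and the output lines can be written to one RTTM file per recording.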

Manoj
