2021's Introduction

ASVspoof 2021 Baseline CM & Evaluation Package

By ASVspoof2021 challenge organizers

Baseline CMs

Four baseline CMs are available for the LA, PA, and DF tracks:

  • Baseline-CQCC-GMM (Matlab & Python)
    CQCC feature extraction with GMM classifier
  • Baseline-LFCC-GMM (Matlab & Python)
    LFCC feature extraction with GMM classifier
  • Baseline-LFCC-LCNN (PyTorch)
    LFCC feature extraction with LCNN classifier (DNN)
  • Baseline-RawNet2 (PyTorch)
    End-to-End DNN classifier

Evaluation tools (using the full set of keys and meta-labels!)

eval-package contains tools to compute min t-DCFs and EERs:

  • a script that downloads the full set of keys and meta-labels,
  • a set of Python scripts that compute pooled and decomposed min t-DCFs and EERs,
  • a notebook that computes min t-DCFs and EERs in an interactive way.

Please check eval-package/README for more details on the key and meta-label files.

You can also manually download the full set of keys and meta-labels:

Track  Link                                                        MD5
LA     https://www.asvspoof.org/asvspoof2021/LA-keys-full.tar.gz   037592a0515971bbd0fa3bff2bad4abc f052cc2ed276745afa3b5198665d3b26
PA     https://www.asvspoof.org/asvspoof2021/PA-keys-full.tar.gz   a639ea472cf4fb564a62fbc7383c24cf
DF     https://www.asvspoof.org/asvspoof2021/DF-keys-full.tar.gz   dabbc5628de4fcef53036c99ac7ab93a

(LA package is updated to remove an unnecessary file called trial_list.txt, 2023/04/13)
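
A downloaded archive can be checked against the MD5 values listed above before unpacking. The following is a minimal sketch using only the Python standard library; `md5_of_file` is a hypothetical helper name, and the file name in the usage comment is an assumption about where the archive was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (assumed local file name; compare with the MD5 listed for that track):
# print(md5_of_file("PA-keys-full.tar.gz"))
```

If the printed digest differs from the listed value, the download is most likely incomplete or corrupted and should be repeated.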

Reference

Please consider citing the following papers:

  • ASVspoof 2021 summary paper on Arxiv (submitted to IEEE/ACM Trans. ASLP)
Xuechen Liu, Xin Wang, Md Sahidullah, Jose Patino, Héctor Delgado, Tomi Kinnunen, Massimiliano Todisco, Junichi Yamagishi, Nicholas Evans, Andreas Nautsch, and Kong Aik Lee. ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild. arXiv. doi:10.48550/ARXIV.2210.02437. 2022.


@misc{https://doi.org/10.48550/arxiv.2210.02437,
author = {Liu, Xuechen and Wang, Xin and Sahidullah, Md and Patino, Jose and Delgado, H{\'{e}}ctor and Kinnunen, Tomi and Todisco, Massimiliano and Yamagishi, Junichi and Evans, Nicholas and Nautsch, Andreas and Lee, Kong Aik},
doi = {10.48550/ARXIV.2210.02437},
mendeley-groups = {self-arxiv},
publisher = {arXiv},
title = {{ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild}},
url = {https://arxiv.org/abs/2210.02437},
year = {2022}
}
Héctor Delgado, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Xuechen Liu, Andreas Nautsch, Jose Patino, Md Sahidullah, Massimiliano Todisco, Xin Wang, and others. ASVspoof 2021: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan. ArXiv Preprint ArXiv:2109.00535. 2021.

@article{delgado2021asvspoof,
author = {Delgado, H{\'{e}}ctor and Evans, Nicholas and Kinnunen, Tomi and Lee, Kong Aik and Liu, Xuechen and Nautsch, Andreas and Patino, Jose and Sahidullah, Md and Todisco, Massimiliano and Wang, Xin and Others},
journal = {arXiv preprint arXiv:2109.00535},
title = {{ASVspoof 2021: Automatic speaker verification spoofing and countermeasures challenge evaluation plan}},
year = {2021}
}
Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, and Héctor Delgado. ASVspoof 2021: Accelerating Progress in Spoofed and Deepfake Speech Detection. In Proc. ASVspoof Challenge Workshop, 47–54. doi:10.21437/ASVSPOOF.2021-8. 2021.

@inproceedings{yamagishi21_asvspoof,
author = {Yamagishi, Junichi and Wang, Xin and Todisco, Massimiliano and Sahidullah, Md and Patino, Jose and Nautsch, Andreas and Liu, Xuechen and Lee, Kong Aik and Kinnunen, Tomi and Evans, Nicholas and Delgado, H{\'{e}}ctor},
booktitle = {Proc. ASVspoof Challenge workshop},
doi = {10.21437/ASVSPOOF.2021-8},
pages = {47--54},
title = {{ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection}},
year = {2021}
}
Tomi Kinnunen, Hector Delgado, Nicholas Evans, Kong Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, Xin Wang, Md Sahidullah, Junichi Yamagishi, and Douglas A Reynolds. Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals. IEEE/ACM Transactions on Audio, Speech, and Language Processing 28. IEEE: 2195–2210. doi:10.1109/TASLP.2020.3009494. 2020.

@article{kinnunen2020tandem,
author = {Kinnunen, Tomi and Delgado, Hector and Evans, Nicholas and Lee, Kong Aik and Vestman, Ville and Nautsch, Andreas and Todisco, Massimiliano and Wang, Xin and Sahidullah, Md and Yamagishi, Junichi and Reynolds, Douglas A},
doi = {10.1109/TASLP.2020.3009494},
issn = {2329-9290},
journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
pages = {2195--2210},
publisher = {IEEE},
title = {{Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals}},
volume = {28},
year = {2020}
}

2021's People

Contributors

asvspoof-challenge, tonywangx

2021's Issues

Pre-trained model for PA task

Hello,

It seems that the pre-trained model provided for PA is the one trained for the LA task. Could you please point me to a link for downloading the pre-trained RawNet2 for PA?

Thanks,
Ali

Why does Dataset_ASVspoof2021_eval return x_inp, key instead of x_inp, label[key]?

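import librosa
from torch import Tensor
from torch.utils.data import Dataset

# pad() is not shown in this excerpt; it is assumed to be defined
# alongside this class in the baseline's data utilities.


class Dataset_ASVspoof2021_eval(Dataset):
    def __init__(self, list_IDs, base_dir):
        """
            self.list_IDs	: list of strings (each string: utt key),
            self.base_dir	: path prefix; 'flac/' is appended directly,
                              so it should end with a path separator
        """

        self.list_IDs = list_IDs
        self.base_dir = base_dir
        self.cut = 64600  # take ~4 sec audio (64600 samples)

    def __len__(self):
        return len(self.list_IDs)

    def __getitem__(self, index):
        key = self.list_IDs[index]
        # Load the utterance at 16 kHz and fix its length to self.cut samples.
        X, fs = librosa.load(self.base_dir + 'flac/' + key + '.flac', sr=16000)
        X_pad = pad(X, self.cut)
        x_inp = Tensor(X_pad)
        # The utterance key, not a label, is returned with the waveform.
        return x_inp, key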
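
One way to read this design: returning the utterance key allows a score to be stored per utterance and matched to a label afterwards, for example once key files such as those fetched by the eval-package are available. The sketch below illustrates only that pattern. The function name, the example keys, and the in-memory dictionaries are hypothetical and do not reproduce the file formats used by the baseline or the eval-package.

```python
def join_scores_with_labels(scores_by_key, labels_by_key):
    """Pair each scored utterance with its label, skipping keys that have no label."""
    joined = []
    for key, score in scores_by_key.items():
        if key in labels_by_key:
            joined.append((key, score, labels_by_key[key]))
    return joined


# Hypothetical keys, scores and labels, for illustration only.
scores = {"utt_0001": 2.31, "utt_0002": -1.07}
labels = {"utt_0001": "bonafide", "utt_0002": "spoof"}
rows = join_scores_with_labels(scores, labels)
```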

Applying the pre-trained baseline model on my own dataset

Hi again,
I am trying to apply the pre-trained DF baseline model (B03) on my own dataset. I have this error:
"
No input features found after scannning
Please check ['/content/drive/MyDrive/MySelfVersionProjects/audio_sample/sample_set_3_split/Asvspoof2021_baseline_3_on_our_dataset/train_for_asvspoof2021_baseline']
They should contain all files in file list
Please also check filename extentions ['.flac']
They should be correctly specified
Error: Failed to read input features"

My audio clips have the .wav extension.
Should I convert them to .flac?
How should I prepare my data so that the model can read the input features?

Missing data in ASVspoof 2019 LA track?

It has been 4 years, and I hope that someone else has noticed this too: the line counts in cm_protocols do not match the number of .flac files in the dev and eval sub-datasets. Please see the screenshot below 👇

(while I'm fine with data missing, my bigger concern is that: did this inconsistency cause any labeling issue, e.g., audio x is spoofed instead of bona fide because of this. I hope not)

image
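
A mismatch of this kind can be narrowed down by comparing the utterance IDs listed in a protocol file with the .flac files on disk. The following is a minimal sketch under stated assumptions: `compare_protocol_with_files` is a hypothetical helper, and it assumes that the utterance ID is the second whitespace-separated field of each protocol line (adjust `id_column` if the file is laid out differently).

```python
from pathlib import Path


def compare_protocol_with_files(protocol_path, flac_dir, id_column=1):
    """Return (IDs listed but missing on disk, IDs on disk but not listed)."""
    listed = set()
    with open(protocol_path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            # Assumption: the utterance ID sits in column `id_column` (0-based).
            if len(fields) > id_column:
                listed.add(fields[id_column])

    # File stems are the file names without the ".flac" extension.
    on_disk = {p.stem for p in Path(flac_dir).glob("*.flac")}
    return sorted(listed - on_disk), sorted(on_disk - listed)
```

Inspecting the two returned lists shows whether the difference comes from files missing on disk or from files that no protocol line refers to.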

2 channel audios

Hi,
I am trying to apply the pre-trained DF-LFCC-LCNN baseline model, and I get this error:
ValueError: could not broadcast input array from shape (88200,2) into shape (88200,1)
How can I modify the code, and where, so that the baseline can process 2-channel audio?
I use
!bash 02_eval_alternative.sh /content/drive/MyDrive/path_to_the_dataset modelname /content/drive/MyDrive/path_to/df_trained_network.pt
Thank you
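
The shapes in the error message suggest that a stereo waveform of shape (88200, 2) is being written into a buffer that expects a single channel. One common workaround is to average the channels before the waveform is passed on. The sketch below assumes NumPy arrays laid out as (samples, channels); `to_mono` is a hypothetical helper, and where it would be called inside the baseline's I/O code is not shown here.

```python
import numpy as np


def to_mono(wave):
    """Downmix a (samples, channels) array to shape (samples, 1) by averaging channels."""
    wave = np.asarray(wave, dtype=np.float32)
    if wave.ndim == 1:
        # Already single-channel: only add the channel axis.
        return wave[:, None]
    return wave.mean(axis=1, keepdims=True)


stereo = np.zeros((88200, 2), dtype=np.float32)
mono = to_mono(stereo)  # shape (88200, 1)
```

Converting the audio files to single-channel ahead of time is an alternative that leaves the baseline code untouched.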

The loss function of "LA/Baseline-RawNet2"

According to the PyTorch documentation, CrossEntropyLoss() combines nn.LogSoftmax() and nn.NLLLoss(). Why is there still an nn.LogSoftmax() layer at the end of RawNet(), and why do I get bad results if I remove it?
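
As background for this question, the identity quoted above can be checked numerically, together with the fact that log-softmax is idempotent: applying it to values that are already log-probabilities leaves them unchanged, so the cross-entropy value is the same with or without an extra LogSoftmax in front. The sketch below re-implements the functions in plain Python rather than calling PyTorch, and the logits are made-up values. It shows only that the loss value is unchanged; it does not explain the results reported here.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class."""
    return -log_probs[target]


def cross_entropy(logits, target):
    """Cross-entropy written as log-softmax followed by NLL."""
    return nll_loss(log_softmax(logits), target)


logits = [1.5, -0.3]  # made-up two-class outputs
loss_once = cross_entropy(logits, 0)
# Feeding log-softmax outputs into cross-entropy applies log-softmax a second time.
loss_twice = cross_entropy(log_softmax(logits), 0)
```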

Duplicate samples

Hello, we noticed that some samples in the dataset appear to be duplicates. Are they exact copies of each other? Examples include PA_E_1035160.flac and PA_E_1018196.flac, and there are more.
Did you use some form of oversampling? Is there a specific reason for these duplicates?
Thank you!

> Missed parameters of fs, fmin, fmax in extract_cqccc

Missing parameters fs, fmin, fmax in extract_cqcc

return extract_cqcc(file)

After fixing this error, a "string indices must be integers" error occurs in

Xcq = cqt(sig[:, None], B, fs, fmin, fmax, 'rasterize', 'full', 'gamma', gamma)

Hello, has your problem been solved? I also encountered this problem.

Originally posted by @yuant-gif in #2 (comment)

Where to download PA_cm_protocols

Hello, thanks for the code. I am running the baseline of PA. The evaluation requires a file named "ASVspoof2021_PA_cm_protocols/ASVspoof2021.PA.cm.eval.trl.txt", but I cannot find it anywhere. Has it been released and where should I access it? Thanks a lot.

math representation of the LFCCs features

In baseline 03 for the DF task, the LFCC feature setup is described as follows:
LFCC features are extracted using a 20 ms window with a 10 ms shift, a 1024-point Fourier transform and 70 filters. These features include 19 static cepstra plus energy, delta and delta-delta coefficients.
Is the math representation below correct for them?
image
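
The sizes implied by the quoted setup can be worked out with simple arithmetic. This is a sketch under two assumptions that the quoted text does not state: the audio is sampled at 16 kHz, and "19 static cepstra plus energy" means 20 static coefficients per frame, with delta and delta-delta each computed over all 20.

```python
# Values quoted in the setup description.
window_ms, shift_ms = 20, 10
n_fft, n_filters = 1024, 70
n_cepstra = 19

# Assumption: 16 kHz sampling rate.
sample_rate = 16000

window_samples = sample_rate * window_ms // 1000  # 320 samples per window
shift_samples = sample_rate * shift_ms // 1000    # 160 samples between frames
n_fft_bins = n_fft // 2 + 1                       # 513 one-sided spectrum bins

# Assumption: energy is counted as one extra static coefficient.
n_static = n_cepstra + 1                          # 20
feature_dim = 3 * n_static                        # static + delta + delta-delta = 60
```

Under these assumptions, each frame maps 513 spectrum bins through 70 linear filters to 20 static coefficients, giving a 60-dimensional feature vector once the dynamic coefficients are appended.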

torch._six has been removed

Torch does not seem to support torch._six anymore and it has been removed.
Refer - pytorch/pytorch#94709

Asvspoof 2021 baseline model still has dependency on it.

(I am trying the baseline below: https://github.com/asvspoof-challenge/2021/blob/main/DF/Baseline-LFCC-LCNN/project/baseline_DF/02_eval_alternative.sh, "Script to quickly evaluate a evaluation set in DF task")

I get this error:
ModuleNotFoundError: No module named 'torch._six'

I tried:
from torch import inf
nothing changed!
How should I fix it?
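
When a change like this seems to have no effect, one possible cause is that other files in the project still import from `torch._six`. The sketch below lists every remaining reference under a directory so that each can be updated. It uses only the standard library; `find_torch_six_uses` is a hypothetical helper, and the directory passed to it is an assumption about where the baseline code is checked out. For the `inf` name specifically, `math.inf` from the standard library provides the same floating-point infinity.

```python
import re
from pathlib import Path

PATTERN = re.compile(r"torch\._six")


def find_torch_six_uses(root):
    """Return (path, line number, line text) for each line mentioning torch._six."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


# Usage (assumed checkout location):
# for path, lineno, line in find_torch_six_uses("."):
#     print(f"{path}:{lineno}: {line}")
```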

Evaluation

How can the EER be computed on the evaluation set?
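
The eval-package described in the introduction provides the scripts for computing min t-DCFs and EERs. As a rough illustration of what an EER is, the sketch below sweeps a threshold over the sorted scores and returns the operating point where the false rejection rate and the false acceptance rate are closest. It is not the eval-package's implementation: `compute_eer` here is a hypothetical helper, it assumes that higher scores mean "more likely bona fide", it does no interpolation, and the example scores are made up.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Approximate EER: the point where FRR and FAR are closest, averaged."""
    labelled = sorted([(s, True) for s in bonafide_scores] +
                      [(s, False) for s in spoof_scores])
    n_bona, n_spoof = len(bonafide_scores), len(spoof_scores)

    rejected_bona = rejected_spoof = 0
    # Starting point: threshold below every score, so nothing is rejected.
    best_frr, best_far = 0.0, 1.0
    for _, is_bona in labelled:
        # Raise the threshold just above this score, rejecting the trial.
        if is_bona:
            rejected_bona += 1
        else:
            rejected_spoof += 1
        frr = rejected_bona / n_bona            # bona fide trials rejected
        far = 1.0 - rejected_spoof / n_spoof    # spoof trials still accepted
        if abs(frr - far) < abs(best_frr - best_far):
            best_frr, best_far = frr, far
    return (best_frr + best_far) / 2.0


# Made-up scores: one bona fide trial scores low, one spoof trial scores high.
eer = compute_eer([0.9, 0.8, 0.3, 0.7], [0.1, 0.2, 0.85, 0.4])  # 0.25
```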

Unable to download ASVSpoof Challenge dataset 2021

Hi,
For the last 2 days, I have been trying to download the dataset (LA.zip) from both:
- Zenodo site (https://zenodo.org/record/4837263#.YiHDfHNBy5c)
- University of Edinburgh data share (https://datashare.ed.ac.uk/handle/10283/3336)

The download is extremely slow (despite my broadband download speed of 30 Mbps) and gets cancelled midway.

Could you let me know whether this data is hosted on any other servers from which it can be downloaded more quickly?

Thanks in advance.

Question regarding the datasets

Hello,
In the DF and LA tasks, is there any way to know whether a given audio clip was generated by Voice Conversion?
In other words, are Voice Conversion clips distinguished from Text-to-Speech ones?
