
krishnadn / x-vector-pytorch


Implementation of the paper "Spoken Language Recognition using X-vectors" in PyTorch

Python 100.00%
x-vector x-vector-pytorch language-recognition language-identification speech

x-vector-pytorch's Introduction

x-vector-pytorch

This repo contains the implementation of the paper "Spoken Language Recognition using X-vectors" in PyTorch.

Paper: https://danielpovey.com/files/2018_odyssey_xvector_lid.pdf
Tutorial: https://www.youtube.com/watch?v=8nZjiXEdMH0

Installation

I suggest installing Anaconda3 on your system. First, download Anaconda3 from https://docs.anaconda.com/anaconda/install/hashes/lin-3-64/ and run the installer:

bash Anaconda3-2019.03-Linux-x86_64.sh

Clone the repo

git clone https://github.com/KrishnaDN/x-vector-pytorch.git

Once Anaconda3 is installed successfully, install the required packages using requirements.txt

pip install -r requirements.txt

Create manifest files for training and testing

This step creates the manifest files used for training and testing.

python datasets.py --processed_data /media/newhd/youtube_lid_data/download_data --meta_store_path meta/
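For readers without the original data, the following is a minimal sketch of how such manifest files could be produced. It does not reproduce datasets.py. The directory layout (one sub-folder per language containing .wav files), the "path label" line format, and the helper names list_utterances, split_items and write_manifest are all assumptions made for illustration.

```python
import random
from pathlib import Path
from typing import List, Tuple

Item = Tuple[str, int]


def list_utterances(root: str) -> List[Item]:
    """Collect (wav_path, label) pairs.

    ASSUMED layout: root/<language>/<clip>.wav. The label is the index of the
    language folder in sorted order. The real datasets.py may differ.
    """
    root_path = Path(root)
    languages = sorted(p.name for p in root_path.iterdir() if p.is_dir())
    items: List[Item] = []
    for label, language in enumerate(languages):
        for wav in sorted((root_path / language).glob("*.wav")):
            items.append((str(wav), label))
    return items


def split_items(items: List[Item], test_fraction: float = 0.1,
                seed: int = 0) -> Tuple[List[Item], List[Item]]:
    """Shuffle deterministically and split into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def write_manifest(items: List[Item], path: str) -> None:
    """Write one 'wav_path label' line per utterance (hypothetical format)."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        for wav_path, label in items:
            handle.write(f"{wav_path} {label}\n")
```

Under these assumptions, calling list_utterances on the data folder, then split_items, then write_manifest with meta/training.txt and meta/testing.txt would give two plain-text manifests. Paths are built with pathlib, so the sketch is not tied to Linux.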

Training

This step starts training the X-vector model for language identification.

python training_xvector.py --training_filepath meta/training.txt --testing_filepath meta/testing.txt --validation_filepath meta/validation.txt \
                             --input_dim 40 --num_classes 8 --batch_size 32 --use_gpu True --num_epochs 100
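A note on flags such as --use_gpu True: if a script declares a flag with argparse's type=bool, any non-empty string (including "False") parses as True, because bool("False") is True in Python. How training_xvector.py declares its flags is not shown here, so the sketch below is only an illustration. The flag names mirror the command above, while the types, the defaults and the str2bool helper are assumptions.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse common spellings of true/false (hypothetical helper)."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes", "y"}:
        return True
    if lowered in {"false", "0", "no", "n"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Flag names come from the command above; types and defaults are assumed.
    parser = argparse.ArgumentParser(description="X-vector training options (sketch)")
    parser.add_argument("--training_filepath", type=str, default="meta/training.txt")
    parser.add_argument("--testing_filepath", type=str, default="meta/testing.txt")
    parser.add_argument("--validation_filepath", type=str, default="meta/validation.txt")
    parser.add_argument("--input_dim", type=int, default=40)
    parser.add_argument("--num_classes", type=int, default=8)
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--use_gpu", type=str2bool, default=True)
    parser.add_argument("--num_epochs", type=int, default=100)
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--use_gpu", "False", "--num_classes", "8"])
print(args.use_gpu, args.num_classes)
```

With a converter like this, --use_gpu False actually disables the GPU path instead of being silently treated as True.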
                             

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. For any queries, contact: [email protected]

License

MIT

x-vector-pytorch's People

Contributors

krishnadn


x-vector-pytorch's Issues

Dataset does not work in 'valid' or 'test' mode

In 'train' mode you produce data frames of exactly [400,257].
In the other modes you produce [x,257] data frames, where x is the length of the mag-phase spectrogram.

    if mode=='train':
        randtime = np.random.randint(0, mag_T.shape[1]-spec_len)
        spec_mag = mag_T[:, randtime:randtime+spec_len]
    else:
        spec_mag = mag_T

In validation mode you therefore get a list of differently sized tensors and, of course, an error at

features = torch.from_numpy(np.asarray([torch_tensor.numpy().T for torch_tensor in sample_batched[0]])).float()

TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64

I am wondering about your train/valid approach in the dataset. Can you fix the project and clarify it?
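One possible workaround, sketched under assumptions and not taken from the repository, is to bring every spectrogram to the same number of frames in all modes: a random crop in 'train' mode, a centre crop otherwise, and zero padding when the clip is shorter than spec_len. The function name fix_length and the choice of zero padding are illustrative only. The (freq_bins, frames) layout follows the snippet above.

```python
from typing import Optional

import numpy as np


def fix_length(mag_T: np.ndarray, spec_len: int = 400, mode: str = "train",
               rng: Optional[np.random.Generator] = None) -> np.ndarray:
    """Return a (freq_bins, spec_len) view or copy of a (freq_bins, frames) array."""
    rng = rng if rng is not None else np.random.default_rng()
    n_frames = mag_T.shape[1]

    if n_frames < spec_len:
        # Clip is too short: pad with zeros on the right (an assumed policy).
        pad = spec_len - n_frames
        return np.pad(mag_T, ((0, 0), (0, pad)), mode="constant")

    if mode == "train":
        # Upper bound is exclusive, so +1 keeps the last valid start position
        # and also covers the n_frames == spec_len case.
        start = int(rng.integers(0, n_frames - spec_len + 1))
    else:
        # Deterministic centre crop for 'valid' and 'test'.
        start = (n_frames - spec_len) // 2
    return mag_T[:, start:start + spec_len]


# Two clips of different lengths now stack into one regular batch.
long_clip = np.random.rand(257, 500).astype(np.float32)
short_clip = np.random.rand(257, 100).astype(np.float32)
batch = np.asarray([fix_length(c, 400, "valid").T for c in (long_clip, short_clip)])
print(batch.shape)
```

An alternative that keeps full-length utterances is to evaluate with a batch size of 1, which avoids stacking differently sized arrays at all.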

Access to dataset

Hi,
Can you share a drive link to your dataset? I don't have the dataset for the Indian languages.

Thanks,
Satish

Frame-level Features Type

Hi, nice work!
Got one minor question. In the paper cited, Snyder et al. use MFCC features.
Here, however, it seems like you are using linear spectrograms instead. Is that on purpose?
Are they performing better than MFCCs for the DNN case?

Strange results of training and validation

On the "aishell" and "Merged_Arabic_Corpus_of_Isolated_Words" datasets, the predictions look like random guesses during training, and validation always returns the same result. Have you encountered this problem? Thanks.
The validation results are as follows (left: label, right: prediction):
[25] [11]
[39] [11]
[41] [11]
[130] [11]
[2] [11]
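Two quick checks may help narrow a report like this down. The sketch below is only a diagnostic aid under assumptions, not a diagnosis: it verifies that every label lies in [0, num_classes) and counts how often each class is predicted. The helper names are hypothetical, and num_classes=8 is simply the value from the README command, since the reporter's actual setting is unknown.

```python
from collections import Counter
from typing import Iterable, List


def out_of_range_labels(labels: Iterable[int], num_classes: int) -> List[int]:
    """Return the sorted labels that fall outside [0, num_classes)."""
    return sorted({lab for lab in labels if not 0 <= lab < num_classes})


def prediction_histogram(predictions: Iterable[int]) -> Counter:
    """Count how often each class index is predicted."""
    return Counter(predictions)


# Values copied from the validation output quoted above.
labels = [25, 39, 41, 130, 2]
predictions = [11, 11, 11, 11, 11]

print(out_of_range_labels(labels, num_classes=8))      # [25, 39, 41, 130]
print(prediction_histogram(predictions).most_common(1))  # [(11, 5)]
```

Labels as large as 130 need --num_classes to be at least 131. A histogram with a single entry confirms that the model has collapsed to one class, which points at the training setup and not at the validation data.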

About the forward function in x_vector.py

I wonder why the forward step returns tdnn1_out.
I suppose that line should be commented out, otherwise the model training will not work?

def forward(self, inputs):
    tdnn1_out = self.tdnn1(inputs)
    return tdnn1_out

Also, what is the difference from x_vector_Indian_LID.py?
What does _Indian_LID mean?

Thanks

You use the TRAIN dataset for validation

In the file training_xvector.py, line 46:

dataloader_val = DataLoader(dataset_train, batch_size=args.batch_size,shuffle=True,collate_fn=speech_collate)

You use dataset_train for validation. This is clearly a copy-paste error.
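The report implies that the validation loader should be built from the validation dataset object instead of dataset_train. The name of that object in the script is not shown here. Independently of that fix, a small sanity check can confirm that the training and validation manifests do not share entries. The sketch below assumes one utterance per manifest line, and manifest_overlap is a hypothetical helper.

```python
from typing import Iterable, Set


def manifest_overlap(train_lines: Iterable[str], val_lines: Iterable[str]) -> Set[str]:
    """Return the non-empty lines that appear in both manifests."""
    train = {line.strip() for line in train_lines if line.strip()}
    val = {line.strip() for line in val_lines if line.strip()}
    return train & val


# In-memory example; with real files, pass open("meta/training.txt") and
# open("meta/validation.txt") instead (paths taken from the training command).
train_manifest = ["a.wav 0\n", "b.wav 1\n", "c.wav 0\n"]
val_manifest = ["c.wav 0\n", "d.wav 1\n"]

shared = manifest_overlap(train_manifest, val_manifest)
print(sorted(shared))  # ['c.wav 0']
```

An empty result means the two splits are disjoint. Any shared line means validation accuracy is partly measured on training data.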

Question about dataset

Hi, I have a question. I use Windows to run your code. How can I obtain your dataset? I see that your dataset is a .txt file.
