
cvqluu / factorized-tdnn

143 stars · 8 watchers · 34 forks · 285 KB

PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi

License: MIT License

Python 100.00%
kaldi tdnn tdnn-f pytorch speech-recognition speaker-recognition acoustic-model neural-network neural-networks speaker-diarization
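
For reference, the semi-orthogonal constraint from the paper above works by periodically nudging each wide factor matrix M toward satisfying M Mᵀ = I, rather than enforcing it exactly. Below is a minimal sketch of the basic (unscaled) update described in the paper; the repository's step_semi_orth() follows Kaldi's scaled "floating" variant, so the exact arithmetic differs, and the function and variable names here are assumptions for illustration only.

import torch

def semi_orth_step(M: torch.Tensor) -> torch.Tensor:
    # Basic semi-orthogonal update from the paper: minimise f(M) = tr(Q Qᵀ)
    # with Q = M Mᵀ - I. The gradient is 4 Q M, and a fixed learning rate of
    # 1/8 gives the update M <- M - 0.5 * Q @ M.
    # Illustrative sketch only; the repository applies Kaldi's scaled variant.
    rows, cols = M.shape
    assert rows <= cols, "M should be wide (rows <= cols) to be semi-orthogonal"
    with torch.no_grad():
        P = M @ M.t()                                    # rows x rows
        Q = P - torch.eye(rows, dtype=M.dtype, device=M.device)
        return M - 0.5 * Q @ M

Repeating this step during training drives M Mᵀ toward the identity without requiring an exact (and expensive) orthogonalization at every iteration.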

factorized-tdnn's People

Contributors

cvqluu


factorized-tdnn's Issues

Pre-processing and training data

Hello!

Thank you so much for the great work you've shared. Would it be possible for you to share the pre-processing methodology and the training data you worked with, as a demo?

It would be great help.

Thank you!

Question about the correct way to enforce semi-orthogonality

Hey, and thank you for this great repository; it saved me from implementing and debugging TDNN/FTDNN myself, so I can focus on experiments instead. I have one question regarding the correct way to step the optimizer, because it seems I have been doing it wrong until now. After reading the "usage" part of the README, it seems to me that the correct way to step during training is as follows. Assuming the most basic way to train a network in PyTorch is:

for input, target in dataset:
    optimizer.zero_grad()
    output = model(input)
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()

and you wrote

tdnn_f.step_semi_orth() # The key method to constrain the first two convolutions, perform after every SGD step

Then I should add the tdnn_f.step_semi_orth() call after the optimizer.step() call, making it like this, right?

for input, target in dataset:
    optimizer.zero_grad()
    output = model(input)
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()
    tdnn_f.step_semi_orth()

Thank you for your time if you happen to answer this question!
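
For context on timing: the constraint does not strictly need to be applied after every single update, and Kaldi applies it only every few minibatches to save a little compute. A variant of the loop above that does the same, with an assumed interval of 4 steps (an assumption based on the Kaldi recipes, not on this repository), would be:

step = 0
for input, target in dataset:
    optimizer.zero_grad()
    output = model(input)
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()
    step += 1
    if step % 4 == 0:  # interval of 4 is an assumption based on the Kaldi recipe
        tdnn_f.step_semi_orth()

Calling it after every optimizer.step(), as in the loop above, also works and is what the README suggests.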

FTDNN fails to converge

Hello. I used the FTDNN model from your model.py to train on the VoxCeleb1 dataset, but it performs terribly compared to the original TDNN system: it only reaches about 40% accuracy after 100 epochs, whereas the TDNN reaches nearly 100%, and the loss is still relatively high. It clearly fails to converge. Have you tried this on VoxCeleb1 before? I would like to know the configuration you used. Thank you.

FTDNN training example: loss function

Hello,
I am trying to include the FTDNN model that you wrote in the pytorch-kaldi framework. I managed to implement the model, but I have some issues when training which I suspect are related to the input and output I am using.
Could you possibly provide the code that you used for the FTDNN demonstration, or say what kind of loss function should be used at the end?
Thank you.
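
In case it helps, here is a minimal sketch of the kind of loss setup typically used with such a model: plain cross-entropy over frame-level or utterance-level classes (e.g. senones for a hybrid acoustic model, or speaker identities). The dimensions, class count, and the separate linear classifier head are assumptions for illustration, not the repository's actual demo code.

import torch
import torch.nn as nn

num_classes = 2000                 # hypothetical: senones (hybrid ASR) or number of speakers
embedding_dim = 512                # hypothetical: assumed FTDNN output dimension

classifier = nn.Linear(embedding_dim, num_classes)
criterion = nn.CrossEntropyLoss()  # the usual choice for this kind of classification

# `embeddings` stands in for the FTDNN output on a batch; `labels` are integer class targets.
embeddings = torch.randn(32, embedding_dim)
labels = torch.randint(0, num_classes, (32,))
loss = criterion(classifier(embeddings), labels)
loss.backward()

For speaker recognition specifically, margin-based softmax losses are also common, but plain cross-entropy is the usual starting point.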
