
roman-vygon / triplet_loss_kws

Learning Efficient Representations for Keyword Spotting with Triplet Loss

License: MIT License

Python 97.31% Jupyter Notebook 2.69%
pytorch deep-learning keyword-spotting speech-recognition

triplet_loss_kws's Issues

Model not Learning

Hi!

First of all, thank you for open-sourcing the code. I have tried to replicate the results and ran into a few issues during the training process.

  1. I wrote a script, following the provided notebook, to generate dists.npy, which is not included in the source code. The resulting file is 799.9 MB and stores an array of shape (9998, 9998).
  2. The class probability files are missing, so I set them to None.
  3. I had to comment out the line in l2.py to avoid the grad_fn error during training.
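
As a sanity check on the file size in item 1, the payload of a dense square matrix can be estimated from its shape. This is a minimal sketch that assumes the array is stored as float64 (8 bytes per element); the dtype is not stated in the issue, and `dense_matrix_bytes` is a hypothetical helper, not part of the repository.

```python
def dense_matrix_bytes(n_rows: int, n_cols: int, itemsize: int = 8) -> int:
    """Raw payload size of a dense matrix in bytes.

    itemsize=8 assumes float64; pass 4 for float32.
    The small .npy header is ignored.
    """
    return n_rows * n_cols * itemsize


size = dense_matrix_bytes(9998, 9998)
print(f"{size} bytes")
print(f"{size / 1e6:.1f} MB (decimal)")
print(f"{size / 2**20:.1f} MiB (binary)")
```

Under that assumption the payload is roughly 800 MB in decimal units, which is in the same range as the reported file size. Storing the matrix as float32 would halve it.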

After all of that, I can load your pretrained model Res15_35 (no manifest files for 12 have been provided yet) and reach the expected accuracy in the triplet evaluation. However, my own model does not learn when trained from scratch. The command I used is:

python TripletEncoder.py --name=test_encoder --manifest=35 --mode=Res15 --per_class=5 --per_batch=10 --hidden_size=45

I have tested several per_batch and per_class values and the behaviour is the same: the triplet loss keeps oscillating between about 0.7 and 1.1, with no evident decrease during training.
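
For context on these loss values, here is a minimal sketch of the standard hinge-style triplet loss, max(d(a, p) - d(a, n) + margin, 0), in plain Python. The margin of 1.0, the Euclidean distance, and the function names are assumptions for illustration; the repository's own loss may differ.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss; margin=1.0 is an assumed value."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


# All three embeddings identical: both distances are zero,
# so the loss equals the margin.
collapsed = triplet_loss([0.5, 0.5], [0.5, 0.5], [0.5, 0.5])

# Positive close to the anchor, negative far away: the hinge clips to zero.
separated = triplet_loss([0.0, 0.0], [0.1, 0.0], [3.0, 0.0])

print(collapsed, separated)
```

With this formula, a loss that stays close to the margin means the anchor-positive and anchor-negative distances are about equal, so the embeddings are not yet separating the classes.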

I then ran the infer_train script with:
python infer_train.py --name=res15_encoder --manifest=35 --model=Res15 --enc_step=25440 --hidden_size=45

The resulting Avg Accuracy is around 20-35. This does not happen when loading the pretrained model. Do you know what could be causing it?

Thanks in advance,

Biel.

Memory Issues

Hello Author,
I'm trying to train with TripletEncoder.py, but the GPU throws a CUDA out of memory error. Can you specify the memory requirements for training this code?

Package version mismatch while running with CUDA

Hi there,

Thank you for the great repo! While installing the packages listed in the Readme file using CUDA, I came across some version conflicts and circular dependencies in packages which are difficult to resolve. If possible, can anyone send through a requirements.txt file? It would make it easier and would be highly appreciated.

Kind Regards

How to generate the libri100.json?

Hi~
I followed your README.md, but I couldn't find where to get libri100_train.json, libri100_dev.json, and libri100_test.json, which are needed to convert the LibriWords manifests with convert_path_prefix.ipynb.

KNN and Pretrained Models

Hi there,

Thank you for the great repo! Could you let me know whether you have also open-sourced the KNN code that you apply to the representations trained with triplet loss, as well as the models pre-trained with triplet loss?

Thank You

Segmentation fault problem

After installing the required libraries, I tried to run your pipeline. It fails with "Segmentation fault (core dumped)" every time. Could you please share a requirements.txt and any ideas for fixing this problem?

Pre-trained Res8 models and parameters

Dear authors,
With what parameters did you train Res8 with manifest 12?
Could you provide the pre-trained model? It's missing from the run folder.
Thank you for your help!

Google drive link invalid

Dear author, the Google Drive link seems to be invalid. Could you re-share the file 'convert_path_prefix.ipynb'? Thanks a lot!
