
alexa / alexa-end-to-end-slu


This setup allows training end-to-end neural models for spoken language understanding (SLU).

License: Apache License 2.0

Python 100.00%

alexa-end-to-end-slu's People

Contributors

amazon-auto, markus-amzn

alexa-end-to-end-slu's Issues

dataset

Hi, could you send me the dataset (complete.csv)? I cannot run the code without it.

DataParallel wrapper doesn't allow access to model attributes

Running the code with the '--distributed' flag raises an error because in experiments/experiment_triplet.py (lines 67-68), the default DataParallel is used as a wrapper around 'model':
if args.distributed and torch.cuda.is_available() and torch.cuda.device_count() > 1:
    self.model = torch.nn.DataParallel(self.model)

However, when self.model.bert is accessed later, the DataParallel object does not expose the wrapped model's attributes. I think a custom wrapper has to be implemented that resolves such attributes through self.model.module (e.g. self.model.module.bert); a sketch of one is given below.
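For illustration, a minimal sketch of such a wrapper (the class name and exact behaviour here are our suggestion, not code from the repo), which forwards unknown attribute lookups to the wrapped module:

import torch

class DataParallelPassthrough(torch.nn.DataParallel):
    def __getattr__(self, name):
        try:
            # DataParallel's own attributes (e.g. 'module') resolve as usual.
            return super().__getattr__(name)
        except AttributeError:
            # Fall back to the wrapped model for anything else (e.g. 'bert').
            return getattr(self.module, name)

# Usage in experiment_triplet.py instead of the plain wrapper:
# self.model = DataParallelPassthrough(self.model)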

Forward algorithm issue - models/model.py

Hi,
I am running your code with some custom-created splits for the FSC dataset. Running lines 123 and 135 of forward() and forward_text(), respectively, in the models/model.py file causes the following error message:
Traceback (most recent call last):
  File "train.py", line 35, in <module>
    runner.train()
  File ".../alexa-end-to-end-slu/experiments/experiment_base.py", line 68, in train
    train_loss, train_acc = self.train_step(batch)
  File ".../alexa-end-to-end-slu/experiments/experiment_base.py", line 121, in train_step
    metrics = self.compute_loss(batch)
  File ".../alexa-end-to-end-slu/experiments/experiment_triplet.py", line 111, in compute_loss
    output_pos = self.model(input_text=batch['encoded_text2'],
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File ".../alexa-end-to-end-slu/models/model.py", line 108, in forward
    return self.forward_text(input_text, text_lengths)
  File ".../alexa-end-to-end-slu/models/model.py", line 137, in forward_text
    text_logits = self.classifier(text_embedding)
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/functional.py", line 1368, in linear
    if input.dim() == 2 and bias is not None:
AttributeError: 'str' object has no attribute 'dim'

This is because text_embedding actually ends up holding the string key 'pooler_output' from the output returned by BertModel's forward(). Perhaps these lines should be:
_, text_embedding = self.bert(input_ids=input_text, attention_mask=attn_mask)[:2]
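Alternatively (a sketch from our side, assuming a transformers version whose BertModel returns a ModelOutput object), the pooled embedding can be selected by name rather than by position:

outputs = self.bert(input_ids=input_text, attention_mask=attn_mask)
text_embedding = outputs.pooler_output  # pooled [CLS] representation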

Thanks in advance.

Requests for clarifications on FSC and SNIPS

We are a group of NYU MS in Data Science students who are working on developing an end-to-end speech-to-intent model. We have read your paper and replicated your code and would love to ask you some questions.

Paper vs. Github Results Discrepancy
We notice that the final test accuracies for both FSC and SNIPS differ between your paper (i.e. 97.65% for FSC, 73.49% for SNIPS) and the GitHub repo (i.e. 95.65% for FSC, 69.88% for SNIPS). Can you share some thoughts on the difference between the numbers in the paper and the repo?

SNIPS Data Partition Ambiguity
In prepare_snips.py, we notice that you split complete.csv into train-val-test partitions. However, since we don't have the complete.csv that you used, we can't replicate the exact same partitions. Our results from running your code on the SNIPS dataset with our own splits are significantly higher on average: we ran 4 times (each time with our own splits of a shuffled complete.csv; a sketch of our splitting procedure is given after the questions below), and the average accuracy is 81.17%, even though we use the same environment mentioned in your GitHub repo. We'd love to double-check these points with you.

  • Which subsets of the SNIPS dataset did you use to create complete.csv? Our guess is that you used the smartLight close-field and far-field subsets (3,320 observations) for your experiments (i.e. are these the data listed in your complete.csv?). Please let us know if that's incorrect.

  • Would you mind sharing your complete.csv and intents.json for SNIPS with us? We believe having the input data in the same format/splits is important for drawing a fair comparison between your work and our future work.
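For transparency, this is roughly how we produce our own splits of complete.csv (the 80/10/10 ratio, random seed, and file names here are illustrative choices on our side, not taken from your setup):

import pandas as pd

df = pd.read_csv('complete.csv')
df = df.sample(frac=1.0, random_state=42).reset_index(drop=True)  # shuffle

n_train = int(0.8 * len(df))
n_valid = int(0.9 * len(df))
df.iloc[:n_train].to_csv('train.csv', index=False)          # 80% train
df.iloc[n_train:n_valid].to_csv('valid.csv', index=False)   # 10% validation
df.iloc[n_valid:].to_csv('test.csv', index=False)           # 10% test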

BERT Embeddings Fine-tuned or Not
Section 2.1 of your paper says “we back-propagate the embedding and SLU task losses only to the acoustic branch” because you think fine-tuning BERT will lead to overfitting. From this line, our understanding was that the BERT embeddings would be frozen. However, we've noticed that in the code BERT's parameters are passed to the Adam optimizer with a learning rate of 2e-5 (line 63 in experiment_triplet.py), implying that BERT is fine-tuned:
self.optimizer = torch.optim.Adam(
    [
        {'params': self.model.bert.parameters(), 'lr': args.learning_rate_bert},
        {'params': self.model.speech_encoder.parameters()},
        {'params': self.model.classifier.parameters()}
    ],
    lr=args.learning_rate)

We would appreciate clarification on whether BERT is fine-tuned and, if so, the reason you chose to fine-tune it. Furthermore, if BERT's parameters are not frozen, could you share some thoughts on fine-tuning BERT for 20 epochs (the default in the code), which may lead to overfitting and hurt the text embeddings? As mentioned in other papers about BERT, the typical number of fine-tuning epochs is 5 at most.
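For reference, here is a sketch (our illustration of what we expected, not code from the repo) of what freezing the BERT branch would look like, so that only the speech encoder and classifier are updated:

# Freeze the text branch so no gradients flow into BERT.
for p in self.model.bert.parameters():
    p.requires_grad = False

# The optimizer then only covers the trainable branches.
self.optimizer = torch.optim.Adam(
    [
        {'params': self.model.speech_encoder.parameters()},
        {'params': self.model.classifier.parameters()}
    ],
    lr=args.learning_rate)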
