
alexa / alexa-end-to-end-slu


This setup allows training end-to-end neural models for spoken language understanding (SLU).

License: Apache License 2.0

Python 100.00%

alexa-end-to-end-slu's People

Contributors

amazon-auto, markus-amzn

alexa-end-to-end-slu's Issues

dataset

Hi, could you send me the dataset (complete.csv)? I cannot run the code without it.

DataParallel wrapper doesn't allow access to model attributes

Running the code with the '--distributed' flag raises an error because in experiments/experiment_triplet.py (lines 67-68), the default DataParallel is used as a wrapper around 'model':
if args.distributed and torch.cuda.is_available() and torch.cuda.device_count() > 1:
    self.model = torch.nn.DataParallel(self.model)

However, when self.model.bert is accessed later, the DataParallel object does not expose the wrapped model's attributes. I think a custom wrapper has to be implemented that resolves such attributes through self.model.module (e.g. self.model.module.bert); a sketch of one is given below.
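For illustration, a minimal sketch of such a wrapper (the class name and exact behaviour here are our suggestion, not code from the repo), which forwards unknown attribute lookups to the wrapped module:

import torch

class DataParallelPassthrough(torch.nn.DataParallel):
    def __getattr__(self, name):
        try:
            # DataParallel's own attributes (e.g. 'module') resolve as usual.
            return super().__getattr__(name)
        except AttributeError:
            # Fall back to the wrapped model for anything else (e.g. 'bert').
            return getattr(self.module, name)

# Usage in experiment_triplet.py instead of the plain wrapper:
# self.model = DataParallelPassthrough(self.model)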

Forward algorithm issue - models/model.py

Hi,
I am running your code with some custom-created splits for the FSC dataset. Running lines 123 and 135 of forward() and forward_text(), respectively, in the models/model.py file causes the following error message:
Traceback (most recent call last):
  File "train.py", line 35, in <module>
    runner.train()
  File ".../alexa-end-to-end-slu/experiments/experiment_base.py", line 68, in train
    train_loss, train_acc = self.train_step(batch)
  File ".../alexa-end-to-end-slu/experiments/experiment_base.py", line 121, in train_step
    metrics = self.compute_loss(batch)
  File ".../alexa-end-to-end-slu/experiments/experiment_triplet.py", line 111, in compute_loss
    output_pos = self.model(input_text=batch['encoded_text2'],
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File ".../alexa-end-to-end-slu/models/model.py", line 108, in forward
    return self.forward_text(input_text, text_lengths)
  File ".../alexa-end-to-end-slu/models/model.py", line 137, in forward_text
    text_logits = self.classifier(text_embedding)
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/aostapen/miniconda3/envs/cs2/lib/python3.8/site-packages/torch/nn/functional.py", line 1368, in linear
    if input.dim() == 2 and bias is not None:
AttributeError: 'str' object has no attribute 'dim'

This is because text_embedding actually ends up holding the string key 'pooler_output' from the output returned by BertModel's forward(). Perhaps these lines should be:
_, text_embedding = self.bert(input_ids=input_text, attention_mask=attn_mask)[:2]
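Alternatively (a sketch from our side, assuming a transformers version whose BertModel returns a ModelOutput object), the pooled embedding can be selected by name rather than by position:

outputs = self.bert(input_ids=input_text, attention_mask=attn_mask)
text_embedding = outputs.pooler_output  # pooled [CLS] representation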

Thanks in advance.

Requests for clarifications on FSC and SNIPS

We are a group of NYU MS in Data Science students who are working on developing an end-to-end speech-to-intent model. We have read your paper and replicated your code and would love to ask you some questions.

Paper vs. Github Results Discrepancy
We notice that the final test accuracies for both FSC and SNIPS differ between your paper (i.e. 97.65% for FSC, 73.49% for SNIPS) and the GitHub repo (i.e. 95.65% for FSC, 69.88% for SNIPS). Can you share some thoughts on the difference between the numbers in the paper and the repo?

SNIPS Data Partition Ambiguity
In prepare_snips.py, we notice that you split complete.csv into train-val-test partitions. However, since we don't have the complete.csv that you used, we can't replicate the exact same partitions. Our results from running your code on the SNIPS dataset with our own splits are significantly higher on average: we ran 4 times (each time with our own splits of a shuffled complete.csv; a sketch of our splitting procedure is given after the questions below), and the average accuracy is 81.17%, even though we use the same environment mentioned in your GitHub repo. We'd love to double-check these points with you.

  • Which subsets of the SNIPS dataset did you use to create complete.csv? Our guess is that you used the smartLight close-field and far-field subsets (3,320 observations) for your experiments (i.e. are these the data listed in your complete.csv?). Please let us know if that's incorrect.

  • Would you mind sharing your complete.csv and intents.json for SNIPS with us? We believe having the input data in the same format/splits is important for drawing a fair comparison between your work and our future work.
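For transparency, this is roughly how we produce our own splits of complete.csv (the 80/10/10 ratio, random seed, and file names here are illustrative choices on our side, not taken from your setup):

import pandas as pd

df = pd.read_csv('complete.csv')
df = df.sample(frac=1.0, random_state=42).reset_index(drop=True)  # shuffle

n_train = int(0.8 * len(df))
n_valid = int(0.9 * len(df))
df.iloc[:n_train].to_csv('train.csv', index=False)          # 80% train
df.iloc[n_train:n_valid].to_csv('valid.csv', index=False)   # 10% validation
df.iloc[n_valid:].to_csv('test.csv', index=False)           # 10% test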

BERT Embeddings Fine-tuned or Not
Section 2.1 of your paper says “we back-propagate the embedding and SLU task losses only to the acoustic branch” because you think fine-tuning BERT will lead to overfitting. From this line, our understanding was that the BERT embeddings would be frozen. However, we've noticed that in the code BERT's parameters are passed to the Adam optimizer with a learning rate of 2e-5 (line 63 in experiment_triplet.py), implying that BERT is fine-tuned:
self.optimizer = torch.optim.Adam(
    [
        {'params': self.model.bert.parameters(), 'lr': args.learning_rate_bert},
        {'params': self.model.speech_encoder.parameters()},
        {'params': self.model.classifier.parameters()}
    ],
    lr=args.learning_rate)

We would appreciate clarification on whether BERT is fine-tuned and, if so, the reason you chose to fine-tune it. Furthermore, if BERT's parameters are not frozen, could you share some thoughts on fine-tuning BERT for 20 epochs (the default in the code), which may lead to overfitting and hurt the text embeddings? As mentioned in other papers about BERT, the typical number of fine-tuning epochs is 5 at most.
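For reference, here is a sketch (our illustration of what we expected, not code from the repo) of what freezing the BERT branch would look like, so that only the speech encoder and classifier are updated:

# Freeze the text branch so no gradients flow into BERT.
for p in self.model.bert.parameters():
    p.requires_grad = False

# The optimizer then only covers the trainable branches.
self.optimizer = torch.optim.Adam(
    [
        {'params': self.model.speech_encoder.parameters()},
        {'params': self.model.classifier.parameters()}
    ],
    lr=args.learning_rate)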
