Comments (10)
Hi @renziver,
It seems like the validation batch is empty. Could you please double-check that the path to the validation data is valid? The training log should report the size of the validation data, which may help us debug.
from joeynmt.
Hi @juliakreutzer,
I checked the config file to see if the path is valid and it is indeed correct as verified by the training log:
I also checked the files themselves, and they are not empty.
Thanks, @renziver, I'll take a look. Maybe something broke with the last batch multiplier update. Could you please try with (eval_)batch_type: "sentence" and a batch_size of around 64?
Hi @juliakreutzer,
I changed the batch type to sentence and the batch size as well and a new error showed up:
2020-06-24 04:33:01,547 Epoch 1 Step: 3900 Batch Loss: 4.006904 Tokens per Sec: 6760, Lr: 0.000300
2020-06-24 04:33:12,622 Epoch 1 Step: 4000 Batch Loss: 3.290356 Tokens per Sec: 6911, Lr: 0.000300
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/renz_baliber_senti_com_ph/joeynmt/joeynmt/__main__.py", line 41, in <module>
    main()
  File "/home/renz_baliber_senti_com_ph/joeynmt/joeynmt/__main__.py", line 29, in main
    train(cfg_file=args.config_path)
  File "/home/renz_baliber_senti_com_ph/joeynmt/joeynmt/training.py", line 653, in train
    trainer.train_and_validate(train_data=train_data, valid_data=dev_data)
  File "/home/renz_baliber_senti_com_ph/joeynmt/joeynmt/training.py", line 378, in train_and_validate
    batch_type=self.eval_batch_type
  File "/home/renz_baliber_senti_com_ph/joeynmt/joeynmt/prediction.py", line 98, in validate_on_data
    batch, loss_function=loss_function)
  File "/home/renz_baliber_senti_com_ph/joeynmt/joeynmt/model.py", line 133, in get_loss_for_batch
    trg_mask=batch.trg_mask)
  File "/home/renz_baliber_senti_com_ph/joeynmt/joeynmt/model.py", line 80, in forward
    trg_mask=trg_mask)
  File "/home/renz_baliber_senti_com_ph/joeynmt/joeynmt/model.py", line 117, in decode
    trg_mask=trg_mask)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/renz_baliber_senti_com_ph/joeynmt/joeynmt/decoders.py", line 510, in forward
    x = self.pe(trg_embed) # add position encoding to word embedding
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/renz_baliber_senti_com_ph/joeynmt/joeynmt/transformer_layers.py", line 159, in forward
    return emb + self.pe[:, :emb.size(1)]
RuntimeError: The size of tensor a (5141) must match the size of tensor b (5000) at non-singleton dimension 1
I checked whether it has something to do with my validation pair, but both files have the same number of lines:
renz-iver:~/joeynmt$ wc -l data/one2many/valid.bpe.src
5600 data/one2many/valid.bpe.src
renz-iver:~/joeynmt$ wc -l data/one2many/valid.bpe.tgt
5600 data/one2many/valid.bpe.tgt
Hi! Do you have a sentence that is longer than 5000 tokens? The position embeddings might be limited to 5000. If that's the issue, you can make them go up to 6000 or so.
Hi @bastings
Should I do that by increasing the embedding dimensions / hidden size of the transformer?
Hi,
Please change 5000 here to 10000 or so:
https://github.com/joeynmt/joeynmt/blob/master/joeynmt/transformer_layers.py#L131
And let us know if that helps.
(Also, are you really feeding a sequence that long?)
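For anyone wondering where the two numbers in the error come from, here is a minimal plain-Python sketch. It is not JoeyNMT's code: `build_position_table` and `rows_available` are hypothetical names, and it only assumes what the traceback suggests, namely that the position encodings are precomputed up to a fixed maximum length and then sliced to the input length (`self.pe[:, :emb.size(1)]`). Slicing past the end of the table does not fail; it just returns fewer rows, and the mismatch shows up in the addition afterwards.

```python
import math


def build_position_table(max_len, dim):
    """Precompute one sinusoidal encoding row per position, up to max_len."""
    table = []
    for pos in range(max_len):
        row = []
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))  # even index: sine
            row.append(math.cos(angle))  # odd index: cosine
        table.append(row[:dim])
    return table


def rows_available(table, seq_len):
    """Slicing past the end of the table silently returns fewer rows."""
    return len(table[:seq_len])


# A tiny dim keeps the sketch fast; only the number of rows matters here.
table = build_position_table(max_len=5000, dim=4)
print(rows_available(table, 100))   # 100: enough rows for a normal sentence
print(rows_available(table, 5141))  # 5000: fewer rows than the input needs
```

Raising the maximum length makes the table long enough, but a 5141-token input usually points to a data problem rather than a model limit.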
Hi @bastings, I will take another look at the data to check why the filtering step didn't work. I included a step that filters out sentences longer than 100, so I will have to double-check it. Thank you for the help.
It was indeed an error in my filtering step. Training now works with the sentence batch type. Thank you, @juliakreutzer and @bastings.
Hi @renziver, I hit the same runtime error during training:
RuntimeError: The size of tensor a (12805) must match the size of tensor b (5000) at non-singleton dimension 1
In my config YAML, I set max_sent_length: 300. How did you find the error in your filtering step?
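One way to look for the offending lines is to count tokens per line directly in the data files. The sketch below is only an illustration, not something taken from this thread's fix: `find_long_lines` and `report` are hypothetical helper names, and it assumes tokens are separated by whitespace, as in BPE-segmented files.

```python
def find_long_lines(path, max_tokens):
    """Return (line_number, token_count) for lines with more than max_tokens tokens."""
    too_long = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            n_tokens = len(line.split())  # assumption: whitespace-separated tokens
            if n_tokens > max_tokens:
                too_long.append((line_number, n_tokens))
    return too_long


def report(paths, max_tokens):
    """Print every over-long line found in each of the given files."""
    for path in paths:
        for line_number, n_tokens in find_long_lines(path, max_tokens):
            print(f"{path}:{line_number}: {n_tokens} tokens")
```

For the files mentioned earlier in this thread, a call might look like `report(["data/one2many/valid.bpe.src", "data/one2many/valid.bpe.tgt"], max_tokens=100)`. It may be worth checking the validation and test files as well as the training files, since the traceback above was raised during validation.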