Comments (8)
Hi,
I forgot to mention in the README that you need to use my modified torchtext, which supports the num_gpus argument: https://github.com/mansimov/pytorch_text_multigpu
I will update the README. Also, try using PyTorch 0.4.* for consistency.
Can you try it and let me know?
from dl4mt-nonauto.
Thank you for sharing the code!
I tried running your model with multiple GPU settings as follows, and I got an error from BucketIterator.
It seems that BucketIterator (from torchtext) does not accept the num_gpus argument.
I am using torchtext (version 0.3.1).
python run.py --dataset iwslt-ende --vocab_size 40000 --ffw_block highway --params small --lr_schedule anneal --fast --valid_repeat_dec 8 --use_argmax --next_dec_input both --denoising_prob 0.5 --layerwise_denoising_weight --use_distillation --num_gpus 3
2019-02-20 16:05:59 INFO: - random seed is 19920206
2019-02-20 16:05:59 INFO: - TRAINING CORPUS : /work01/kiyono/dl4mt-nonauto-data/iwslt/en-de/distill/ende/train.tags.en-de.bpe
2019-02-20 16:06:02 INFO: - before pruning : 195897 training examples
2019-02-20 16:06:02 INFO: - after pruning : 195897 training examples
Traceback (most recent call last):
  File "run.py", line 572, in <module>
    num_gpus=args.num_gpus)
TypeError: __init__() got an unexpected keyword argument 'num_gpus'
Do you have any ideas about how to avoid this error?
You can check out the "multigpu" branch.
I am already using the multigpu branch (commit: e15acb2).
In https://github.com/nyu-dl/dl4mt-nonauto/blob/multigpu/run.py#L536-L537, there is a num_gpus argument, which is not available in torchtext.data.BucketIterator (https://torchtext.readthedocs.io/en/latest/data.html#torchtext.data.BucketIterator).
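One way to tell in advance whether an installed iterator class will take the num_gpus keyword is to inspect its constructor signature. The following is only a minimal sketch: accepts_kwarg, StockIterator and PatchedIterator are hypothetical names made up for illustration, and the two classes are stand-ins, not the real torchtext code.

```python
import inspect


def accepts_kwarg(cls, name):
    """Return True if cls.__init__ can take `name` as a keyword argument."""
    params = inspect.signature(cls.__init__).parameters
    if name in params:
        return True
    # A **kwargs catch-all would also accept the keyword without raising.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in for an iterator without multi-GPU support.
class StockIterator:
    def __init__(self, dataset, batch_size, device=None):
        pass


# Hypothetical stand-in for an iterator patched to take num_gpus.
class PatchedIterator:
    def __init__(self, dataset, batch_size, device=None, num_gpus=1):
        pass


print(accepts_kwarg(StockIterator, "num_gpus"))    # False
print(accepts_kwarg(PatchedIterator, "num_gpus"))  # True
```

With torchtext installed, the same helper could be pointed at torchtext.data.BucketIterator before launching run.py with --num_gpus; a False result would suggest that the stock package is the one being imported.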
Thank you for your reply!
I will try the modified version and see what happens.
Thank you very much for sharing the code. How can we run it to get performance consistent with the paper, specifically for the IWSLT16 En-De experiment? I have tried running it, but BLEU is consistently about five to six points below the paper's result. Could you give us a specific set of settings for IWSLT-ENDE? Thank you very much.
Off the top of my head, try running the following script in the main branch:
python run.py --dataset iwslt-ende --vocab_size 40000 --load_vocab --ffw_block highway --params small --batch_size 2048 --eval_every 1000 --lr_schedule anneal --fast --valid_repeat_dec 20 --use_argmax --next_dec_input both --denoising_prob --layerwise_denoising_weight --use_distillation
After training, you need to train the length prediction module by running the above script again with --load_from set to the trained model, plus --resume --trg_len_option predict --finetune_trg_len
The script should be similar in the multigpu branch:
python run.py --dataset iwslt-ende --vocab_size 40000 --load_vocab --ffw_block highway --params small --batch_size 2048 --num_gpus 2 --eval_every 1000 --lr_schedule anneal --fast --valid_repeat_dec 20 --use_argmax --next_dec_input both --denoising_prob --layerwise_denoising_weight --use_distillation
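The second stage described above (training the length prediction module) reuses the first-stage command and appends four extra flags. As a rough sketch, the two commands could be assembled like this; length_finetune_command is a hypothetical helper, MODEL_NAME is a placeholder for whatever name the first run saved its model under, and the base command is abbreviated for illustration.

```python
def length_finetune_command(base_command, model_name):
    """Append the length-prediction fine-tuning flags to a training command."""
    return list(base_command) + [
        "--load_from", model_name,
        "--resume",
        "--trg_len_option", "predict",
        "--finetune_trg_len",
    ]


# Abbreviated first-stage command; the full flag list is in the reply above.
base = ["python", "run.py", "--dataset", "iwslt-ende", "--params", "small"]

# "MODEL_NAME" is a placeholder, not a real checkpoint name.
stage_two = length_finetune_command(base, "MODEL_NAME")
print(" ".join(stage_two))
```

Keeping the base command in one list means both stages share exactly the same model and data flags, so only the added flags differ between the runs.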
@mansimov I installed the modified version of torchtext and confirmed that the training actually works.
Thank you again for your advice.
Great!
@butsugiri & @baoy-nlp feel free to ask me any other questions and update me on your progress!