Comments (11)

mansimov commented on July 20, 2024

Hi @aarzchan

Can you post here the script that you are running to reproduce the image captioning results? Likely, some flag is missing or incorrectly set. Thanks!

And yes, the output format is correct.

aarzchan commented on July 20, 2024

For the AR model, I'm running:
python run.py --dataset mscoco --params big --load_vocab --mode test --n_layers 4 --ffw_block highway --debug --load_from mscoco_models_final/ar_model --batch_size 1024

For the NAR model, I'm running:
python run.py --dataset mscoco --params big --use_argmax --load_vocab --mode test --n_layers 4 --fast --ffw_block highway --debug --trg_len_option predict --use_predicted_trg_len --load_from mscoco_models_final/nar_model --batch_size 1024

mansimov commented on July 20, 2024

@aarzchan

I just downloaded the pretrained MSCOCO models and data from the main branch, ran the scripts you posted here, and I can reproduce the previous results:

Here is the output for the autoregressive model:
[screenshot: mscoco_ar_bleu]

Here is the output for the non-autoregressive model:
[screenshot: mscoco_nar_bleu]

The flags and the script you tried are correct, so I think there is likely an issue with the data.

Have you set the correct paths to the data (lines 44 - 56 of data.py) for your directory? See https://github.com/nyu-dl/dl4mt-nonauto/blob/master/run.py#L418. Also, you didn't recreate the vocab for MSCOCO, and used the vocab.pkl file that was already provided, right?

Also, for the MSCOCO models --vocab_size is 10000, but it shouldn't affect anything, since --load_vocab forces you to load the already created vocabulary: https://github.com/nyu-dl/dl4mt-nonauto/blob/master/run.py#L418

Let me know how I can help!

aarzchan commented on July 20, 2024

Thanks for looking into this for me!

Yes, I did change the paths in lines 44 - 56 of data.py to my own dataset path. Otherwise, there would be a runtime error at that part.

I didn't make any modifications to the MSCOCO dataset. All I did was download it and unzip it.

Also, I forgot to mention that, when I initially ran the script, I got the following error in line 46 of model.py:
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #2 'mat2'

I fixed this by adding channels = channels.float() right before this line. Not sure if this affects the results at all.
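
For context, here is a minimal standalone sketch of the dtype clash behind that error; the tensor names are placeholders, not the actual variables in model.py:

import torch

proj = torch.randn(4, 8)                   # float matrix, stand-in for the first operand of the mm call
channels = torch.randint(0, 2, (8, 16))    # integer (Long) tensor, analogous to channels

try:
    out = proj.mm(channels)                # mm requires both operands to share a dtype
except RuntimeError as err:
    print(err)                             # "Expected ... FloatTensor but found ... LongTensor ... 'mat2'"

out = proj.mm(channels.float())            # the cast fixes the type error; the numeric values themselves are unchanged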

mansimov commented on July 20, 2024

@aarzchan

Yes, the change you made on line 46 affects the results for some reason and makes performance much worse. I tried it myself after upgrading to PyTorch 1.0, and I see the results you attached at the beginning of this GitHub thread.

Would you mind temporarily running the code in an environment with PyTorch 0.3 or 0.4 while I try to figure out why this change causes problems?

aarzchan commented on July 20, 2024

@mansimov

Oh, I see. Actually, I was using PyTorch 0.4.1 to run the code from the multigpu branch.

For PyTorch 0.3.1 on the multigpu branch, I get this PyTorch-version-related error:

Traceback (most recent call last):
  File "run.py", line 667, in <module>
    names=["test."+xx for xx in names], maxsteps=None)
  File "/home/aarchan/dl4mt-nonauto_multigpu/decode.py", line 207, in decode_model
    with torch.no_grad():
AttributeError: module 'torch' has no attribute 'no_grad'
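
(Side note: torch.no_grad() only exists from PyTorch 0.4 onward, which is what this AttributeError is about. A rough compatibility shim one could add, not something the repo provides; under 0.3, gradient tracking would still have to be disabled via Variable(..., volatile=True):)

import contextlib
import torch

if hasattr(torch, "no_grad"):          # PyTorch >= 0.4: use the real context manager
    no_grad = torch.no_grad
else:                                  # PyTorch 0.3: no global switch exists, so this shim
    @contextlib.contextmanager         # only keeps the `with` block syntactically valid;
    def no_grad():                     # inference inputs still need Variable(..., volatile=True)
        yield

with no_grad():
    pass  # decoding code (e.g. decode_model in decode.py) would go here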

However, with PyTorch 0.3.1 on the master branch, I'm able to reproduce the same results you attached earlier:

AR model:
iter 1 | BLEU = 23.47, 68.3/33.2/16.3/8.2

NAR model:

iter 1 | BLEU = 20.12, 66.9/29.6/13.2/6.3
iter 2 | BLEU = 20.87, 67.2/30.4/13.9/6.7
iter 3 | BLEU = 21.04, 67.2/30.6/14.0/6.8
iter 4 | BLEU = 21.12, 67.2/30.6/14.1/6.9

I guess the MSCOCO experiments were not updated in the multigpu branch?

mansimov commented on July 20, 2024

Good to hear that you managed to reproduce it.
Yes, I haven't looked in detail at running the multigpu branch on the MSCOCO dataset; I mainly used the multigpu branch for the WMT14 En-De experiments.

I was using PyTorch 0.4.0 on the master branch to reproduce the MSCOCO experiments.

I will keep you updated once this issue is figured out.
Thanks

aarzchan commented on July 20, 2024

Got it. Thanks again for your help!

aarzchan commented on July 20, 2024

@mansimov

I was looking at the MSCOCO image captioning results in some other image captioning papers, and I noticed that your AR model's BLEU-4 score (8.2) is much lower than what those papers report (most recent papers report 30+ BLEU-4). For example, an older paper, Karpathy & Fei-Fei, 2015, reports BLEU-4 scores of 23.0 and 10.0 for their model and a nearest-neighbor baseline, respectively.

I understand that the purpose of your experiment was to show that NAR is much faster than AR while getting similar performance, but I was wondering why there is such a large performance gap between your AR baseline and other AR models. Please let me know if I'm overlooking something here. Thanks!

mansimov commented on July 20, 2024

@aarzchan

The BLEU-4 score reported in the papers by Karpathy & Fei-Fei and Xu et al. (http://proceedings.mlr.press/v37/xuc15.pdf) is effectively the final BLEU score that we report in our paper. See the footnote in the Xu et al. paper: "BLEU-n is the geometric average of the n-gram precision. For instance, BLEU-1 is the unigram precision, and BLEU-2 is the geometric average of the unigram and bigram precision." In our case, BLEU-4 is not a geometric average but only the 4-gram precision, which is why our BLEU-4 and theirs differ.

Also, in the paper by Xu et al. they say that "we report BLEU from 1 to 4 without a brevity penalty", whereas we use the brevity penalty.
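
A quick sanity check on the numbers above (assuming the "BLEU = 23.47, 68.3/33.2/16.3/8.2" line follows the usual multi-bleu style, i.e. the overall score followed by the 1- to 4-gram precisions):

import math

precisions = [68.3, 33.2, 16.3, 8.2]   # 1- to 4-gram precisions from the AR model output above
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
print(round(geo_mean, 2))              # ~23.46: matches the headline "BLEU = 23.47" up to rounding
                                       # and the brevity penalty, so that headline number (not the
                                       # trailing 8.2, the 4-gram precision alone) is the figure
                                       # comparable to BLEU-4 in Karpathy & Fei-Fei / Xu et al.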

aarzchan commented on July 20, 2024

@mansimov

Okay, I see. Thanks for the clarification!
