Comments (11)
Hi @aarzchan
Can you post here the script that you are running to reproduce the image captioning results? Likely some flag is missing or incorrectly set. Thanks!
And yes, the output format is correct.
from dl4mt-nonauto.
For the AR model, I'm running:
python run.py --dataset mscoco --params big --load_vocab --mode test --n_layers 4 --ffw_block highway --debug --load_from mscoco_models_final/ar_model --batch_size 1024
For the NAR model, I'm running:
python run.py --dataset mscoco --params big --use_argmax --load_vocab --mode test --n_layers 4 --fast --ffw_block highway --debug --trg_len_option predict --use_predicted_trg_len --load_from mscoco_models_final/nar_model --batch_size 1024
I just downloaded the pretrained MSCOCO models and data from the main branch, ran the scripts you posted here, and I can reproduce the previous results:
Here is the output for the autoregressive model
Here is the output for the non-autoregressive model
The flags and the script you tried are correct, so I think there is likely an issue with the data.
Have you set the correct paths to the data (lines 44 - 56) in your copy of https://github.com/nyu-dl/dl4mt-nonauto/blob/master/run.py#L418 ? Also, did you use the vocab.pkl file that was already provided rather than recreating the vocabulary for MSCOCO?
Also, for the MSCOCO models, --vocab_size should be 10000, but it shouldn't affect things, since the code forces you to load the already-created vocabulary: https://github.com/nyu-dl/dl4mt-nonauto/blob/master/run.py#L418
Let me know how I can help!
Thanks for looking into this for me!
Yes, I did change the paths in lines 44 - 56 of data.py to my own dataset path; otherwise, there would be a runtime error at that point.
I didn't make any modifications to the MSCOCO dataset. All I did was download it and unzip it.
Also, I forgot to mention that, when I initially ran the script, I got the following error at line 46 of model.py:
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #2 'mat2'
I worked around this by adding channels = channels.float() right before that line. I'm not sure whether this affects the results at all.
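For context, PyTorch requires both operands of a matrix multiply to share a dtype, which is what this error is complaining about. A minimal standalone sketch of the failure and the cast (not the actual model.py code; shapes and names here are made up):

```python
import torch

# A float activation multiplied by a LongTensor "mat2" reproduces the
# error ("Expected ... FloatTensor but found ... LongTensor for
# argument #2 'mat2'"); shown on CPU dtypes here.
x = torch.randn(3, 8)                   # torch.float32
channels = torch.randint(0, 2, (8, 4))  # torch.int64 (LongTensor)

try:
    out = torch.matmul(x, channels)     # dtype mismatch -> RuntimeError
except RuntimeError:
    out = torch.matmul(x, channels.float())  # the cast described above

print(out.dtype)  # torch.float32
```

The cast itself is numerically lossless for small integer values, so whether it changes the model's results depends on what line 46 actually computes.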
Yes, the change you made on line 46 affects the results for some reason and makes performance much worse. I tried it myself after upgrading to PyTorch 1.0 and saw the results you attached at the beginning of this GitHub thread.
Would you mind temporarily running the code in an environment with PyTorch 0.3 or 0.4 while I try to figure out why this change causes problems?
Oh, I see. Actually, I was using PyTorch 0.4.1 to run the code from the multigpu branch.
With PyTorch 0.3.1 on the multigpu branch, I get this PyTorch-version-related error:
Traceback (most recent call last):
  File "run.py", line 667, in <module>
    names=["test."+xx for xx in names], maxsteps=None)
  File "/home/aarchan/dl4mt-nonauto_multigpu/decode.py", line 207, in decode_model
    with torch.no_grad():
AttributeError: module 'torch' has no attribute 'no_grad'
However, with PyTorch 0.3.1 on the main branch, I'm able to reproduce the same results you attached earlier:
AR model:
iter 1 | BLEU = 23.47, 68.3/33.2/16.3/8.2
NAR model:
iter 1 | BLEU = 20.12, 66.9/29.6/13.2/6.3
iter 2 | BLEU = 20.87, 67.2/30.4/13.9/6.7
iter 3 | BLEU = 21.04, 67.2/30.6/14.0/6.8
iter 4 | BLEU = 21.12, 67.2/30.6/14.1/6.9
I guess the MSCOCO experiments were not updated in the multigpu branch?
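In case it helps, the torch.no_grad incompatibility above could be papered over with a small shim (a sketch using only the standard library; torch.no_grad was added in PyTorch 0.4, and note the fallback does NOT actually disable autograd on 0.3.x, where volatile=True Variables were the mechanism):

```python
import contextlib

def get_no_grad(torch_module):
    """Return torch.no_grad when available (PyTorch >= 0.4); otherwise
    a no-op context manager so `with no_grad():` at least parses and
    runs. Caveat: the no-op fallback does not disable gradient
    tracking on 0.3.x."""
    return getattr(torch_module, "no_grad", contextlib.suppress)

# Hypothetical usage at the top of decode.py:
#   no_grad = get_no_grad(torch)
#   with no_grad():
#       output = model(batch)
```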
Good to hear that you managed to reproduce it.
Yes, I haven't looked in detail into running the multigpu branch on the MSCOCO dataset; I mainly used the multigpu branch for the WMT14 En-De experiments.
I was using PyTorch 0.4.0 on the master branch to reproduce the MSCOCO experiments.
I will keep you updated once this issue is figured out.
Thanks
Got it. Thanks again for your help!
I was looking at the MSCOCO image captioning results in some other image captioning papers, and I noticed your AR model's BLEU-4 score (8.2) is much lower than those reported in other papers (most recent papers report 30+ BLEU-4). For example, an older paper, Karpathy & Fei-Fei, 2015, reports BLEU-4 scores of 23.0 and 10.0 for their model and a nearest-neighbor baseline, respectively.
I understand that the purpose of your experiment was to show that NAR is much faster than AR while getting similar performance, but I was wondering why there is such a large performance gap between your AR baseline and other AR models. Please let me know if I'm overlooking something here. Thanks!
The BLEU-4 score reported by Karpathy & Fei-Fei and by Xu et al. (http://proceedings.mlr.press/v37/xuc15.pdf) is effectively the final BLEU score that we report in our paper. See the footnote in the Xu et al. paper ("BLEU-n is the geometric average of the n-gram precision. For instance, BLEU-1 is the unigram precision, and BLEU-2 is the geometric average of the unigram and bigram precision"). In our case, BLEU-4 is not a geometric average but only the 4-gram precision, which is why our BLEU-4 and theirs differ.
Also, in the paper by Xu et al. they say that "we report BLEU4 from 1 to 4 without a brevity penalty," whereas we use a brevity penalty.
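To make the distinction concrete, here is a toy pure-Python sketch (the example sentences are made up) contrasting the individual 4-gram precision with the cumulative geometric-mean BLEU-4; brevity penalty omitted for clarity:

```python
import math
from collections import Counter

def ngram_precision(hyp, ref, n):
    """Modified n-gram precision of a hypothesis against one reference."""
    h = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    r = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(c, r[g]) for g, c in h.items())
    return overlap / max(sum(h.values()), 1)

hyp = "a man riding a bike down a street".split()
ref = "a man riding a bicycle down the street".split()

# Individual 4-gram precision (the last number in the outputs above):
p4 = ngram_precision(hyp, ref, 4)

# Cumulative BLEU-4 in the Xu et al. sense: geometric mean of the
# 1- to 4-gram precisions.
ps = [ngram_precision(hyp, ref, n) for n in range(1, 5)]
bleu4 = math.exp(sum(math.log(p) for p in ps) / 4) if min(ps) > 0 else 0.0

print(p4, bleu4)  # the cumulative score is well above the plain 4-gram precision
```

Since the lower-order precisions are always at least as high as the 4-gram precision, the geometric mean is always at least as large, which accounts for the apparent gap between the two conventions.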
Okay, I see. Thanks for the clarification!