Hi authors, I take the provided pretrained 200k checkpoint and did t

Flickr30k Finetune results does not match the provided checkpoint

dandelin commented on August 23, 2024

Flickr30k Finetune results does not match the provided checkpoint

from vilt.

Comments (10)

dandelin commented on August 23, 2024

@JACKHAHA363
Sure, you can grab it here: https://www.dropbox.com/s/lcqmbx587szaox3/vilt_100k_wwm_pretrain.ckpt?dl=0 (the link will expire someday)

dandelin commented on August 23, 2024

@JACKHAHA363

The fine-tuning results can be unstable due to augmentations. Also, we trained the IR/TR fine-tuning models only once.
You may increase the number of training epochs (more than 10 epochs, maybe 20 epochs?) to get more stable and better results.

JACKHAHA363 commented on August 23, 2024

I tried training for more epochs, but that ended up overfitting, with increasing val loss. Would you mind also providing the checkpoint for 100k steps?

JACKHAHA363 commented on August 23, 2024

thanks @dandelin!

yangxiaofeng commented on August 23, 2024

Were you able to solve this issue, @JACKHAHA363? I have similar issues on both Flickr and COCO retrieval.

byougert commented on August 23, 2024

Hi, bro.
I found that the IR/TR evaluation result on Flickr is still unstable, even when using the official fine-tuned checkpoint. Sometimes I got 63.94(ir)/83.6(tr), and sometimes it changed to 64.3(ir)/83.7(tr). What do you think about it? @dandelin @JACKHAHA363

byougert commented on August 23, 2024

Hi. Thanks for your reply. But I find that shuffle in the IR/TR image loader (code) is already False. In fact, shuffle is not explicitly set to False, but the default value in torch.utils.data.DataLoader is False. Besides, shuffle in dist_sampler is also False. I have no idea what causes the unstable result. T_T ...

…

------------------ Original message ------------------
From: "Wonjae ***@***.***>
Sent: Thursday, December 30, 2021, 10:36 PM
To: ***@***.***>
Cc: "by ***@***.***>; ***@***.***>
Subject: Re: [dandelin/ViLT] Flickr30k Finetune results does not match the provided checkpoint (#13)

Hi @byougert

Interpolating the position embedding dynamically for every image batch can cause subtle fluctuations during batched evaluation. When computing IR/TR, we pass a batch of images to vit.visual_embed (code), and inside the vit.visual_embed method, we dynamically calculate the maximum width and height of the given batch of images and interpolate the pos_embed up to those dimensions (code).

I think setting the shuffle=False argument in the IR/TR image loader (code) would stop the evaluation from fluctuating by fixing the order of image batches. It seems I somehow deleted the argument while polishing the code for public release. Please set the argument and let me know if it fixes the issue.
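To make the batch-dependent sizing described in the quoted message concrete, here is a minimal sketch. It is not the ViLT code: `batch_grid`, the patch size of 32, and the image shapes are all made up for illustration. It only shows that the maximum height and width, and therefore the grid a position embedding would be interpolated to, depend on which images happen to share a batch.

```python
# Toy illustration only: batch_grid and PATCH are hypothetical names,
# and the image shapes below are invented, not taken from Flickr30k.
PATCH = 32  # assumed patch size


def batch_grid(shapes, patch=PATCH):
    """Return the (rows, cols) patch grid for the largest height and width in a batch."""
    max_h = max(h for h, _ in shapes)
    max_w = max(w for _, w in shapes)
    return max_h // patch, max_w // patch


# Four images given as (height, width).
images = [(384, 576), (384, 384), (480, 384), (384, 640)]

# The same four images, grouped into batches of two in different orders.
batches_a = [images[0:2], images[2:4]]
batches_b = [[images[0], images[2]], [images[1], images[3]]]

grids_a = [batch_grid(b) for b in batches_a]
grids_b = [batch_grid(b) for b in batches_b]

print(grids_a)  # grid per batch under the first grouping
print(grids_b)  # grid per batch under the second grouping
```

Under these assumptions, the image at index 0 lands in a 12x18 grid with the first grouping and a 15x18 grid with the second, so a fixed batch order is what keeps the target size the same from run to run.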

dandelin commented on August 23, 2024

Hi @byougert

Oops, you got the mail. I deleted the comment right after posting it, as I noticed I had put shuffle=False in DistributedSampler(image_dset, shuffle=False).

After a quick investigation, though, I found the true reason.
It was precision=16, set in https://github.com/dandelin/ViLT/blob/master/run.py#L51.
After setting precision=32 during evaluation, I was able to get stable results.

I guess the scores from rank_output are very close to one another, so they need higher precision.
Thanks for the report. I will revise EVAL.md. :)
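A small sketch of why closely spaced scores can be fragile in half precision. This is an illustration under assumptions, not the ViLT evaluation code: the three scores are made up, and `to_half` and `rank_bounds` are hypothetical helpers. It uses only the standard library's `struct` module, whose `"e"` format is IEEE 754 half precision.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def rank_bounds(scores, target):
    """Best- and worst-case 1-based rank of scores[target] when ties can break either way."""
    s = scores[target]
    higher = sum(1 for v in scores if v > s)
    tied = sum(1 for v in scores if v == s) - 1  # exclude the target itself
    return higher + 1, higher + tied + 1


# Made-up similarity scores that differ only in the fourth decimal place.
scores = [0.9800, 0.9801, 0.9802]
half_scores = [to_half(v) for v in scores]

full_bounds = rank_bounds(scores, 2)       # rank of the last item at full precision
half_bounds = rank_bounds(half_scores, 2)  # rank of the same item after fp16 rounding

print(half_scores)
print(full_bounds, half_bounds)
```

Between 0.5 and 1, half-precision values are spaced 2^-11 (about 0.0005) apart, so the three scores collapse to the same number and the rank of the correct item is decided by tie-breaking rather than by the model.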

byougert commented on August 23, 2024

Hi, bro.
Yes, I received your message in my mail but couldn't find the reply on GitHub. hhhh....
Thanks for your reply and the nice work.

byougert commented on August 23, 2024

Hi, @dandelin
I'm sorry to say that the result is still puzzling. Last night, when I changed precision to 32 during evaluation, I got two similar but NOT identical results: one was 0.6480(ir)/0.8370(tr) and the other was 0.6460(ir)/0.8370(tr).
Actually, the seed is fixed to 0. I have no idea what causes the difference. Y_Y
