what size of your A100 gpu's memory? about albef HOT 9 CLOSED

salesforce commented on May 16, 2024

How much memory do your A100 GPUs have, 80GB or 40GB? Also, which method do you use to accelerate data reading?

from albef.

Comments (9)

LiJunnan1992 commented on May 16, 2024

Hi, I used the 40GB GPUs.
Data reading was not a major speed bottleneck, but you could try resizing the images on the hard drive, because some of them may have a high resolution.
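
A minimal sketch of the resize-on-disk idea, not code from the ALBEF repository. The helper names `target_size` and `shrink_image`, the 512-pixel cap, and the JPEG quality are all assumptions for illustration; the resize step itself assumes Pillow is installed.

```python
from __future__ import annotations

from pathlib import Path


def target_size(width: int, height: int, max_side: int = 512) -> tuple[int, int]:
    """Return (width, height) scaled so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; never upscale
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def shrink_image(src: Path, dst: Path, max_side: int = 512) -> None:
    """Write a downscaled RGB JPEG copy of src to dst (requires Pillow)."""
    from PIL import Image  # imported lazily so target_size works without Pillow

    with Image.open(src) as img:
        img = img.convert("RGB")
        new_size = target_size(img.width, img.height, max_side)
        if new_size != img.size:
            img = img.resize(new_size, Image.BICUBIC)
        dst.parent.mkdir(parents=True, exist_ok=True)
        img.save(dst, format="JPEG", quality=95)
```

Running `shrink_image` once over the dataset and pointing the annotation files at the smaller copies means the data loader decodes far fewer pixels per sample. The cap should stay above the training resolution so the random crops still have detail to work with.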


shoutOutYangJie commented on May 16, 2024

How long does the pre-training stage take?


LiJunnan1992 commented on May 16, 2024

Using 8 A100 GPUs, it takes 2-3 days with 4M images, and around 7-8 days with 14M images. You could make training faster by reducing the image resolution to 224 and increasing the batch size; the performance would be roughly the same. You can also try other memory-reduction techniques, such as a zero-redundancy optimizer.
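
To see roughly why a lower resolution leaves room for a larger batch, here is a small sketch that counts the patch tokens a vision transformer produces per image. It assumes a ViT-style image encoder with 16x16 patches; the function name `num_patches` is hypothetical, and the counts only indicate the trend, not the actual memory use.

```python
def num_patches(image_res: int, patch_size: int = 16) -> int:
    """Number of non-overlapping square patches cut from a square image."""
    if image_res % patch_size != 0:
        raise ValueError("image_res must be a multiple of patch_size")
    side = image_res // patch_size
    return side * side


# Compare the token count per image at a few square resolutions.
for res in (384, 256, 224):
    print(res, num_patches(res))
```

Fewer tokens per image means smaller activations in every transformer layer, which is the headroom that can be spent on a larger batch size.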


shoutOutYangJie commented on May 16, 2024

Can I use 8 32GB V100 GPUs to reproduce your training results? By the way, will the data preprocessing code (which filters some pairs) be released?


LiJunnan1992 commented on May 16, 2024

Yes, I think you could.
The pre-training dataset annotations (image paths and text) have been released.


sunanhe commented on May 16, 2024

Hi, do you get similar performance using 8 V100 GPUs?


shoutOutYangJie commented on May 16, 2024

Yes, I got similar results.


yangbang18 commented on May 16, 2024

@shoutOutYangJie
Hi, it seems that you have reproduced the results with 8 V100 GPUs.

Did you use the same configuration as in Pretrain.yaml?
How many hours per epoch did training take?
Have you tried reducing the image resolution from 384 to 224?

Looking forward to your reply.


shoutOutYangJie commented on May 16, 2024

1. Yes.
2. About several hours.
3. No.

I can say the work is solid. Very good.

