About the running time and GPU memory usage (relationnetworks-clevr, closed issue)

mesnico commented on May 23, 2024

from relationnetworks-clevr.

Comments (6)

mesnico commented on May 23, 2024

Hi @LMdeLiangMi, I used 2 Tesla K40 GPUs, for a total of 48 GB of VRAM. Each epoch took about 30-40 minutes. I trained the from-pixels version for about 350 epochs before it converged well, so in the end training took about 10 days.
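For reference, the figures above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function name `training_days` is made up for illustration, and the inputs are the epoch count and per-epoch times quoted in the comment.

```python
def training_days(epochs: int, minutes_per_epoch: float) -> float:
    """Convert an epoch count and a per-epoch duration into days of wall-clock time."""
    minutes_per_day = 60 * 24
    return epochs * minutes_per_epoch / minutes_per_day


# Figures quoted in the comment: about 350 epochs at 30-40 minutes each.
low = training_days(350, 30)
high = training_days(350, 40)
print(f"between {low:.1f} and {high:.1f} days")
```

The upper end of that range is consistent with the "about 10 days" mentioned above.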

mesnico commented on May 23, 2024

@LMdeLiangMi I had never heard of DALI, it seems interesting! However, check your CPU utilization. If it is low, the disk may be the bottleneck: during my experiments I observed that it very often was. Consider moving the CLEVR dataset onto a solid-state drive if you haven't already. You should then see higher utilization of both CPUs and GPUs, together with an overall training speedup.
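One rough way to check whether the input pipeline (disk plus CPU preprocessing) is the bottleneck is to measure how much of each iteration is spent waiting for the next batch. The sketch below is not part of the repository: `data_wait_fraction`, `train_loader`, and `train_step` are hypothetical names, and it assumes the loader is any Python iterable of batches.

```python
import time
from typing import Callable, Iterable


def data_wait_fraction(batches: Iterable, step: Callable[[object], None]) -> float:
    """Return the fraction of loop time spent waiting for batches (0.0 to 1.0)."""
    wait = 0.0
    compute = 0.0
    iterator = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # time spent here is loading/preprocessing
        except StopIteration:
            break
        t1 = time.perf_counter()
        step(batch)  # time spent here is the training step
        t2 = time.perf_counter()
        wait += t1 - t0
        compute += t2 - t1
    total = wait + compute
    return wait / total if total > 0 else 0.0


# Hypothetical usage, assuming train_loader and train_step exist in your script:
# fraction = data_wait_fraction(train_loader, train_step)
# print(f"{fraction:.0%} of the time was spent waiting for data")
```

A fraction close to 1 suggests the loop is mostly waiting on data, which matches the disk-bound behaviour described above. Note that GPU work is usually launched asynchronously, so for a meaningful split the step would need to synchronize with the device before returning (for example with `torch.cuda.synchronize()` in PyTorch).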

LinkToPast1990 commented on May 23, 2024

I see. Thanks.

LinkToPast1990 commented on May 23, 2024

I found that the data loader seems slow because it does the image processing on the CPU. I am trying to write a new loader based on NVIDIA DALI.

LinkToPast1990 commented on May 23, 2024

@mesnico I put the dataset in memory and use DALI, so now it is okay. By the way, could you tell me why 1 is subtracted from the label in utils.py?
label = (label - 1).squeeze(1)

mesnico commented on May 23, 2024

@LMdeLiangMi I'm glad you solved the problem.

By the way, could you tell me why 1 is subtracted from the label in utils.py?

In the function build_dictionaries() I used one-based indexing while constructing the dictionaries, both for the questions and the answers. This is because index 0 is usually reserved for padding (padding is not necessary for the answers; I did so for consistency with the questions dictionary).
However, while preparing the data for the network, I need to shift all the answer indexes back by one, otherwise there would be a useless output neuron corresponding to the dummy index 0.
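To illustrate the index shift, here is a minimal sketch in plain Python. It is not the repository's build_dictionaries(): the helper `build_answer_dictionary` and the sample answers are made up, and plain lists stand in for tensors (so the `.squeeze(1)` from utils.py has no counterpart here).

```python
def build_answer_dictionary(answers):
    """Map each distinct answer to a one-based index, leaving 0 unused."""
    answ_to_ix = {}
    for answer in answers:
        if answer not in answ_to_ix:
            answ_to_ix[answer] = len(answ_to_ix) + 1  # first answer gets index 1
    return answ_to_ix


# Hypothetical answers, for illustration only.
answers = ["yes", "no", "cube", "yes"]
answ_to_ix = build_answer_dictionary(answers)

labels = [answ_to_ix[a] for a in answers]   # one-based indexes from the dictionary
targets = [label - 1 for label in labels]   # shifted back to zero-based class ids
num_classes = len(answ_to_ix)               # one output neuron per real answer

print(labels, targets, num_classes)
```

After the shift, the targets cover exactly the range 0 to num_classes - 1, so the output layer needs no extra neuron for the unused index 0.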
