@hdjsjyl @Goingqs Using batch size 32, I got the following error when running train.py. I'm using two 1080 Ti GPUs and 32 GB of RAM; can you help me out, please?
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=58 error=2 : out of memory
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:58
I understand that it is a GPU memory issue, but I use the same settings as yours, so why can't I train with batch size 32? I changed the batch size to 16, and it now gets through the first iteration but fails on the next one with the following error (a sketch of the workaround I'm trying is at the end of this post):
root@bfffba56f59a:/app# python3 train.py
Initializing weights...
Loading Dataset...
Training SSD on WiderFace
/usr/local/lib/python3.5/dist-packages/torch/autograd/_functions/tensor.py:447: UserWarning: mask is not broadcastable to self, but they have the same number of elements. Falling back to deprecated pointwise behavior.
return tensor.masked_fill_(mask, value)
front and back Timer: 6.264386892318726 sec.
iter 0 || Loss: 59.1287 ||
Loss conf: 30.951845169067383 Loss loc: 13.884753227233887
Loss head conf: 20.91655158996582 Loss head loc: 7.667636871337891
lr: 0.001
Saving state, iter: 0
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "train.py", line 240, in
train()
File "train.py", line 185, in train
out = net(images)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/parallel/data_parallel.py", line 73, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/parallel/data_parallel.py", line 83, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/usr/local/lib/python3.5/dist-packages/torch/nn/parallel/parallel_apply.py", line 67, in parallel_apply
raise output
File "/usr/local/lib/python3.5/dist-packages/torch/nn/parallel/parallel_apply.py", line 42, in _worker
output = module(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/app/pyramid.py", line 211, in forward
c3 = self.layer2(c2) #S8
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/container.py", line 67, in forward
input = module(input)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/app/pyramid.py", line 84, in forward
out = F.relu(self.bn2(self.conv2(out)),inplace=True)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 282, in forward
self.padding, self.dilation, self.groups)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py", line 90, in conv2d
return f(input, weight, bias)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:58
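
For reference, here is a minimal sketch of the change I'm testing while I wait for a reply. It's only my guess that the second-iteration OOM comes from the logging code holding on to the loss Variable (which keeps the whole autograd graph alive); `train_step` and its argument names below are hypothetical, not from this repo, and I'm using the PyTorch 0.3-era `loss.data[0]` idiom that matches the version in the traceback.

```python
# Hypothetical training step: net, criterion, and optimizer are the
# objects train.py already builds; only this wrapper is mine.
def train_step(net, criterion, optimizer, images, targets):
    optimizer.zero_grad()
    out = net(images)
    loss_l, loss_c = criterion(out, targets)  # localization + confidence
    loss = loss_l + loss_c
    loss.backward()
    optimizer.step()
    # Return a plain Python float for logging. Accumulating `loss`
    # itself (a Variable) across iterations retains every iteration's
    # graph and steadily grows GPU memory.
    return loss.data[0]
```

If memory usage still grows after this change, I'll try a smaller per-GPU batch size and report back.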