Comments (7)
Simply removing the two calls to .cuda in preprocess-images.py should work.
from pytorch-vqa.
That does not work; we already tried it.
This part of the code still expects CUDA:
torch.utils.data.DataLoader(
    dataset,
    batch_size=config.preprocess_batch_size,
    num_workers=config.data_workers,
    shuffle=False,
    pin_memory=True,
)
We get an error saying that no NVIDIA driver was found.
So we tried setting pin_memory=False. However, this did not work either:
out = net(imgs)
failed with a size mismatch.
We would like to replicate your results. Would it be possible for you to commit working versions of preprocess-images.py and train.py?
With pin_memory=True and after removing .cuda, this was the error log:
Traceback (most recent call last):
File "preprocess-images.py", line 73, in <module>
main()
File "preprocess-images.py", line 62, in main
for ids, imgs in loader:
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 201, in __next__
return self._process_next_batch(batch)
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 221, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
AssertionError: Traceback (most recent call last):
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 62, in _pin_memory_loop
batch = pin_memory_batch(batch)
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 123, in pin_memory_batch
return [pin_memory_batch(sample) for sample in batch]
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 123, in <listcomp>
return [pin_memory_batch(sample) for sample in batch]
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 117, in pin_memory_batch
return batch.pin_memory()
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 82, in pin_memory
return type(self)().set_(storage.pin_memory()).view_as(self)
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/storage.py", line 83, in pin_memory
allocator = torch.cuda._host_allocator()
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 220, in _host_allocator
_lazy_init()
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 84, in _lazy_init
_check_driver()
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 58, in _check_driver
http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
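The traceback above shows that pinning memory goes through torch.cuda, which is what raises the driver assertion on a machine without a GPU. A minimal sketch of one way to avoid this is to request pinned memory only when CUDA is actually available. This assumes a recent PyTorch release (torch.device and .to() are newer than the version in the log above), and the TensorDataset and tensor shapes are hypothetical stand-ins for the real image dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Decide once whether a usable CUDA device exists.
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Hypothetical stand-in dataset: 8 ids paired with 8 small random "images".
dataset = TensorDataset(torch.arange(8), torch.randn(8, 3, 4, 4))

loader = DataLoader(
    dataset,
    batch_size=4,
    num_workers=0,
    shuffle=False,
    pin_memory=use_cuda,  # pinned host memory only helps when copying to a GPU
)

seen = 0
for ids, imgs in loader:
    imgs = imgs.to(device)  # no-op on CPU, host-to-device copy on GPU
    seen += imgs.size(0)
```

With this pattern the same script runs on both CPU-only and GPU machines, although it does not address the separate size mismatch reported with pin_memory=False.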
With pin_memory=False, this was the error log:
Traceback (most recent call last):
File "preprocess-images.py", line 73, in <module>
main()
File "preprocess-images.py", line 64, in main
out = net(imgs)
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "preprocess-images.py", line 25, in forward
self.model(x)
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torchvision-0.1.9-py3.6.egg/torchvision/models/resnet.py", line 151, in forward
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 53, in forward
return F.linear(input, self.weight, self.bias)
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 553, in linear
return torch.addmm(bias, input, weight.t())
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 924, in addmm
return cls._blas(Addmm, args, False)
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 920, in _blas
return cls.apply(*(tensors + (alpha, beta, inplace)))
File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/blas.py", line 26, in forward
matrix1, matrix2, out=output)
RuntimeError: size mismatch, m1: [64 x 8192], m2: [2048 x 1000] at /opt/conda/conda-bld/pytorch_1503965122592/work/torch/lib/TH/generic/THTensorMath.c:1293
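One possible reading of the shapes in this error: m2 is the final fully connected layer, which expects 2048 features per image, while m1 carries 8192 = 2048 x 2 x 2 features for each of the 64 rows (presumably the batch). The sketch below reproduces that arithmetic under assumptions that are not stated in this thread: that the images are 448 x 448, that the ResNet downsamples by a total factor of 32, and that it ends in a fixed 7 x 7 average pool with stride 7. The function name and its defaults are hypothetical.

```python
def flattened_features(image_size, channels=2048, total_stride=32, pool=7):
    """Features per image reaching the final linear layer (assumed layout)."""
    # Spatial size of the last convolutional feature map.
    feature_map = image_size // total_stride
    # Fixed-size average pooling, no padding, stride equal to the kernel size.
    pooled = (feature_map - pool) // pool + 1
    return channels * pooled * pooled


print(flattened_features(224))  # 2048, matches the linear layer's input size
print(flattened_features(448))  # 8192, matches m1 in the error above
```

If these assumptions hold, the network's classification head only fits 224 x 224 inputs, which would explain why the torchvision model fails here.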
Please note that we imported the following model for ResNet, since the statement on line 12 did not work:
import torchvision.models.resnet as caffe_resnet
The torchvision net is not quite a drop-in replacement. Get the git submodule for the Caffe ResNet fixed, then try the pin_memory=False version. Either way, I don't recommend running this CPU-only: it will take ages to train.