run without CUDA (pytorch-vqa issue, closed)

cyanogenoid avatar cyanogenoid commented on June 26, 2024
run without CUDA

from pytorch-vqa.

Comments (7)

Cyanogenoid avatar Cyanogenoid commented on June 26, 2024

Simply removing the two calls to .cuda in preprocess-images.py should work.


varunnrao avatar varunnrao commented on June 26, 2024

That does not work. We did try that.
There is an issue with this part of the code which expects CUDA.

torch.utils.data.DataLoader(
    dataset,
    batch_size=config.preprocess_batch_size,
    num_workers=config.data_workers,
    shuffle=False,
    pin_memory=True,
)

We get an error saying that no NVIDIA driver was found.
So we tried setting pin_memory=False. However, this did not work either.

out = net(imgs) then failed because of a size mismatch.

We would like to replicate your results. Would it be possible for you to commit working versions of preprocess-images.py and train.py?


varunnrao avatar varunnrao commented on June 26, 2024

With pin_memory=True and the .cuda calls removed, this was the error log:


Traceback (most recent call last):
  File "preprocess-images.py", line 73, in <module>
    main()
  File "preprocess-images.py", line 62, in main
    for ids, imgs in loader:
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 201, in __next__
    return self._process_next_batch(batch)
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 221, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
AssertionError: Traceback (most recent call last):
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 62, in _pin_memory_loop
    batch = pin_memory_batch(batch)
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 123, in pin_memory_batch
    return [pin_memory_batch(sample) for sample in batch]
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 123, in <listcomp>
    return [pin_memory_batch(sample) for sample in batch]
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 117, in pin_memory_batch
    return batch.pin_memory()
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 82, in pin_memory
    return type(self)().set_(storage.pin_memory()).view_as(self)
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/storage.py", line 83, in pin_memory
    allocator = torch.cuda._host_allocator()
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 220, in _host_allocator
    _lazy_init()
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 84, in _lazy_init
    _check_driver()
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 58, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError: 
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
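
The traceback shows why this attempt fails even with the .cuda calls removed: with pin_memory=True the loader calls pin_memory() on each batch, which initializes CUDA and therefore needs an NVIDIA driver. One way around it is to enable pinning only when a GPU is present. A minimal sketch, assuming a hypothetical helper (loader_kwargs is not part of pytorch-vqa) whose first argument is the result of torch.cuda.is_available():

```python
def loader_kwargs(cuda_available, batch_size, num_workers):
    """Build DataLoader keyword arguments that are safe on CPU-only machines.

    Hypothetical helper, not part of pytorch-vqa. `cuda_available` is meant
    to be the result of torch.cuda.is_available().
    """
    return {
        "batch_size": batch_size,
        "num_workers": num_workers,
        "shuffle": False,
        # Pinned (page-locked) host memory only speeds up host-to-GPU copies,
        # and allocating it initializes CUDA, so enable it only with a GPU.
        "pin_memory": bool(cuda_available),
    }


# Intended use in preprocess-images.py (requires PyTorch, so shown as a comment):
#   loader = torch.utils.data.DataLoader(
#       dataset,
#       **loader_kwargs(torch.cuda.is_available(),
#                       config.preprocess_batch_size, config.data_workers))
cpu_kwargs = loader_kwargs(False, batch_size=64, num_workers=8)
print(cpu_kwargs["pin_memory"])
```

The batch size and worker count in the last lines are placeholder values for illustration, not the repository's settings.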


varunnrao avatar varunnrao commented on June 26, 2024

With pin_memory=False, this was the error log:

Traceback (most recent call last):
  File "preprocess-images.py", line 73, in <module>
    main()
  File "preprocess-images.py", line 64, in main
    out = net(imgs)
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "preprocess-images.py", line 25, in forward
    self.model(x)
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torchvision-0.1.9-py3.6.egg/torchvision/models/resnet.py", line 151, in forward
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 53, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 553, in linear
    return torch.addmm(bias, input, weight.t())
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 924, in addmm
    return cls._blas(Addmm, args, False)
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 920, in _blas
    return cls.apply(*(tensors + (alpha, beta, inplace)))
  File "/home/vqaproject2018/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/blas.py", line 26, in forward
    matrix1, matrix2, out=output)
RuntimeError: size mismatch, m1: [64 x 8192], m2: [2048 x 1000] at /opt/conda/conda-bld/pytorch_1503965122592/work/torch/lib/TH/generic/THTensorMath.c:1293
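
The shapes in this error can be checked by hand. m2 is the transposed weight of the torchvision ResNet's final fully connected layer, which expects 2048 features per image, while m1 carries 8192 = 2048 x 2 x 2 features for each of the 64 images in the batch, so the pooled feature map appears to be 2x2 rather than 1x1. A rough sketch of the arithmetic, assuming 448x448 inputs (not stated in this thread), the usual ResNet total stride of 32, and a fixed 7x7 average pool with stride 7 as in older torchvision releases:

```python
def pooled_features(image_size, channels=2048, total_stride=32, pool=7):
    """Number of features reaching the fc layer for a square input image.

    Assumes a ResNet-style backbone that downsamples by `total_stride`,
    followed by a fixed average pool whose stride equals its kernel size.
    """
    side = image_size // total_stride   # spatial size after the last conv stage
    pooled = (side - pool) // pool + 1  # output size of the unpadded average pool
    return channels * pooled * pooled


print(pooled_features(224))  # 2048, matches the 2048 inputs the fc layer expects
print(pooled_features(448))  # 8192, matches m1 in the error above
```

Under these assumptions the network only lines up with its own classifier at 224x224 input, which is consistent with the torchvision model not working as a drop-in replacement here.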


varunnrao avatar varunnrao commented on June 26, 2024

Please note that we imported the torchvision ResNet instead, as shown below, since the import on line 12 of your script did not work:
import torchvision.models.resnet as caffe_resnet


Cyanogenoid avatar Cyanogenoid commented on June 26, 2024

The torchvision net is not quite a drop-in replacement. Fix the git submodule for the caffe resnet, then try the pin_memory=False version. Either way, I don't recommend running this on a CPU-only machine: it will take ages to train.


varunnrao avatar varunnrao commented on June 26, 2024
