Hi, I have trained the RGB models for all 3 splits but I am facing s

Problems with the test_models.py about tsn-pytorch HOT 8 CLOSED

yjxiong commented on May 26, 2024

Problems with the test_models.py

from tsn-pytorch.

Comments (8)

yjxiong commented on May 26, 2024

The CUDA_VISIBLE_DEVICES=6 prefix might be the problem.

We do not use this variable to select the GPU. Instead, once you have updated to the latest version, you can issue the following command:

python test_models.py ucf101 RGB ../temporal-segment-networks/data/ucf101_rgb_val_split_1.txt models/ucf101_bninception_RGB_1_rgb_checkpoint.pth  --arch BNInception --save_scores tsn_pytorch_rgb_split_1 --gpu 6 -j 1
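To illustrate why mixing the two ways of selecting a GPU can go wrong, here is a minimal sketch of how CUDA_VISIBLE_DEVICES renumbers devices. It is a simplified model, not CUDA's actual implementation: the function name visible_devices and the 8-GPU machine are hypothetical, and only plain integer IDs are handled. With CUDA_VISIBLE_DEVICES=6 the process sees a single GPU at logical index 0, whereas without the variable an index of 6 refers to physical GPU 6.

```python
def visible_devices(env_value, physical_count):
    """Map logical CUDA index -> physical GPU ID for a CUDA_VISIBLE_DEVICES value.

    Simplified model (assumption): integer IDs only, and enumeration stops
    at the first entry that is not a valid physical ID.
    """
    if env_value is None:
        # Variable unset: every GPU is visible under its own index.
        return {i: i for i in range(physical_count)}
    mapping = {}
    for token in env_value.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= physical_count:
            break
        # Visible devices are renumbered from 0 in the order listed.
        mapping[len(mapping)] = int(token)
    return mapping


# Hypothetical machine with 8 GPUs.
print(visible_devices("6", 8))      # {0: 6}: physical GPU 6 becomes index 0
print(6 in visible_devices("6", 8)) # False: there is no logical device 6
print(visible_devices(None, 8)[6])  # 6: without the variable, index 6 is GPU 6
```

So inside a process started with CUDA_VISIBLE_DEVICES=6, any code that asks for device index 6 would not find it, while code that assumes index 0 would land on physical GPU 6.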

yjxiong commented on May 26, 2024

Please try again with the latest version. I have to say that the logic of torch.nn.DataParallel on non-zero GPUs is indeed a pain in the neck.

yjxiong commented on May 26, 2024

Hi @utsavgarg , thanks for filing the issue.

I have fixed the first problem in the latest commit.

For the second one, I cannot reproduce the error. Would you please post your testing command and environment settings?

utsavgarg commented on May 26, 2024

My PyTorch version is 0.2.0_1, and the testing command is:

CUDA_VISIBLE_DEVICES=6 python test_models.py ucf101 RGB ../temporal-segment-networks/data/ucf101_rgb_val_split_1.txt models/ucf101_bninception_RGB_1_rgb_checkpoint.pth  --arch BNInception --save_scores tsn_pytorch_rgb_split_1

You can download the checkpoint from https://www.dropbox.com/s/upa0nnrrmi4q36z/ucf101_bninception_RGB_1_rgb_checkpoint.pth?dl=0 to test it.

utsavgarg commented on May 26, 2024

@yjxiong thanks for the quick fix, but there still seems to be an issue:

Traceback (most recent call last):
  File "test_models.py", line 129, in <module>
    rst = eval_video((i, data, label))
  File "test_models.py", line 117, in eval_video
    rst = net(input_var).data.cpu().numpy().copy()
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 58, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/export/home/utsav/tsn/tsn-pytorch/models.py", line 197, in forward
    base_out = self.base_model(input.view((-1, sample_len) + input.size()[-2:]))
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/export/home/utsav/tsn/tsn-pytorch/tf_model_zoo/bninception/pytorch_load.py", line 48, in forward
    data_dict[op[2]] = getattr(self, op[0])(data_dict[op[-1]])
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 254, in forward
    self.padding, self.dilation, self.groups)
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/functional.py", line 52, in conv2d
    return f(input, weight, bias)
RuntimeError: tensors are on different GPUs

utsavgarg commented on May 26, 2024

@yjxiong one more thing: the Flow model takes much longer than the RGB model to complete one epoch.
The timings for one epoch are:

RGB - 93.2s
Flow - 793.2s

What do you think is the reason for such a large difference? Is there any way to speed it up?

yjxiong commented on May 26, 2024

The flow model reads many more images per video than the RGB model, which makes data feeding slower. I have added pin memory in the latest commit, which may help. Also, try increasing the -j parameter for the flow model so that more data is prefetched.
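As a rough back-of-the-envelope check of that explanation, the sketch below counts image files read per training sample. The numbers are assumptions, not taken from this thread: 3 segments per video, 1 image per RGB segment, and 5 stacked flow frames per segment with separate x and y files. The helper images_per_sample is hypothetical.

```python
def images_per_sample(num_segments, frames_per_segment, files_per_frame):
    """Number of image files read from disk for one video sample."""
    return num_segments * frames_per_segment * files_per_frame


NUM_SEGMENTS = 3  # assumption: segments sampled per video during training

# Assumption: RGB reads one image per segment.
rgb_images = images_per_sample(NUM_SEGMENTS, frames_per_segment=1, files_per_frame=1)
# Assumption: Flow stacks 5 frames per segment, each stored as x and y images.
flow_images = images_per_sample(NUM_SEGMENTS, frames_per_segment=5, files_per_frame=2)

io_ratio = flow_images / rgb_images
time_ratio = 793.2 / 93.2  # epoch timings reported above

print(rgb_images, flow_images, io_ratio)  # 3 30 10.0
print(round(time_ratio, 2))               # 8.51
```

Under these assumptions the flow model reads about 10 times as many files, which is in the same range as the roughly 8.5 times longer epoch, consistent with data loading being the bottleneck.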

nishanthrachakonda commented on May 26, 2024

https://www.dropbox.com/s/upa0nnrrmi4q36z/ucf101_bninception_RGB_1_rgb_checkpoint.pth?dl=0

It appears this checkpoint has been deleted. Could you provide it again?
