yjxiong / tsn-pytorch

Temporal Segment Networks (TSN) in PyTorch

License: BSD 2-Clause "Simplified" License

Topics: action-recognition, deep-learning, video-understanding, pytorch, temporal-segment-networks

tsn-pytorch's Introduction

TSN-Pytorch

We have released MMAction, a full-fledged action understanding toolbox based on PyTorch. It includes an implementation of TSN as well as other state-of-the-art frameworks for various tasks. The lessons we learned in this repo are incorporated into MMAction to make it better. We highly recommend you switch to it. This repo will remain here for historical reference.

Note: always use git clone --recursive https://github.com/yjxiong/tsn-pytorch to clone this project. Otherwise you will not be able to use the inception series CNN archs.

This is a reimplementation of temporal segment networks (TSN) in PyTorch. All settings are kept identical to the original Caffe implementation.

For optical flow extraction and video list generation, you still need to use the original TSN codebase.

Training

To train a new model, use the main.py script.

The command to reproduce the original TSN experiments of the RGB modality on UCF101 is:

python main.py ucf101 RGB <ucf101_rgb_train_list> <ucf101_rgb_val_list> \
   --arch BNInception --num_segments 3 \
   --gd 20 --lr 0.001 --lr_steps 30 60 --epochs 80 \
   -b 128 -j 8 --dropout 0.8 \
   --snapshot_pref ucf101_bninception_ 

For flow models:

python main.py ucf101 Flow <ucf101_flow_train_list> <ucf101_flow_val_list> \
   --arch BNInception --num_segments 3 \
   --gd 20 --lr 0.001 --lr_steps 190 300 --epochs 340 \
   -b 128 -j 8 --dropout 0.7 \
   --snapshot_pref ucf101_bninception_ --flow_pref flow_  

For RGB-diff models:

python main.py ucf101 RGBDiff <ucf101_rgb_train_list> <ucf101_rgb_val_list> \
   --arch BNInception --num_segments 7 \
   --gd 40 --lr 0.001 --lr_steps 80 160 --epochs 180 \
   -b 128 -j 8 --dropout 0.8 \
   --snapshot_pref ucf101_bninception_ 

Testing

After training, checkpoints will be saved by PyTorch, for example ucf101_bninception_rgb_checkpoint.pth.

Use the following command to test its performance under the standard TSN testing protocol:

python test_models.py ucf101 RGB <ucf101_rgb_val_list> ucf101_bninception_rgb_checkpoint.pth \
   --arch BNInception --save_scores <score_file_name>

Or for flow models:

python test_models.py ucf101 Flow <ucf101_rgb_val_list> ucf101_bninception_flow_checkpoint.pth \
   --arch BNInception --save_scores <score_file_name> --flow_pref flow_


tsn-pytorch's Issues

Running out of memory

I was trying to run training for UCF-101 RGB split 1, but the model seems to be running out of memory. I am using a GPU with 16 GB VRAM. What is the memory requirement?

Loss doesn't decrease when training optical flow model based on BNInception

Thanks for your great work! When I train the TSN flow model on my own dataset (about 25,000 training examples), the training and test losses stop decreasing once they reach about 1.8. They stabilize there even though I have tried lowering the learning rate and training for more epochs.

My training strategy is the same as what you describe in README.md.

python main.py ucf101 Flow <ucf101_flow_train_list> <ucf101_flow_val_list> \
   --arch BNInception --num_segments 3 \
   --gd 20 --lr 0.001 --lr_steps 190 300 --epochs 340 \
   -b 128 -j 8 --dropout 0.7 \
   --snapshot_pref ucf101_bninception_ --flow_pref flow_  

I don't know why the training loss gets stuck at 1.8; the top-1 accuracy on the training set is only about 60%.

Are there any other methods I can try to fix the problem? Would Adam be more efficient than SGD?

randint size

It seems you forgot to set the size of randint to num_segments on line 70 of dataset.py.
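For context, a minimal sketch of the segment-sampling pattern this report refers to (illustrative only; see dataset.py in the repo for the actual code). Without size=num_segments, numpy's randint returns a single scalar, so every segment would share the same random offset:

import numpy as np
from numpy.random import randint

def sample_indices(num_frames, num_segments, new_length=1):
    # Hypothetical sketch of TSN-style random segment sampling.
    average_duration = (num_frames - new_length + 1) // num_segments
    if average_duration > 0:
        # size=num_segments draws one independent offset per segment;
        # omitting it yields a scalar shared by every segment.
        offsets = np.arange(num_segments) * average_duration \
                  + randint(average_duration, size=num_segments)
    else:
        offsets = np.zeros(num_segments, dtype=int)
    return offsets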

TSN followed by RNN

Hi,
Do you plan to release an RNN version of TSN, like "Long-term Recurrent Convolutional Networks" by J. Donahue?
Moreover, why does torch.nn.DataParallel on non-zero GPUs always raise errors? Is there a smart way to fix this?

Thanks!

About generating RGB diff input images

@yjxiong hi,
For generating the RGB diff images as input to train the flow branch, you said we just directly subtract two consecutive frames. Does that mean, for example, if I want to generate all the input images (frames, optical flow, RGB diff) from the dense_flow code, I should add the following to dense_flow_gpu.cpp:

image_diff = capture_image - prev_image;
imencode(".jpg", image_diff, str_img);

Is this the correct way to generate the diff image? If yes, then image_diff can range from -255 to 255, and the visualized image is mostly black with some highlighted edge regions.

If not, do we need to set a bound (like the optical flow one) to normalize image_diff from (-255, 255) to (0, 255) in the dense_flow_gpu.cpp code? Like the following:

#define CAST(v, L, H) ((v) > (H) ? 255 : (v) < (L) ? 0 : cvRound(255*((v) - (L))/((H)-(L))))
for (int i = 0; i < image_diff.rows; ++i) {
    for (int j = 0; j < image_diff.cols; ++j) {
        for (int k = 0; k < 3; k++) {
            float t = image_diff.at<Vec3b>(i, j)[k];
            // bound = 255
            image_diff.at<Vec3b>(i, j)[k] = CAST(t, -bound, bound);
        }
    }
}
#undef CAST

In this case, after normalization, the visualized image will be mostly gray with highlighted edge regions, just like the optical flow one.

After storing all the diff images, we run the RGB diff script to train the RGB diff branch, right?

Many thanks.
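A numpy sketch of the bounded-difference normalization discussed above (illustrative; note that this PyTorch repo computes RGB differences on the fly from the stacked RGB frames in models.py, so pre-storing diff images is optional):

import numpy as np

def bounded_rgb_diff(frame, prev_frame, bound=255):
    # Map a raw difference in [-bound, bound] to a storable [0, 255] image,
    # mirroring the CAST macro above.
    diff = frame.astype(np.int16) - prev_frame.astype(np.int16)
    diff = np.clip(diff, -bound, bound)
    return np.round(255.0 * (diff + bound) / (2.0 * bound)).astype(np.uint8)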

Cannot test

Why can I not use the file ucf101_bninception__rgb_checkpoint.pth.tar?

Traceback (most recent call last):
  File "test_models.py", line 53, in <module>
    checkpoint = torch.load(args.weights)
  File "/home/xxx/anaconda3/envs/py35/lib/python3.5/site-packages/torch/serialization.py", line 265, in load
    f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'ucf101_bninception2_rgb_checkpoint.pth'

But I cannot decompress ucf101_bninception__rgb_checkpoint.pth.tar.

Network is unreachable

When I run the training script, I encounter the following error:

Downloading: "https://yjxiong.blob.core.windows.net/models/bn_inception-9f5701afb96c8044.pth" to /mnt/lustre/ganweihao/.torch/models/bn_inception-9f5701afb96c8044.pth

Initializing TSN with base model: BNInception.
TSN Configurations:
    input_modality:     RGB
    num_segments:       3
    new_length:         1
    consensus_module:   avg
    dropout_ratio:      0.8

Traceback (most recent call last):
  File "main.py", line 301, in <module>
    main()
  File "main.py", line 35, in main
    consensus_type=args.consensus_type, dropout=args.dropout, partial_bn=not args.no_partialbn)
  File "/mnt/lustre/ganweihao/codes/tsn-pytorch/models.py", line 39, in __init__
    self._prepare_base_model(base_model)
  File "/mnt/lustre/ganweihao/codes/tsn-pytorch/models.py", line 96, in _prepare_base_model
    self.base_model = getattr(tf_model_zoo, base_model)()
  File "/mnt/lustre/ganweihao/codes/tsn-pytorch/tf_model_zoo/bninception/pytorch_load.py", line 35, in __init__
    self.load_state_dict(torch.utils.model_zoo.load_url(weight_url))
  File "/mnt/lustre/ganweihao/anaconda3/envs/python27/lib/python2.7/site-packages/torch/utils/model_zoo.py", line 56, in load_url
    _download_url_to_file(url, cached_file, hash_prefix)
  File "/mnt/lustre/ganweihao/anaconda3/envs/python27/lib/python2.7/site-packages/torch/utils/model_zoo.py", line 61, in _download_url_to_file
    u = urlopen(url)
  File "/mnt/lustre/ganweihao/anaconda3/envs/python27/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/mnt/lustre/ganweihao/anaconda3/envs/python27/lib/python2.7/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/mnt/lustre/ganweihao/anaconda3/envs/python27/lib/python2.7/urllib2.py", line 447, in _open
    '_open', req)
  File "/mnt/lustre/ganweihao/anaconda3/envs/python27/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/mnt/lustre/ganweihao/anaconda3/envs/python27/lib/python2.7/urllib2.py", line 1241, in https_open
    context=self._context)
  File "/mnt/lustre/ganweihao/anaconda3/envs/python27/lib/python2.7/urllib2.py", line 1198, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 101] Network is unreachable>

Any idea how to solve this?
Many thanks.

resnet

Will the results improve if resnet101 is used as the pretrained model?

About the evaluation

Hi, @yjxiong
I found that you use 'accuracy' for evaluation during training but 'precision' during testing. Am I right? If so, could you tell me why?

Pretrained Models and Performance

Hi,
I am wondering if there are any pretrained PyTorch models that can be downloaded.
In addition, could you please let us know the performance of this tsn-pytorch on UCF101 / HMDB51 / Kinetics?
Thanks a lot!

issue about feature extraction

Hi, @yjxiong I want to use your model to extract video features. My questions are as follows:
1) We should use the output of the global average pooling layer as the feature. Suppose one video has 3 segments; then I get a 3x1024-dimensional feature, regardless of whether the modality is RGB or flow. Am I right?
2) Is there any official code to extract features?
3) How should I choose the number of segments when extracting features, 3 or 25?
Thank you in advance!
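A minimal sketch of pulling global-average-pool activations with a forward hook; the model below is a stand-in, not TSN itself (inspect net.base_model in this repo to find the real pooling module):

import torch
import torch.nn as nn

features = []

def save_features(module, inputs, output):
    features.append(output.flatten(1).detach().cpu())

# Stand-in for a TSN base model: conv trunk + global average pool + classifier.
net = nn.Sequential(
    nn.Conv2d(3, 1024, 3, padding=1),
    nn.AdaptiveAvgPool2d(1),  # the layer whose output we treat as the feature
    nn.Flatten(),
    nn.Linear(1024, 101),
)
handle = net[1].register_forward_hook(save_features)

# Feed num_segments frames as a batch: each row becomes one 1024-d feature.
with torch.no_grad():
    net(torch.randn(3, 3, 32, 32))  # 3 segments of (toy-sized) frames

handle.remove()
print(features[0].shape)  # torch.Size([3, 1024]) -> a 3 x 1024 video feature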

confused about fusion of two streams

Hi, @yjxiong. Happy New Year!
I have some questions. Suppose I have two models, A and B. The test accuracies of both of A's streams are better than B's, but A's fused accuracy is worse than B's. Is this possible?
If so, how should I choose the stream models? I don't know when to stop training, because lower RGB and flow accuracies may yield a higher fused accuracy.

Optical flow training tricks

Hi, Xiong:

Sorry to impose again. My optical flow experiment only reaches an average accuracy of 84.4% on split 1 (result given by the test script). My training setting is (batch_size=128, init_lr=0.001, lr_steps=(190, 300 epochs)), which is different from the Caffe implementation of TSN (batch_size=24, init_lr=0.005, lr_steps=(10000, 16000 iterations)).

Could you please help figure out which setting reproduces the result in your paper? If neither does, what is the best setting?

How to reproduce the experiment that combines RGB and optical flow?

Hello, thanks for your code sharing - a great work!

Still, I have a question: we can train models of the RGB/RGB-diff/flow modalities separately as you describe in the README.md, but if we want to reproduce the experiment that combines RGB and optical flow, how can we achieve this? Should I write my own code to jointly infer from the two models of the RGB and optical flow modalities?

Thank you in advance!
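A minimal late-fusion sketch, assuming the files written by --save_scores are .npz archives containing 'scores' and 'labels' arrays (check test_models.py for the exact keys and shapes it saves). The TSN paper weights flow higher than RGB (1 : 1.5):

import numpy as np

# Hypothetical file names, produced by test_models.py --save_scores.
rgb = np.load('ucf101_rgb_scores.npz')
flow = np.load('ucf101_flow_scores.npz')

rgb_scores = rgb['scores'].squeeze()   # squeeze in case of an (N, 1, C) shape
flow_scores = flow['scores'].squeeze()

# Weighted average of per-class scores across the two streams.
fused = 1.0 * rgb_scores + 1.5 * flow_scores

pred = fused.argmax(axis=1)
acc = (pred == rgb['labels'].squeeze()).mean()
print('fused accuracy: {:.2%}'.format(acc))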

testing process met error.

I run with the command:
CUDA_VISIBLE_DEVICES=2,3 python test_models.py ucf101 Flow data/ucf101_flow_val_split_1.txt ucf101_flow_bninception_flow_checkpoint.pth.tar --arch BNInception --save_score=flow_bninception --flow_pref flow_ --workers=1

The output content is as follows:

video 160 done, total 161/3783, average 3.05683402393 sec/video
video 161 done, total 162/3783, average 3.04620183986 sec/video
video 162 done, total 163/3783, average 3.05641095039 sec/video
video 163 done, total 164/3783, average 3.04159315766 sec/video
video 164 done, total 165/3783, average 3.06657971469 sec/video
video 165 done, total 166/3783, average 3.05109782535 sec/video
video 166 done, total 167/3783, average 3.06426371786 sec/video
video 167 done, total 168/3783, average 3.04766791633 sec/video
video 168 done, total 169/3783, average 3.06247991077 sec/video
video 169 done, total 170/3783, average 3.04995046503 sec/video
Traceback (most recent call last):
  File "test_models.py", line 125, in <module>
    for i, (data, label) in data_gen:
  File "/opt/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 187, in __next__
    return self._process_next_batch(batch)
  File "/opt/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 221, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
IOError: Traceback (most recent call last):
  File "/opt/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 40, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/S2/MI/zqj/video_classification/tsn-pytorch/dataset.py", line 99, in __getitem__
    return self.get(record, segment_indices)
  File "/S2/MI/zqj/video_classification/tsn-pytorch/dataset.py", line 107, in get
    seg_imgs = self._load_image(record.path, p)
  File "/S2/MI/zqj/video_classification/tsn-pytorch/dataset.py", line 51, in _load_image
    x_img = Image.open(os.path.join(directory, self.image_tmpl.format('x', idx))).convert('L')
  File "/opt/anaconda2/lib/python2.7/site-packages/PIL/Image.py", line 2477, in open
    fp = builtins.open(filename, "rb")
IOError: [Errno 2] No such file or directory: '../data/ucf101/flows_tvl1/v_Skijet_g04_c03/flow_x_00000.jpg'

THCudaCheckWarn FAIL file=/pytorch/torch/lib/THC/THCStream.cpp line=50 error=29 : driver shutting down

When training RGB-diff models, there are some errors.

Freezing BatchNorm2D except the first one.
Test: [0/60] Time 97.470 (97.470) Loss 0.1188 (0.1188) Prec@1 95.312 (95.312) Prec@5 100.000 (100.000)
Test: [20/60] Time 6.395 (6.560) Loss 3.9672 (0.9311) Prec@1 43.750 (80.208) Prec@5 70.312 (95.461)
Test: [40/60] Time 15.587 (5.824) Loss 0.8059 (0.9010) Prec@1 84.375 (81.174) Prec@5 92.188 (95.351)
Testing Results: Prec@1 82.236 Prec@5 95.691 Loss 0.83703
Freezing BatchNorm2D except the first one.
Epoch: [55][0/150], lr: 0.00100 Time 52.754 (52.754) Data 52.346 (52.346) Loss 0.1943 (0.1943) Prec@1 93.750 (93.750) Prec@5 100.000 (100.000)
Traceback (most recent call last):
  File "main.py", line 301, in <module>
    main()
  File "main.py", line 124, in main
    train(train_loader, model, criterion, optimizer, epoch)
  File "main.py", line 157, in train
    for i, (input, target) in enumerate(train_loader):
  File "/home/kong.ye/.local/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 275, in __next__
    idx, batch = self._get_batch()
  File "/home/kong.ye/.local/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 254, in _get_batch
    return self.data_queue.get()
  File "/usr/lib/python2.7/Queue.py", line 168, in get
    self.not_empty.wait()
  File "/usr/lib/python2.7/threading.py", line 340, in wait
    waiter.acquire()
  File "/home/kong.ye/.local/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 175, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 32569) is killed by signal: Killed.

What should I do? Thanks.

typeerror: mean received an invalid combination of arguments - got ()

Hi, I ran into trouble when running the training script:
python main.py ucf101 RGB ../TSN/data/ucf101_rgb_train_split_1.txt ../TSN/data/ucf101_rgb_val_split_1.txt --arch BNInception --num_segments 3 --gd 20 --lr 0.001 --lr_steps 30 60 --epochs 80 -b 128 -j 8 --snapshot_pref ucf101_bninception --b 128

However, an error appears as follows (including the script's output):

Initializing TSN with base model: BNInception.
TSN Configurations:
    input_modality:     RGB
    num_segments:       3
    new_length:         1
    consensus_module:   avg
    dropout_ratio:      0.5

group: first_conv_weight has 1 params, lr_mult: 1, decay_mult: 1
group: first_conv_bias has 1 params, lr_mult: 2, decay_mult: 0
group: normal_weight has 69 params, lr_mult: 1, decay_mult: 1
group: normal_bias has 69 params, lr_mult: 2, decay_mult: 0
group: BN scale/shift has 2 params, lr_mult: 1, decay_mult: 0
Freezing BatchNorm2D except the first one.
Traceback (most recent call last):
  File "main.py", line 301, in <module>
    main()
  File "main.py", line 124, in main
    train(train_loader, model, criterion, optimizer, epoch)
  File "main.py", line 166, in train
    output = model(input_var)
  File "/opt/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 61, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/opt/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 71, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs)
  File "/opt/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/parallel_apply.py", line 46, in parallel_apply
    raise output
TypeError: mean received an invalid combination of arguments - got (dim=int, keepdim=bool, ), but expected one of:

  • no arguments
  • (int dim)

Where is the bug?

Runtime Error in RGBDiff Experiment

Hello,

I am running the RGBDiff experiment on my own dataset. I am sure the file lists are made properly, since I have tested them in the original TSN repo (Caffe implementation).
I am using the following command:
python main.py sdha-actionness RGBDiff dataset/sdha/train_actionness_rgb.txt dataset/sdha/val_actionness_rgb.txt --arch BNInception --num_segments 7 --gd 40 --lr 0.001 --lr_steps 80 160 --epochs 180 -b 32 -j 8 --dropout 0.8 --snapshot_pref sdha_actionness_bninception_

I am getting the following error:

Initializing TSN with base model: BNInception.
TSN Configurations:
    input_modality:     RGBDiff
    num_segments:       7
    new_length:         5
    consensus_module:   avg
    dropout_ratio:      0.8

/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py:360: UserWarning: src is not broadcastable to dst, but they have the same number of elements.  Falling back to deprecated pointwise behavior.
  own_state[name].copy_(param)
Converting the ImageNet model to RGB+Diff init model
Traceback (most recent call last):
  File "main.py", line 316, in <module>
    main()
  File "main.py", line 37, in main
    consensus_type=args.consensus_type, dropout=args.dropout, partial_bn=not args.no_partialbn)
  File "/var/www/tsn-pytorch/models.py", line 49, in __init__
    self.base_model = self._construct_diff_model(self.base_model)
  File "/var/www/tsn-pytorch/models.py", line 268, in _construct_diff_model
    new_kernels = params[0].data.mean(dim=1).expand(new_kernel_size).contiguous()
RuntimeError: The expanded size of the tensor (15) must match the existing size (64) at non-singleton dimension 1. at /pytorch/torch/lib/TH/generic/THTensor.c:308

Kindly help.

different segment numbers for validation and testing?

In the main code, the segment number is the same in both training and validation (default 3), but in test_models.py the segment number is 25. Why is that?

I know the reason for 25 segments, but I don't know why we can't also use 25 segments for validation during training.

Thanks.

Problems with the test_models.py

Hi,

I have trained the RGB models for all 3 splits but I am facing some issues with the test_models.py program.

  • On line 48, while calling the model, two arguments are passed (rnn=args.rnn, rnn_mem_size=args.rnn_mem_size) which are not valid.
  • If I remove these arguments and run, I get a list index out of range error on line 123.

Here is the error stack trace

model epoch 80 best prec@1: 83.4522855911
Traceback (most recent call last):
  File "test_models.py", line 123, in <module>
    rst = eval_video((i, data, label))
  File "test_models.py", line 111, in eval_video
    rst = net(input_var).data.cpu().numpy().copy()
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 56, in forward
    inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids)
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 67, in scatter
    return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim)
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/parallel/scatter_gather.py", line 30, in scatter_kwargs
    inputs = scatter(inputs, target_gpus, dim)
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/parallel/scatter_gather.py", line 25, in scatter
    return scatter_map(inputs)
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/parallel/scatter_gather.py", line 18, in scatter_map
    return tuple(zip(*map(scatter_map, obj)))
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/parallel/scatter_gather.py", line 15, in scatter_map
    return Scatter(target_gpus, dim=dim)(obj)
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/parallel/_functions.py", line 59, in forward
    streams = [_get_stream(device) for device in self.target_gpus]
  File "/export/home/utsav/.local/lib/python2.7/site-packages/torch/nn/parallel/_functions.py", line 85, in _get_stream
    if _streams[device] is None:
IndexError: list index out of range

Parameter_load

If I don't want to load the BN-Inception pretrained weights and instead want to initialize BN-Inception with random weights, what should I do? Just comment out the load_state_dict call? Thanks!

Runtime Error 59

I am trying to run the TSN training with the same specifications as the RGB script in the README.md file, and I end up with this error:

Epoch: [0][0/75], lr: 0.00100 Time 39.106 (39.106) Data 2.112 (2.112) Loss 4.6157 (4.6157) Prec@1 0.781 (0.781) Prec@5 3.125 (3.125)
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1503965122592/work/torch/lib/THC/generated/../generic/THCTensorSort.cu line=153 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "main.py", line 301, in <module>
    main()
  File "main.py", line 124, in main
    train(train_loader, model, criterion, optimizer, epoch)
  File "main.py", line 170, in train
    prec1, prec5 = accuracy(output.data, target, topk=(1,5))
  File "main.py", line 289, in accuracy
    _, pred = output.topk(maxk, 1, True, True)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1503965122592/work/torch/lib/THC/generated/../generic/THCTensorSort.cu:153
terminate called without an active exception
Aborted (core dumped)

I checked the PyTorch guide, but there is nothing there about it. I am using PyTorch 0.2; the code doesn't specify which version to use.

About the train and test accuracy

Hi yjxiong:
I trained and tested your TSN model with the parameters in README.md.
I got the following results:
Model | split1 | split2 | split3
RGB | 84.75% | 84.13% | 85.24%
Flow | 79.31% | 81.00% | 81.09%
The accuracy of the RGB model is similar to your project website http://yjxiong.me/others/tsn/.
But the accuracy of the flow model is very different from yours.
So my questions are as follows:

  1. Can the PyTorch TSN code achieve the official accuracy reported at http://yjxiong.me/others/tsn/?
  2. Is there anything else to pay attention to when I train the flow model?

Thank you in advance!

Dropout wasn't working in the code?

Hi, I noticed that the author defines a dropout layer here, but it seems this layer isn't used during the network's forward pass. Could you please explain the dropout implementation in more detail?

num_segments when testing

Hi, @yjxiong I saw the following code in test_models.py:

net = TSN(num_class, 1, args.modality,
          base_model=args.arch,
          consensus_type=args.crop_fusion_type,
          dropout=args.dropout)

Does it mean we set num_segments=1 when testing, and why?
please help me! Thank you!

RuntimeError: cuda runtime error (2) : out of memory

While testing the RGBDiff model using the command
python test_models.py ucf101 RGBDiff /media/sda/nandan/data/ucf101_rgb_val_split_1.txt ucf101_bninception__rgbdiff_checkpoint.pth.tar --arch BNInception --save_scores SCORE_UCF101_1_RGBDIFF --workers=2
I'm getting this error
Traceback (most recent call last):
  File "test_models.py", line 130, in <module>
    rst = eval_video((i, data, label))
  File "test_models.py", line 117, in eval_video
    rst = net(input_var).data.cpu().numpy().copy()
  File "/home/nandan/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nandan/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 73, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/nandan/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 83, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/nandan/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/parallel_apply.py", line 67, in parallel_apply
    raise output
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518238409320/work/torch/lib/THC/generic/THCStorage.cu:58

I'm using two K40 GPUs, each with 4742 MiB of global memory.

Meet troubles when using multi-GPUs

Thanks for your nice sharing! @yjxiong

According to your README.md, I use "--gpus 4 5 6 7" to train TSN on GPUs 4, 5, 6, and 7, but the log shows "RuntimeError: all tensors must be on devices[0]".

I also tried CUDA_VISIBLE_DEVICES, but the log shows "TypeError: mean received an invalid combination of arguments - got (dim=int, keepdim=bool, ), but expected one of: * no arguments, * (int dim)".

Could you please show how to train TSN with multiple GPUs? Thank you very much!

worse performance in pytorch

Hi, thanks for sharing this nice implementation. I notice that the performance of the PyTorch version is somewhat worse than the original Caffe implementation used in the ECCV paper. Does anyone know the reason?

GroupRandomHorizontalFlip return img_group or ret?

Hi, I find the following in transforms.py:

def __call__(self, img_group, is_flow=False):
    v = random.random()
    print(v)
    if v < 0.5:
        ret = [img.transpose(Image.FLIP_LEFT_RIGHT) for img in img_group]
        if self.is_flow:
            for i in range(0, len(ret), 2):
                ret[i] = ImageOps.invert(ret[i])  # invert flow pixel values when flipping
        return img_group

The function returns img_group; should it return ret?
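Presumably the intended behavior is to return the flipped list when the coin flip succeeds; a sketch of the fix (not the repo's code):

import random
from PIL import Image, ImageOps

class GroupRandomHorizontalFlip(object):
    # Flip a whole group of frames together with probability 0.5.

    def __init__(self, is_flow=False):
        self.is_flow = is_flow

    def __call__(self, img_group):
        if random.random() < 0.5:
            ret = [img.transpose(Image.FLIP_LEFT_RIGHT) for img in img_group]
            if self.is_flow:
                # x-flow frames (even indices) change sign when mirrored.
                for i in range(0, len(ret), 2):
                    ret[i] = ImageOps.invert(ret[i])
            return ret  # return the flipped frames, not the originals
        return img_group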

training the RGBDiff model

I assume we can use the same inputs to train an RGBDiff model. Am I correct?
However, I got this error message:

Initializing TSN with base model: resnet101.
TSN Configurations:
    input_modality:     RGBDiff
    num_segments:       5
    new_length:         5
    consensus_module:   avg
    dropout_ratio:      0.8

Converting the ImageNet model to RGB+Diff init model
Traceback (most recent call last):
  File "main.py", line 354, in <module>
    main()
  File "main.py", line 48, in main
    consensus_type=args.consensus_type, dropout=args.dropout, partial_bn=not args.no_partialbn)
  File "/home/tsn-pytorch/models.py", line 49, in __init__
    self.base_model = self._construct_diff_model(self.base_model)
  File "/home/tsn-pytorch/models.py", line 264, in _construct_diff_model
    first_conv_idx = filter(lambda x: isinstance(modules[x], nn.Conv2d), list(range(len(modules))))[0]
TypeError: 'filter' object is not subscriptable

Could you help me figure out what it is?
Thank you so much
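(For reference, this looks like a Python 2 vs. 3 difference: in Python 3, filter() returns an iterator, which cannot be indexed. A minimal sketch of the fix, wrapping the call in list():)

import torch.nn as nn

modules = [nn.BatchNorm2d(3), nn.Conv2d(3, 64, kernel_size=7)]

# Python 2: filter() returns a list, so [0] works.
# Python 3: filter() returns an iterator -> "'filter' object is not subscriptable".
# Wrapping it in list() (or using next()) restores the old behavior:
first_conv_idx = list(filter(lambda x: isinstance(modules[x], nn.Conv2d),
                             range(len(modules))))[0]
print(first_conv_idx)  # -> 1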

About the consensus_type

Hi, @yjxiong

You used the consensus_type of avg in the paper, and I also want to know the effect of the topk type; however, I have not seen code for it in the program. Could you provide the script for topk in the forward and backward parts? Or do you have any ideas about it?

Thanks for your kind help!

class SegmentConsensus(torch.autograd.Function):

    def __init__(self, consensus_type, dim=1):
        self.consensus_type = consensus_type
        self.dim = dim
        self.shape = None

    def forward(self, input_tensor):
        self.shape = input_tensor.size()
        if self.consensus_type == 'avg':
            output = input_tensor.mean(dim=self.dim, keepdim=True)
        elif self.consensus_type == 'identity':
            output = input_tensor
        else:
            output = None

        return output

    def backward(self, grad_output):
        if self.consensus_type == 'avg':
            grad_in = grad_output.expand(self.shape) / float(self.shape[self.dim])
        elif self.consensus_type == 'identity':
            grad_in = grad_output
        else:
            grad_in = None

        return grad_in
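One plausible top-k consensus (a sketch, not code from this repo): with modern PyTorch, topk() and mean() are both differentiable, so no hand-written backward is needed and gradients flow only to the selected entries.

import torch

def topk_consensus(segment_scores, k=3, dim=1):
    # Average the k highest scores along the segment dimension (per class).
    vals, _ = segment_scores.topk(min(k, segment_scores.size(dim)), dim=dim)
    return vals.mean(dim=dim, keepdim=True)

# Usage: scores of shape (batch, num_segments, num_classes).
scores = torch.randn(4, 7, 101, requires_grad=True)
out = topk_consensus(scores, k=3)  # -> (4, 1, 101)
out.sum().backward()               # gradients reach only the selected scores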

optical flow model using inceptionresnetv2 to train and experience overfit

Hi, have you tried training TSN based on the inceptionresnetv2 model?
I set batch size 8, lr 0.001, 3 segments, and dropout 0.8, trained the RGB model, and achieved 86.32% on the test set.

However, when I use the same setting with 0.7 dropout to train the optical flow model, I get about 99% training accuracy but at most 75% on validation, so I think overfitting occurred. Do you have any idea about the performance difference between the RGB and optical flow models using inceptionresnetv2?

Normalization in RGBDiff model

in main.py, it doesn't do normalization for RGBDiff:

if args.modality != 'RGBDiff':
    normalize = GroupNormalize(input_mean, input_std)
else:
    normalize = IdentityTransform()

Is there any reason to do that?
And I found that in test_models.py you still have normalization for RGBDiff, so I got incorrect testing results when I first tried it. That problem was solved by changing to IdentityTransform. I wonder which one you used for your final results: IdentityTransform or GroupNormalize?

Thank you.

fusion

Are the spatial and temporal networks trained separately? Is the fusion of the networks implemented only in the testing stage?

dataloader runtime errors

python main.py ucf101 RGB ucf101_trainlist01new.txt ucf101_testlist01new.txt --gpus 1 --arch BNInception --num_segments 3 --gd 20 --lr 0.001 --lr_steps 30 60 --epochs 80 -b 128 -j 8 --dropout 0.8

Initializing TSN with base model: BNInception.
TSN Configurations:
    input_modality:     RGB
    num_segments:       3
    new_length:         1
    consensus_module:   avg
    dropout_ratio:      0.8

/home/ytan/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py:360: UserWarning: src is not broadcastable to dst, but they have the same number of elements. Falling back to deprecated pointwise behavior.
  own_state[name].copy_(param)
group: first_conv_weight has 1 params, lr_mult: 1, decay_mult: 1
group: first_conv_bias has 1 params, lr_mult: 2, decay_mult: 0
group: normal_weight has 69 params, lr_mult: 1, decay_mult: 1
group: normal_bias has 69 params, lr_mult: 2, decay_mult: 0
group: BN scale/shift has 2 params, lr_mult: 1, decay_mult: 0
Freezing BatchNorm2D except the first one.
Traceback (most recent call last):
  File "main.py", line 301, in <module>
    main()
  File "main.py", line 124, in main
    train(train_loader, model, criterion, optimizer, epoch)
  File "main.py", line 157, in train
    for i, (input, target) in enumerate(train_loader):
  File "/home/ytan/miniconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 201, in __next__
    return self._process_next_batch(batch)
  File "/home/ytan/miniconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 221, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
  File "/home/ytan/miniconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 40, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/ytan/miniconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 109, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/ytan/miniconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 109, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/ytan/miniconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 89, in default_collate
    storage = batch[0].storage()._new_shared(numel)
  File "/home/ytan/miniconda3/lib/python3.6/site-packages/torch/storage.py", line 113, in _new_shared
    return cls._new_using_fd(size)
RuntimeError: unable to write to file </torch_476_615100490> at /opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/TH/THAllocator.c:271

Any suggestions?
