kenshohara / 3D-ResNets-PyTorch
3D ResNets for Action Recognition (CVPR 2018)
License: MIT License
Dear @kenshohara,
Thank you very much for your fantastic repository. Do you have any pretrained model on Sports1M?
Thank you for your wonderful work.
I just read the paper, which notes that each clip contains 16 frames. Two other papers I read claim that 32-frame input works better. Have you tried 32-frame input? If you have trained such models, could you please release the pretrained models?
No downsample branch parameters in the above keys.
Hi,
When I try to run the command:
python main.py --root_path ~/data --video_path ucf101_videos/jpg --annotation_path ucf101_01.json
--result_path results --dataset ucf101 --n_classes 400 --n_finetune_classes 101
--pretrain_path models/resnet-34-kinetics-cpu.pth --ft_begin_index 4
--model resnet --model_depth 34 --resnet_shortcut A --batch_size 128 --n_threads 4 --checkpoint 5
I have also set no_cuda=True in opts, as I am running it on CPU only.
I am getting this error:
Traceback (most recent call last):
  File "HAR_3D/main.py", line 47, in <module>
    model, parameters = generate_model(opt)
  File "/home/chandni/HAR_3D/model.py", line 178, in generate_model
    model.load_state_dict(pretrain['state_dict'])
  File "/home/chandni/anaconda3/envs/har/lib/python3.6/site-packages/torch/nn/modules/module.py", line 522, in load_state_dict
    .format(name))
KeyError: 'unexpected key "module.conv1.weight" in state_dict'
Any idea how to resolve this issue?
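One common cause of this error, offered as an assumption rather than a confirmed diagnosis: the checkpoint was saved from a model wrapped in nn.DataParallel, so every key carries a `module.` prefix, while a model built for CPU is not wrapped. A minimal sketch that renames the keys before loading; `strip_module_prefix` is a hypothetical helper, not part of the repository.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading `prefix` removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Only strip the prefix when it sits at the start of the key.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Stand-in for pretrain['state_dict']; real values would be tensors.
example = OrderedDict([("module.conv1.weight", 0), ("module.fc.bias", 1)])
cleaned_keys = list(strip_module_prefix(example))
print(cleaned_keys)
```

With a real checkpoint the call would look like `model.load_state_dict(strip_module_prefix(pretrain['state_dict']))`, assuming the model itself is not wrapped in DataParallel.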
Hi kenshohara !
I want to evaluate this model on my own dataset without modifying the code too much. I have skimmed through the code, and I think I should modify the file dataset.py. However, I don't know where to begin. Could you please give me some suggestions? Thanks.
Thanks for sharing, @kenshohara!
When I use --resume, I get an error like this:
KeyError: 'missing keys in state_dict: "set(['module.layer2.0.downsample.1.running_var', 'module.layer3.0.downsample.1.running_var', 'module.layer2.0.downsample.1.running_mean', 'module.layer4.0.downsample.1.running_mean', 'module.layer4.0.downsample.1.running_var', 'module.layer3.0.downsample.1.weight', 'module.layer2.0.downsample.0.weight', 'module.layer3.0.downsample.0.weight', 'module.layer4.0.downsample.0.weight', 'module.layer4.0.downsample.1.bias', 'module.layer4.0.downsample.1.weight', 'module.layer3.0.downsample.1.bias', 'module.layer2.0.downsample.1.weight', 'module.layer2.0.downsample.1.bias', 'module.layer3.0.downsample.1.running_mean'])"'
Could you please tell me how to debug it?
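The missing keys listed above all belong to `downsample` branches, which suggests (as an assumption) that the model being built uses a different shortcut type than the one the checkpoint was saved with. A small sketch for comparing the two key sets before calling load_state_dict; `diff_keys` is a hypothetical helper and the key lists are illustrative.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, sorted for stable output."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    # Keys the model expects but the file does not contain.
    missing = sorted(model_keys - checkpoint_keys)
    # Keys the file contains but the model does not know.
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


model_keys = ["module.conv1.weight", "module.layer2.0.downsample.0.weight"]
checkpoint_keys = ["module.conv1.weight"]
missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

In practice the inputs would be `model.state_dict().keys()` and `checkpoint['state_dict'].keys()`.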
For me, it is not quite clear how to structure the video directory for Kinetics. Should it be video_directory/{train,val,test}/jpg, so that it can train and validate at the same time? Or is there another folder structure I should adhere to?
I have a small two-class dataset with 1000 videos per class. I want to fine-tune your pretrained models on it, but they seem to overfit my dataset. How can I address this? Should I enlarge my dataset?
Thank you for providing nice code!
I tested the pretrained model "resnext-101-64f-kinetics-ucf101_split1.pth" on UCF101.
I got 93.99% video-level accuracy.
However, the run is really slow (roughly a few hours) because of data loading.
Although I'm using an HDD rather than an SSD, data loading is much slower than I expected.
In particular, every (4n+1)-th iteration is significantly slow (4 is the number of threads).
My log is as follows:
[1/197] Time 335.936 (335.936) Data 333.094 (333.094)
[2/197] Time 1.754 (168.845) Data 0.000 (166.547)
[3/197] Time 1.758 (113.149) Data 0.000 (111.032)
[4/197] Time 1.769 (85.304) Data 0.000 (83.274)
[5/197] Time 298.968 (128.037) Data 297.199 (126.059)
[6/197] Time 1.750 (106.989) Data 0.000 (105.049)
[7/197] Time 1.760 (91.956) Data 0.000 (90.042)
[8/197] Time 1.757 (80.681) Data 0.000 (78.787)
[9/197] Time 280.848 (102.922) Data 279.067 (101.040)
[10/197] Time 1.766 (92.807) Data 0.000 (90.936)
[11/197] Time 1.754 (84.529) Data 0.000 (82.669)
[12/197] Time 1.760 (77.632) Data 0.000 (75.780)
[13/197] Time 290.565 (94.011) Data 288.792 (92.166)
[14/197] Time 1.750 (87.421) Data 0.000 (85.582)
[15/197] Time 1.763 (81.711) Data 0.000 (79.877)
[16/197] Time 1.756 (76.713) Data 0.000 (74.885)
[17/197] Time 303.138 (90.032) Data 301.384 (88.208)
[18/197] Time 4.009 (85.253) Data 2.253 (83.433)
[19/197] Time 3.780 (80.965) Data 2.027 (79.148)
[20/197] Time 1.759 (77.005) Data 0.000 (75.191) ...
Is this speed normal, or is something wrong?
If something is wrong, could you kindly let me know how to fix it?
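The (4n+1) pattern is consistent with four loader workers that each prepare one batch in parallel and finish at about the same time, so the training loop waits once and then consumes the next three batches immediately. That reading is an assumption, not something the log proves. A minimal sketch that parses the first eight log lines above and measures how much of the time is spent waiting for data; the 10-second stall threshold is arbitrary.

```python
import re

LOG = """\
[1/197] Time 335.936 (335.936) Data 333.094 (333.094)
[2/197] Time 1.754 (168.845) Data 0.000 (166.547)
[3/197] Time 1.758 (113.149) Data 0.000 (111.032)
[4/197] Time 1.769 (85.304) Data 0.000 (83.274)
[5/197] Time 298.968 (128.037) Data 297.199 (126.059)
[6/197] Time 1.750 (106.989) Data 0.000 (105.049)
[7/197] Time 1.760 (91.956) Data 0.000 (90.042)
[8/197] Time 1.757 (80.681) Data 0.000 (78.787)
"""

# Captures the iteration index, the per-batch time and the per-batch data time.
PATTERN = re.compile(r"\[(\d+)/\d+\] Time ([\d.]+) \([\d.]+\) Data ([\d.]+)")


def parse(log):
    """Yield (iteration, batch_time, data_time) for each matching log line."""
    for match in PATTERN.finditer(log):
        yield int(match.group(1)), float(match.group(2)), float(match.group(3))


rows = list(parse(LOG))
stalls = [i for i, _, data in rows if data > 10.0]  # arbitrary 10 s threshold
total_time = sum(t for _, t, _ in rows)
total_data = sum(d for _, _, d in rows)
data_share = total_data / total_time
print("stalled iterations:", stalls)
print("share of time spent waiting for data: %.3f" % data_share)
```

If the share is close to 1, the run is bound by data loading rather than by the model, and faster storage or more loader threads would be the first things to try.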
Hi Kenshohara, I'm not sure how to evaluate the fine-tuned model on the HMDB51 test set. I also wonder how to load the HMDB51 pretrained weights into your model. I've tried, but it reports some dictionary errors.
@kenshohara Thanks for your wonderful work, and sorry for bothering you!
I see that you have released c3d-sports1m-kinetics.t7 (608 MB). Does it have the same architecture and the same performance as C3D (https://arxiv.org/abs/1412.0767)?
Besides, I cannot find results for this model on any dataset in your papers. Could you please report its accuracy on some datasets (UCF101, HMDB51)?
Finally, could you please share the code for fine-tuning c3d-sports1m-kinetics.t7 on UCF101?
Thank you in advance!
When running the code for ResNeXt-101 (fine-tuning the Kinetics-pretrained ResNeXt-101 model), I got a clip accuracy of around 86%.
Since val.json is created during testing and I want the video-level accuracy, I first need to test by setting no_val and no_train to True and test to True in opts.py.
I ran the code with the above options, and it created val.json.
Evaluating val.json gives an accuracy of 0.28.
Do I need to change any other option in opts.py?
Could you please list the options for testing only with a fine-tuned model?
Hi
Since I cannot find the training accuracy you achieved on the Kinetics training set, I am not sure whether the accuracy I obtained is high enough.
I obtain around 30% training accuracy; however, the loss is similar to that reported in the paper (around 3), so I am not sure whether the accuracy is also similar.
Are the inputs of all pretrained models normalized by the mean? I see that you added the Kinetics mean, whereas several weeks ago there was only the ActivityNet mean. Could you tell us which mean was used to train the pretrained models? When I fine-tune the pretrained models on my custom dataset, do I need to compute the mean of the custom dataset?
KeyError: 'unexpected key "module.features.conv0.weight" in state_dict'
I'm using the densenet-201-kinetics.pth file with the densenet.py file from the models folder.
net = densenet.densenet201(sample_size=64, sample_duration=30, num_classes=400)
pretrained_weights = torch.load(pretrained_path)
net.load_state_dict(pretrained_weights['state_dict'])
Hi @kenshohara,
In opts.py, can I change the temporal duration of the inputs in parser.add_argument('--sample_duration', default=16, type=int, help='Temporal duration of inputs') to, for example, 32 or 64 frames? Have you run similar experiments? I would really appreciate your reply. Thanks.
I am training ResNet with depth 34 on the Kinetics dataset; however, training is not improving. How long does it take until the model starts improving? I have attached a screenshot; currently I am at epoch 34, but the loss is still 5.99 and is not decreasing, and the accuracy is very volatile.
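As a point of reference for the number above: a classifier that predicts all 400 Kinetics classes with equal probability has a cross-entropy loss of ln(400), so a loss stuck at 5.99 is what chance-level output looks like. A quick arithmetic check (the helper name is illustrative):

```python
import math


def chance_level_loss(n_classes):
    """Cross-entropy, in nats, of a uniform prediction over n_classes."""
    return -math.log(1.0 / n_classes)


loss = chance_level_loss(400)  # Kinetics has 400 classes
print(round(loss, 2))  # → 5.99
```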
The ResNet and ResNeXt models (I haven't checked the others) appear to initialize weights with Kaiming He's initialization method (as used in his ResNet paper), drawing from a normal distribution. However, comparing the code with the details in the paper and with PyTorch's own implementation (PyTorch does implement Kaiming initialization), the Conv3d weight initialization seems to omit the third kernel dimension. That is, instead of using
    if isinstance(m, nn.Conv3d):
        # The kernel is 3D, but only the time and row dimensions are counted here
        n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
        m.weight.data.normal_(0, math.sqrt(2. / n))
we can use PyTorch's implementation directly:
    nn.init.kaiming_normal(m.weight, mode='fan_out')
whose fan-in/fan-out factor is calculated by
    num_input_fmaps = tensor.size(1)
    num_output_fmaps = tensor.size(0)
    receptive_field_size = 1
    if tensor.dim() > 2:
        # should equal kernel_size[0] * kernel_size[1] * kernel_size[2]
        receptive_field_size = tensor[0][0].numel()
    fan_in = num_input_fmaps * receptive_field_size
    fan_out = num_output_fmaps * receptive_field_size
I've done a quick test (a single 200-epoch training run) on the mini-Kinetics dataset, and fixing the weight initialization seems to improve accuracy.
Let me know if this makes sense.
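To make the size of the discrepancy concrete, here is a small sketch comparing the standard deviation from the two-dimension formula quoted above with the one from the full 3D receptive field. The layer shape (3x3x3 kernel, 64 output channels) is an illustrative assumption, and `he_std` is a hypothetical helper.

```python
import math


def he_std(kernel_size, out_channels, dims):
    """Std of Kaiming-normal init (fan_out mode) using the first `dims` kernel dimensions."""
    receptive_field = 1
    for k in kernel_size[:dims]:
        receptive_field *= k
    fan_out = out_channels * receptive_field
    return math.sqrt(2.0 / fan_out)


kernel_size, out_channels = (3, 3, 3), 64  # illustrative layer, not taken from the repository
std_two_dims = he_std(kernel_size, out_channels, dims=2)    # formula quoted in the issue
std_three_dims = he_std(kernel_size, out_channels, dims=3)  # full 3D receptive field
ratio = std_two_dims / std_three_dims
print(std_two_dims, std_three_dims, ratio)
```

For a kernel of temporal size 3, the two-dimension formula yields a standard deviation larger by a factor of sqrt(3) than the full 3D computation.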
Hi, I fine-tuned the pretrained resnet-34-kinetics model on HMDB51 with the following command:
python main.py --root_path ~/data --video_path hmdb51/jpg --annotation_path hmdb51_1.json --result_path resnet34_finetune_hmdb51_results --dataset hmdb51 --n_classes 400 --n_finetune_classes 51 --pretrain_path models/resnet-34-kinetics.pth --ft_begin_index 4 --model resnet --model_depth 34 --resnet_shortcut A --batch_size 128 --n_threads 4 --checkpoint 5
When I check the performance on the train and val sets, the training accuracy is fairly high (around 0.8), while the validation accuracy stays around 0.5. Is this expected? It seems the model overfits the training set.
Please mention the hardware and software requirements, and brief steps for using this code.
@kenshohara Thanks for your wonderful work, and sorry for bothering you.
I am trying to reproduce your work, but I ran into a problem. I did not change your code except where necessary. Could you please help me fix this problem? Thank you so much.
The problem is shown below:
Traceback (most recent call last):
  File "main.py", line 123, in <module>
    opt, train_logger, train_batch_logger)
  File "/home/deep_ww/3D-ResNets-PyTorch-master/train.py", line 29, in train_epoch
    outputs = model(inputs)
  File "/home/deep_ww/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/deep_ww/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 66, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/deep_ww/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/deep_ww/3D-ResNets-PyTorch-master/resnet.py", line 164, in forward
    x = self.fc(x)
  File "/home/deep_ww/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/deep_ww/anaconda2/lib/python2.7/site-packages/torch/nn/modules/linear.py", line 55, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/deep_ww/anaconda2/lib/python2.7/site-packages/torch/nn/functional.py", line 835, in linear
    return torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch at /opt/conda/conda-bld/pytorch_1512946747676/work/torch/lib/THC/generic/THCTensorMathBlas.cu:243
Hi, great work! Thank you.
Could you please tell me how long it took for you to fine-tune resnet-18(pretrained on kinetics) on UCF101 to get the reported accuracy (~84%)? Also, was there any specific hyperparameter setting I need to change while fine-tuning except freezing conv layers?
-Ananth
You mentioned downloading the datasets using the official crawler code. Could you point me to the official code or give some guidance on downloading the dataset?
Did you normalize the inputs to [0,1] in the training phases of the pretrained models?
Dear Kensho
Thanks for your github, which is very useful for the community! I wonder if any pre-trained ActivityNet model could be downloaded?
Thanks & Bests
OM
flake8 testing of https://github.com/kenshohara/3D-ResNets-PyTorch on Python 2.7.13
$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics
./activitynet.py:22:16: F821 undefined name 'accimage'
return accimage.Image(path)
^
./kinetics.py:22:16: F821 undefined name 'accimage'
return accimage.Image(path)
^
Hi, when I apply the pretrained model (resnext-101-64f-kinetics.pth) to the validation set of the Kinetics dataset, the accuracy turns out to be very low (close to random prediction). I have checked the loader and did not modify the code. I am wondering where the pretrained model comes from. Did you train it from scratch with the PyTorch code, or transfer the weights from the Torch version of the pretrained model? Thanks!
Hi,
Can you please explain how to choose the batch size?
My system has one GPU (8 GB memory) and 64 GB of system memory.
When I tried to run with the default batch size of 128, it gave a runtime error: out of memory.
Then I reduced it to 20, and it started working.
But I would like to know what an appropriate batch size is for my system configuration, a little about how to set the batch size, and whether it will affect accuracy.
I would be very thankful for your suggestions on this.
Hi,
I need help. When running main.py, everything goes well up to dataset loading, as shown below:
model generated
dataset loading [0/9537]
dataset loading [1000/9537]
dataset loading [2000/9537]
dataset loading [3000/9537]
dataset loading [4000/9537]
dataset loading [5000/9537]
dataset loading [6000/9537]
dataset loading [7000/9537]
dataset loading [8000/9537]
dataset loading [9000/9537]
dataset loading [0/3783]
dataset loading [1000/3783]
dataset loading [2000/3783]
dataset loading [3000/3783]
run
The error occurs here:
train at epoch 1
Traceback (most recent call last):
  File "/media/psrana/New Volume/chandni/HAR_3D_TU/main.py", line 139, in <module>
    train_logger, train_batch_logger)
  File "/media/psrana/New Volume/chandni/HAR_3D_TU/train.py", line 22, in train_epoch
    for i, (inputs, targets) in enumerate(data_loader):
  File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 417, in __iter__
    return DataLoaderIter(self)
  File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 242, in __init__
    self._put_indices()
  File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 290, in _put_indices
    indices = next(self.sample_iter, None)
  File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 119, in __iter__
    for idx in self.sampler:
  File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 50, in __iter__
    return iter(torch.randperm(len(self.data_source)).long())
RuntimeError: invalid argument 1: must be strictly positive at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/TH/generic/THTensorMath.c:2247
what could be the
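The failing call is `torch.randperm(len(self.data_source))`, which raises this error when the length is 0, so the dataset appears to have ended up empty even though the annotation file lists thousands of videos. One possible cause, stated as an assumption, is that the frame directories are not where the loader expects them, so every entry is skipped. A minimal sketch for counting how many annotated videos have a frame directory; `count_existing` is a hypothetical helper and the `<video_root>/<label>/<video_id>` layout is assumed.

```python
import os
import tempfile


def count_existing(video_root, entries):
    """Count (label, video_id) entries whose frame directory exists under video_root."""
    found = 0
    for label, video_id in entries:
        if os.path.isdir(os.path.join(video_root, label, video_id)):
            found += 1
    return found


# Demo on a throwaway directory tree where one of two expected folders is present.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "Archery", "v_Archery_g01_c01"))
    entries = [("Archery", "v_Archery_g01_c01"), ("Archery", "v_Archery_g01_c02")]
    n_found = count_existing(root, entries)
    print("%d of %d frame directories found" % (n_found, len(entries)))
```

A count of zero on the real data would point at the --root_path / --video_path combination rather than at the training code.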
Hi, first of all, you did very nice work, and I am very interested in it.
But right now I can't download the Kinetics dataset with the official crawler because I can't access YouTube (no VPN).
So, could you please host the dataset somewhere, such as Google Drive?
Best wishes.
Sincerely, ckjiao (LakyTT)
First, great job!
I used the Kinetics data to train a resnet-34-kinetics model. Every action has 50 MP4 files. I trained the model like this:
python3 main.py --root_path ~/3D-ResNets-PyTorch --video_path kineticsJPG --annotation_path kineticsjson/kinetics.json --result_path results --dataset kinetics --model resnet --model_depth 34 --resnet_shortcut A --pretrain_path models/resnet-34-kinetics.pth --n_classes 400 --batch_size 30 --n_threads 4 --checkpoint 5
It successfully created a model, "save_30.pth".
My train.log file looks like this:
epoch loss acc lr
21 5.05777625165984 0.050443081117927745 0.1
22 5.051979898375048 0.05023333857689686 0.1
23 5.037916659033671 0.05075769492947407 0.1
24 5.020105431997609 0.05448062503277227 0.1
25 5.006561265645096 0.05689266425462745 0.1
26 4.986441563412287 0.05615856536101935 0.1
27 4.982790104450624 0.05909496093545173 0.1
28 4.962981188055365 0.05883278275916313 0.1
29 4.957985667213246 0.06077290126369881 0.1
30 4.930375250401841 0.06276545540349221 0.1
Is this OK?
When I use "video-classification-3d" to check the model, the command is:
python3 main.py --input ./input --video_root ./videos --output ./output.json --model pathto/save_30.pth --mode score
I find the results are poor.
Why?
Is the train.log right?
Did I train the model sufficiently?
Hi,
Can you please upload your val.json for UCF101 split 1?
Hi,
Nice work! I have a question about your results on UCF101 split 1. I've evaluated your pretrained weight of "resnext-101-kinetics-ucf101_split1.pth" on UCF101 split 1 and got the accuracy of ~85.99. I'm wondering if it is the right accuracy or not. Would you please provide the accuracies of the pretrained models?
I downloaded the network ResNet-101 pretrained on Kinetics, and fine-tuned on UCF101 following the example script. However, I can only get 82.5 by averaging the three splits. In the paper, the authors reported 88.9. Any suggestion?
Datasets such as UCF101 or HMDB51 label the entire image.
The Google AVA dataset, by contrast, has both action annotations and bounding box annotations.
Is it possible to modify the network to add bounding box regression (object detection, as in YOLO)?
That could make the network more accurate by learning the human performing the action.
Hi,
I have seen you uploaded pretrained models using 64 frames for ucf101 and hmdb51.
Is there a chance you can upload a 64f model pretrained using kinetics only?
I would like to compare results between 16f and 64f, but in order to make a proper comparison I would rather use the pretrained model only on kinetics as well.
Thanks for the great work!
I am running the command of the readme
python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json
--result_path results --dataset kinetics --model resnet
--model_depth 34 --n_classes 400 --batch_size 128 --n_threads 4 --checkpoint 5
(with my options), but somehow if I check in my results folder, where opts.json and stuff is saved, there is no pth file saved, even after 100 epochs of training. Do I have to specify the checkpoint path too when calling main.py ?
Hi,
Thanks for releasing this awesome repository. However, I cannot find the test set JSON file (matching each youtube_id to a label) on the Kinetics dataset webpage.
Could you show where to find it?
Thanks
According to line 181 of kinetics.py and line 69 of main.py, because of randomize_parameters(), it seems that the 16 frames in one clip get different spatial transforms. Will this have any impact? In my opinion, applying a uniform spatial transform to all 16 frames in a clip would be more reasonable.
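Whether frames in a clip share a transform depends on where the random parameters are drawn. A sketch of the pattern in question, under the assumption that parameters are drawn only inside randomize_parameters() and that it is called once before the per-frame loop; `RandomCrop1D` is a toy stand-in, not the repository's class.

```python
import random


class RandomCrop1D:
    """Toy stand-in for a spatial transform: crops a window from a list of pixels."""

    def __init__(self, size):
        self.size = size
        self.offset = 0

    def randomize_parameters(self, length, rng=random):
        # Draw the crop position once; it stays fixed until the next call.
        self.offset = rng.randint(0, length - self.size)

    def __call__(self, frame):
        return frame[self.offset:self.offset + self.size]


frames = [list(range(10)) for _ in range(16)]  # 16 identical toy "frames"
crop = RandomCrop1D(size=4)
crop.randomize_parameters(length=10)           # once per clip, not once per frame
clip = [crop(frame) for frame in frames]
same_crop = all(c == clip[0] for c in clip)
print(same_crop)
```

With this structure every frame of a clip receives the same crop, and a new crop is drawn only when randomize_parameters() is called again for the next clip.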
I am trying to use resnet-34 (cpu version) for both classification and feature extraction. Here is the error:
(tensorflow) mariankyoussef@elecsim:~/video-classification-3d-cnn-pytorch$ python main.py --input ./input --video_root ./home/mariankyoussef/UCF101_videos --output ./output.json --model ./models/resnet-34-kinetics-cpu.pth --mode score --no_cuda
loading model ./models/resnet-34-kinetics-cpu.pth
Traceback (most recent call last):
  File "main.py", line 24, in <module>
    model_data = torch.load(opt.model)
  File "/home/mariankyoussef/anaconda3/envs/tensorflow/lib/python3.6/site-packages/torch/serialization.py", line 267, in load
    return _load(f, map_location, pickle_module)
  File "/home/mariankyoussef/anaconda3/envs/tensorflow/lib/python3.6/site-packages/torch/serialization.py", line 410, in _load
    magic_number = pickle_module.load(f)
_pickle.UnpicklingError: invalid load key, '<'.
I haven't tried to fix the error yet, but I am wondering whether the first couple of lines have the right settings.
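An `invalid load key, '<'` message means the first byte of the file is `<`, which is typical of an HTML page saved in place of the binary checkpoint (for example, a download page or an error page). That reading of this particular file is an assumption. A minimal check, with `looks_like_html` as a hypothetical helper:

```python
import os
import tempfile


def looks_like_html(path, n_bytes=64):
    """Return True if the file starts with '<', as an HTML or XML page would."""
    with open(path, "rb") as handle:
        head = handle.read(n_bytes)
    return head.lstrip().startswith(b"<")


# Demo with a fake "checkpoint" that is really a saved web page.
with tempfile.NamedTemporaryFile(suffix=".pth", delete=False) as tmp:
    tmp.write(b"<!DOCTYPE html><html>...</html>")
    fake_checkpoint = tmp.name

is_html = looks_like_html(fake_checkpoint)
print(is_html)
os.remove(fake_checkpoint)
```

If the real .pth file passes this check as HTML, downloading the model again is the likely fix.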
Hi, nice work!
I fine-tuned my model on HMDB51 and got a checkpoint, save_200.pth. Then I ran the following script to evaluate on the validation set.
python main.py --root_path ~/Research/datasets --video_path hmdb51/jpg --annotation_path hmdb51/testTrainMulti/hmdb51_1.json --result_path ~/Research/3D-ResNets-PyTorch/results/hmdb51 --dataset hmdb51 --n_finetune_classes 51 --n_classes 51 --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32 --batch_size 32 --n_threads 4 --test --test_subset val --pretrain_path ~/Research/3D-ResNets-PyTorch/results/hmdb51/save_200.pth --no_train --no_val
After that I got a file called val.json. Then I ran the evaluation script you provide in utils/eval_hmdb51.py.
    hmdb = HMDBclassification('hmdb51_1.json', 'val.json', verbose=True, top_k=1)
    hmdb.evaluate()
Then I got
[INIT] Loaded annotations from validation subset.
Number of ground truth instances: 1530
Number of predictions: 15290
[RESULTS] Performance on ActivityNet untrimmed video classification task.
Error@1: 0.9784313725490196
It seems that something is wrong with the number of predictions. Can you tell me how you run the evaluation script? Thanks :)
Did you try the Charades dataset? It seems to need more temporal information to classify.
I used the code to fine-tune on HMDB51 with the following command.
python main.py --root_path ~/Research/datasets --video_path hmdb51/jpg --annotation_path hmdb51/testTrainMulti/hmdb51_1.json --result_path ~/Research/3D-ResNets-PyTorch/results/hmdb51 --dataset hmdb51 --n_classes 400 --n_finetune_classes 51 --pretrain_path ~/Research/3D-ResNets-PyTorch/pretrain/resnext-101-kinetics.pth --ft_begin_index 5 --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32 --batch_size 32 --n_threads 4 --checkpoint 5 --n_epochs 20
After training, I got the 'save_20.pth' weights. Then I ran the following command to continue training from the 21st epoch.
python main.py --root_path ~/Research/datasets --video_path hmdb51/jpg --annotation_path hmdb51/testTrainMulti/hmdb51_2.json --result_path ~/Research/3D-ResNets-PyTorch/results/hmdb51 --dataset hmdb51 --n_classes 51 --resume_path ~/Research/3D-ResNets-PyTorch/results/hmdb51/save_20.pth --model resnext --model_depth 101 --resnet_shortcut B --resnext_cardinality 32 --batch_size 32 --n_threads 4 --checkpoint 5 --n_epochs 20
I got an error:
Traceback (most recent call last):
  File "main.py", line 131, in <module>
    optimizer.load_state_dict(checkpoint['optimizer'])
  File "/home/ole/anaconda3/lib/python3.6/site-packages/torch/optim/optimizer.py", line 87, in load_state_dict
    raise ValueError("loaded state dict has a different number of
ValueError: loaded state dict has a different number of parameter groups
How can I fix this?
Hello!
I got 85.2% when fine-tuning resnet-50 on UCF-101 split 1, instead of 89%. My settings are:
python main.py --root_path ~/big/3D-ResNets-PyTorch --video_path ~/big/UCF-101_jpg --annotation_path utils/ucf101_01.json --result_path ucf101_results --dataset ucf101 --n_classes 400 --n_finetune_classes 101 --pretrain_path model_weights/resnet-50-kinetics.pth --ft_begin_index 4 --model resnet --model_depth 50 --resnet_shortcut B --batch_size 128 --n_threads 4 --checkpoint 5 --learning_rate 0.001
Hi, where can I get the files described as "annotation_dir_path includes brush_hair_test_split1.txt"? I can't find them on the dataset website.
I am running the following command.
CUDA_VISIBLE_DEVICES=2 python main.py --root_path --video_path ucf101_jpg --annotation_path ucfTrainTestlist/ucf101_01.json --result_path results --dataset ucf101 --n_classes 400 --n_finetune_classes 101 --pretrain_path models/resnet-34-kinetics.pth --model resnet --model_depth 34 --resnet_shortcut A --batch_size 128 --n_threads 4 --no_train --no_val --test
I followed the steps in the readme to set up the ucf101 jpg frames and annotations. I printed out the shape of x after each layer in the forward pass and I get the following before the size mismatch error occurs.
input: (128L, 3L, 16L, 112L, 112L)
self.conv1 output: (128L, 64L, 16L, 56L, 56L)
self.bn1 output: (128L, 64L, 16L, 56L, 56L)
self.relu output: (128L, 64L, 16L, 56L, 56L)
self.maxpool output: (128L, 64L, 8L, 28L, 28L)
self.layer1 output: (128L, 64L, 8L, 28L, 28L)
self.layer2 output: (128L, 128L, 4L, 14L, 14L)
self.layer3 output: (128L, 256L, 2L, 7L, 7L)
self.layer4 output: (128L, 512L, 1L, 4L, 4L)
self.avgpool output: (128L, 512L, 1L, 2L, 2L)
x.view(x.size(0), -1) output: (128L, 2048L)
The issue is that self.fc is (512, 101). As a temporary hack, I changed the stride of self.avgpool from 1 to 2, but otherwise I am not sure where the error is.
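The `L` suffixes in the shapes above indicate Python 2. If the pooling kernel is derived as `ceil(sample_size / 32)` (an assumption about how resnet.py computes it), Python 2's integer division turns 112 / 32 into 3 before the ceiling is applied, so the kernel is 3 instead of 4 and the 4x4 feature map pools to 2x2 instead of 1x1. A sketch of the arithmetic, with `pooled_size` as a hypothetical helper:

```python
import math


def pooled_size(input_size, kernel, stride=1):
    """Output length of an unpadded pooling window along one axis."""
    return (input_size - kernel) // stride + 1


sample_size = 112
feature_map = 4  # spatial size after layer4, from the shapes listed above

kernel_true_div = int(math.ceil(sample_size / 32.0))   # float division, as in Python 3
kernel_floor_div = int(math.ceil(sample_size // 32))   # what Python 2's `/` does on ints

for kernel in (kernel_true_div, kernel_floor_div):
    side = pooled_size(feature_map, kernel)
    # kernel size, pooled side length, flattened feature count for 512 channels
    print(kernel, side, 512 * side * side)
```

Under this assumption, forcing float division or running under Python 3 would restore the 512-feature input that self.fc expects, while the floor-division case reproduces the 2048 features reported above.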
When fine-tuning the conv5_x and fc layers of a pretrained model on UCF-101, I got a size mismatch error. I have checked that the shape of the UCF-101 input data is (128L, 3L, 16L, 112L, 112L).
Complete error message:
Traceback (most recent call last):
  File "main.py", line 140, in <module>
    train_logger, train_batch_logger)
  File "/home/magic/yc/ActionRecognition/3D-ResNets-PyTorch-master/train.py", line 34, in train_epoch
    outputs = model(inputs)
  File "/home/magic/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/magic/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 60, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/magic/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 70, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/magic/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/parallel_apply.py", line 67, in parallel_apply
    raise output
RuntimeError: size mismatch at /pytorch/torch/lib/THC/generic/THCTensorMathBlas.cu:243
My command:
python main.py --root_path ./data --video_path UCF101/jpg --annotation_path ucf101_01.json --result_path results --dataset ucf101 --n_classes 400 --n_finetune_classes 101 --pretrain_path models/resnet-34-kinetics.pth --ft_begin_index 4 --model resnet --model_depth 34 --resnet_shortcut A --batch_size 128 --n_threads 4 --checkpoint 5
dataset loading [0/3570]
dataset loading [1000/3570]
dataset loading [2000/3570]
dataset loading [3000/3570]
dataset loading [0/1530]
dataset loading [1000/1530]
run
train at epoch 1
Epoch: [1][1/112] Time 4.807 (4.807) Data 2.836 (2.836) Loss 3.9053 (3.9053) Acc 0.000 (0.000)
/pytorch/torch/lib/THCUNN/ClassNLLCriterion.cu:101: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [26,0,0] Assertion t >= 0 && t < n_classes failed.
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generated/../THCReduceAll.cuh line=339 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "main.py", line 137, in <module>
    train_logger, train_batch_logger)
  File "/media/ole/Document/Ubuntu/Research/3D-ResNets-PyTorch/train.py", line 31, in train_epoch
    acc = calculate_accuracy(outputs, targets)
  File "/media/ole/Document/Ubuntu/Research/3D-ResNets-PyTorch/utils.py", line 58, in calculate_accuracy
    n_correct_elems = correct.sum().data[0]
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generated/../THCReduceAll.cuh:339
terminate called after throwing an instance of 'std::runtime_error'
what(): cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generic/THCStorage.c:184
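The assertion `t >= 0 && t < n_classes` in the log above fails when a target label index falls outside the range of the model's output layer, for example when the layer has fewer outputs than the dataset has classes. A minimal sketch for checking labels before training; `out_of_range_labels` is a hypothetical helper and the label values are illustrative.

```python
def out_of_range_labels(targets, n_outputs):
    """Return the sorted distinct labels that do not satisfy 0 <= t < n_outputs."""
    return sorted({t for t in targets if not 0 <= t < n_outputs})


targets = [0, 12, 50, 51, 100]  # illustrative label indices
bad = out_of_range_labels(targets, n_outputs=51)
print(bad)  # → [51, 100]
```

Running the failing step on CPU, or with the environment variable CUDA_LAUNCH_BLOCKING=1, generally reports the error closer to the line that caused it than the asynchronous device-side assert does.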