rstarcnn's Issues

Got lower AP with the VGG16 reference model

Hi all,
When I run the code on the PASCAL VOC 2012 Action dataset with the trained models, the output is similar to that in the paper. However, the results I get by training from the reference model myself are lower than those in the paper. For example, the AP for phoning is 0.138, which is much lower than the paper's number.
The scripts for training and testing are as follows.

./tools/train_net.py --gpu 0 --solver models/VGG16_RstarCNN/solver.prototxt --weights reference_models/VGG16.v2.caffemodel
./tools/test_net.py --gpu 0 --def models/VGG16_RstarCNN/test.prototxt --net output/default/voc_2012_train/vgg16_fast_rcnn_joint_train_iter_40000.caffemodel

How to create .mat files of VOC selective search regions

Hi Georgia, I reviewed your code and found that the selective search region proposals are loaded directly from cached MATLAB files (line 155 in RStarCNN_ROOT/lib/datasets/pascal_voc.py). As far as I know, the selective search code (like the one here: https://github.com/sergeyk/selective_search_ijcv_with_python) doesn't produce a MATLAB file as output. Could you give me the code that generated the .mat files of selective search regions? I want to use RStarCNN on another dataset. Thanks!
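Not the authors' original script, but a minimal sketch of one way to produce such a file, assuming the loader follows the fast-rcnn cache convention of a 'boxes' cell array holding 1-indexed (y1, x1, y2, x2) boxes, and using opencv-contrib's selective search as a stand-in for the MATLAB version:

    # Sketch: build a selective-search .mat file in the fast-rcnn cache
    # format. Needs opencv-contrib-python (cv2.ximgproc) and scipy; the
    # output file name is hypothetical.
    import cv2
    import numpy as np
    import scipy.io as sio

    def ss_boxes(image_path):
        ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
        ss.setBaseImage(cv2.imread(image_path))
        ss.switchToSelectiveSearchFast()
        rects = ss.process()                  # rows of (x, y, w, h), 0-indexed
        x, y, w, h = rects[:, 0], rects[:, 1], rects[:, 2], rects[:, 3]
        # convert to 1-indexed (y1, x1, y2, x2)
        return np.stack([y + 1, x + 1, y + h, x + w], axis=1).astype(np.float64)

    image_paths = ['data/demo/example.jpg']   # hypothetical list, in imdb order
    boxes = np.empty((len(image_paths),), dtype=object)
    for i, path in enumerate(image_paths):
        boxes[i] = ss_boxes(path)
    sio.savemat('voc_2012_val.mat', {'boxes': boxes})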

Floating point exception

Hello, I'm sorry if I disturbed you! A new problem I've met: when training a model on my own dataset, a 'Floating point exception' occurs. Why?
Thank you!

Problem running the code on the Berkeley Attributes of People Dataset

I am trying to run the code on the Berkeley Attributes of People Dataset (BAPD). After downloading the BAPD dataset and the pre-trained models from the provided links, I modified tools/test_net.py by replacing imdb = datasets.pascal_voc('val','2012') with imdb = datasets.attr_bpad('BAPD'). Then I ran the following command:

./tools/test_net.py --gpu 0 --def models/VGG16_RstarCNN_attributes/test.prototxt --net models/VGG16_RstarCNN_attributes/bpad_rstarcnn_train.caffemodel

and got the following error:

line 216, in im_detect
    box_deltas = blobs_out['bbox_pred']
KeyError: 'bbox_pred'

The reason is that VGG16_RstarCNN_attributes/test.prototxt does not have the bbox_pred layer.
This is a little weird, because a bbox_pred layer exists in the other .prototxt files, such as VGG16_RstarCNN/test.prototxt and VGG16_RstarCNN_stanford40/test.prototxt.

Can anyone give me a hint on this issue please? Thanks in advance.
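A hedged workaround, assuming im_detect in lib/fast_rcnn/test.py follows the usual fast-rcnn structure (the _bbox_pred and _clip_boxes helpers are that convention, not verified in this fork): skip the regression step when the network exposes no bbox_pred output and reuse the input boxes instead:

    # Sketch of a guard around lib/fast_rcnn/test.py line 216: the attribute
    # model classifies fixed person boxes, so it has no bbox_pred layer.
    if 'bbox_pred' in blobs_out:
        box_deltas = blobs_out['bbox_pred']
        pred_boxes = _bbox_pred(boxes, box_deltas)
        pred_boxes = _clip_boxes(pred_boxes, im.shape)
    else:
        # No regressor: tile the input boxes once per class.
        pred_boxes = np.tile(boxes, (1, scores.shape[1]))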

Error Check failed: error == cudaSuccess (35 vs. 0) CUDA driver version is insufficient for CUDA runtime version

Hi,
I am trying to run the code to test the RCNN classifier on the VGG16 reference model, but I get the error:

F0408 15:16:28.126595 23712 common.cpp:141] Check failed: error == cudaSuccess (35 vs. 0) CUDA driver version is insufficient for CUDA runtime version
*** Check failure stack trace: ***
Aborted (core dumped)

Could you help me with the issue? Is it a problem with the CUDA version? If so, which version is needed to make the code run?

How can I plot the loss?

Hi, thanks for your wonderful work.

I've reproduced your work and found that you changed the snapshot-writing part.
How can I plot the two losses, then?

Deep down, I know I just need to change the default tools/extra/plot_training_log.py.example file.

Can you share your script?
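Not the authors' script, but a minimal sketch that parses the two training losses straight out of a captured Caffe log (the line format matches the logs quoted further down this page) and plots them:

    # Sketch: pull "Train net output #k: loss_xxx = v" lines from a Caffe
    # log and plot each loss against its iteration number.
    import re
    from collections import defaultdict
    import matplotlib.pyplot as plt

    iter_re = re.compile(r'Iteration (\d+), loss')
    loss_re = re.compile(r'Train net output #\d+: (\w+) = ([\d.e+-]+)')

    iters, losses = [], defaultdict(list)
    with open('train.log') as f:          # your captured training log
        for line in f:
            m = iter_re.search(line)
            if m:
                iters.append(int(m.group(1)))
            m = loss_re.search(line)
            if m:
                losses[m.group(1)].append(float(m.group(2)))

    for name, vals in losses.items():
        plt.plot(iters[:len(vals)], vals, label=name)
    plt.xlabel('iteration'); plt.ylabel('loss'); plt.legend(); plt.show()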

Is it possible to train the rstar model using multiple GPUs?

I'm learning the theory and the code of rstar, but I find that training takes quite a long time. Is it possible to train the action recognition rstar model using multiple GPUs? Does the code support this kind of parallelization? If so, how can I modify the code?

Check failed: error == cudaSuccess (2 vs. 0) out of memory

Hi,
I train a classifier with :
./tools/train_net.py --gpu 0 --solver models/VGG16_RstarCNN/solver.prototxt --weights reference_models/VGG16.v2.caffemodel
and get an error:
F1007 19:40:40.235829 26752 syncedmem.cpp:66] Check failed: error == cudaSuccess (2 vs. 0) out of memory *** Check failure stack trace: *** Aborted (core dumped)
Where can I modify the 'batch_size'?
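In fast-rcnn-style code the ROI batch size usually lives in lib/fast_rcnn/config.py rather than in the prototxt. A hedged sketch of the knobs to lower (names follow the fast-rcnn defaults and may differ in this fork):

    # Sketch: fast-rcnn-style memory knobs (assumed names). BATCH_SIZE is the
    # number of ROIs sampled per minibatch, IMS_PER_BATCH the number of images,
    # and SCALES the input resolution; all three trade accuracy for memory.
    from fast_rcnn.config import cfg

    cfg.TRAIN.BATCH_SIZE = 64      # fast-rcnn default is 128
    cfg.TRAIN.IMS_PER_BATCH = 1    # fast-rcnn default is 2
    cfg.TRAIN.SCALES = (480,)      # smaller input images also reduce GPU memory

If this fork kept fast-rcnn's --cfg flag on train_net.py, the same keys can be set from a YAML file instead of editing the code.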

Calculate the accuracy on test dataset

Hi gkioxari, I reproduced your code on the UCF101 dataset and finally wrote all results to text files (using a function adapted from _write_voc_results_file in lib/fast_rcnn/test.py). Each line in those files looks like this:

index id score

My question is how to calculate the accuracy on the whole dataset using the scores in those files. I viewed all the issues in this project but haven't found an answer yet, so could you please give me some hints? Thanks.
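Not an official recipe, but one sketch, assuming one results file per class in the format above and exactly one true label per (index, id) instance; the file naming and the label source are hypothetical:

    # Sketch: combine per-class score files ("index id score" per line)
    # into a single accuracy number.
    from collections import defaultdict

    classes = ['ApplyEyeMakeup', 'ApplyLipstick']  # ...all your UCF101 classes
    scores = defaultdict(dict)                     # (index, id) -> {class: score}
    for cls in classes:
        with open('comp_action_test_%s.txt' % cls) as f:   # hypothetical name
            for line in f:
                index, obj_id, score = line.split()
                scores[(index, obj_id)][cls] = float(score)

    true_labels = {}  # fill with (index, id) -> true class name from annotations
    hits = sum(max(s, key=s.get) == true_labels.get(k) for k, s in scores.items())
    print('accuracy: %.4f' % (hits / float(len(scores))))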

P.S.: Maybe you can create a Gitter for this project so we can discuss more efficiently.

How can I train RstarCNN with the ZF pretrained model?

Hi,
Because I couldn't train RstarCNN with the VGG16 model (my GPU doesn't have enough memory: I use a GeForce GT 630), I'm trying to do the training phase with the ZF model, which seems to need less GPU memory. But I still have some problems changing the parameters of the layers in "ZF.caffemodel".
Has anyone tried this before? Can anyone help me, please?
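One hedged way to see how far the two architectures line up: Caffe copies weights by layer name, so any layer in your prototxt whose name does not appear in ZF.caffemodel is left at its random initialization, and a matching name with a different shape aborts loading. A sketch (the ZF prototxt path is hypothetical):

    # Sketch: inspect which layers of a ZF-based RstarCNN prototxt line up
    # with the reference ZF.caffemodel; Caffe matches layers by name.
    import caffe

    net = caffe.Net('models/ZF_RstarCNN/train.prototxt',   # hypothetical path
                    'reference_models/ZF.caffemodel',
                    caffe.TRAIN)
    for name, params in net.params.items():
        print(name, [tuple(p.data.shape) for p in params])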

IOError: [Errno 2] No such file or directory:/VOCdevkit2012/results/VOC2012/Action/comp10_action_val_jumping.txt'

Dear gkioxari,
I'm sorry to bother you, but I've run into some trouble. When I run the test, I get the following error:

Writing jumping VOC results file
Traceback (most recent call last):
File "./tools/test_net.py", line 75, in
test_net(net, imdb)
File "/home/customer/xubing/RstarCNN/tools/../lib/fast_rcnn/test.py", line 285, in test_net
imdb._write_voc_results_file(all_boxes)
File "/home/customer/xubing/RstarCNN/tools/../lib/datasets/pascal_voc.py", line 247, in _write_voc_results_file
with open(filename, 'wt') as f:
IOError: [Errno 2] No such file or directory: '/home/customer/xubing/RstarCNN/tools/../lib/datasets/../../data/VOCdevkit2012/results/VOC2012/Action/comp10_action_val_jumping.txt'

But there is no 'results' subdirectory in 'VOCdevkit2012'. Could you tell me how to get this directory, or which step I did wrong before?
Could you help me? Thanks for your help.
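The writer opens the file without creating its parent directories first, so one workaround (a sketch, not the authors' intended setup) is to create the Action results tree before running test_net.py, from the RstarCNN root:

    # Sketch: create the results directory that _write_voc_results_file expects.
    import os

    results_dir = os.path.join('data', 'VOCdevkit2012', 'results',
                               'VOC2012', 'Action')
    if not os.path.isdir(results_dir):
        os.makedirs(results_dir)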

Data Augmentation

I would like to ask how Data Augmentation is performed in the case of the Baseline RCNN that only uses groundtruth ROIs as Primary Regions.

More specifically within the paper, you mention that:

Rather than limiting training to the ground-truth person locations, we use all regions that overlap more than 0.5 with a ground-truth box. This condition serves as a form of data augmentation. For every primary region, we randomly select N regions from the set of candidate secondary regions. N is a function of the GPU memory limit (we use a Nvidia K40 GPU) and the batch size.

We fine-tune our network starting with a model trained on ImageNet-1K for the image classification task. We tie the weights of the fully connected primary and secondary layers (fc6, fc7), but not for the final scoring models. We set the learning rate to 0.0001, the batch size to 30 and consider 2 images per batch. We pick N = 10 and train for 10K iterations. Larger learning rates prevented fine-tuning from converging.

Thus, for the simple RCNN baseline that uses only primary regions and no secondary regions, this means that each batch contains 2 images and 30 ROIs for the ROI-pooling layer.

Assuming the above holds, if the two images contain only 1 primary region each, what do you fill the rest of the batch with (there should be 28 slots left empty)?

Since the number of primary regions is not fixed per image, do you somehow enforce the number of data-augmentation samples to be balanced per class?

Would it be possible to share the results you achieve without using data augmentation?
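For concreteness, here is a minimal sketch of the region-sampling scheme the quoted passage describes, with boxes as (x1, y1, x2, y2) numpy arrays; the 0.5 overlap threshold and N = 10 come from the paper, everything else (including sampling secondaries from all proposals, where the paper further restricts the candidate secondary set by its overlap with the primary) is an assumption:

    # Sketch: take every proposal overlapping a ground-truth person box by
    # IoU > 0.5 as a primary region (the data augmentation), then sample N
    # candidate secondary regions per primary.
    import numpy as np

    def iou(box, boxes):
        x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
        x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
        inter = np.maximum(0, x2 - x1 + 1) * np.maximum(0, y2 - y1 + 1)
        area = lambda b: (b[..., 2] - b[..., 0] + 1) * (b[..., 3] - b[..., 1] + 1)
        return inter / (area(box) + area(boxes) - inter)

    def sample_regions(gt_box, proposals, N=10, rng=np.random):
        primaries = proposals[iou(gt_box, proposals) > 0.5]
        batch = []
        for p in primaries:
            idx = rng.choice(len(proposals), size=N, replace=False)  # needs >= N proposals
            batch.append((p, proposals[idx]))    # (primary, its N secondaries)
        return batch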

.build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'

If anyone hits this error when building caffe-fast-rcnn, it can be fixed by adding "opencv_imgcodecs" to the LIBRARIES line in Makefile.config, so that

LIBRARIES += glog gflags protobuf leveldb snappy \
    lmdb boost_system hdf5_hl hdf5 m \
    opencv_core opencv_highgui opencv_imgproc

becomes:

LIBRARIES += glog gflags protobuf leveldb snappy \
    lmdb boost_system hdf5_hl hdf5 m \
    opencv_core opencv_highgui opencv_imgproc opencv_imgcodecs

Then rebuild:

make clean
make -j8 && make pycaffe

reference: BVLC/caffe#2348

What is the back-propagation formula when fusing the primary and secondary streams?

Dear gkioxari

I am sorry to bother you; I want to rewrite R*CNN in MatConvNet, and I do not know the back-propagation formula where the primary and secondary streams are fused, at the following two points.

First:
layer { name: "sum_scores" type: "Sum" bottom: "cls_score" bottom: "mil_context_cls_score" top: "sum_cls_score" }

Second:
How are the following two branches fused?

layer {
  name: "context_roi_pool5"
  type: "ROIPooling"
  bottom: "conv5_3"
  bottom: "secondary_rois"
  top: "context_pool5"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}

layer {
  name: "roi_pool5"
  type: "ROIPooling"
  bottom: "conv5_3"
  bottom: "rois"
  top: "pool5"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}

Thanks!
Francis

Pose estimation with R-CNN code

I am Sameh NEILI, a PhD student researching artificial intelligence. I am very interested in your deep-learning results and would like to reproduce them with my own data, but when I looked at the code at this link (https://github.com/gkioxari/RstarCNN), I found only the 'action RCNN' part.
Is it possible to give me the code for the pose estimation part?
I will be very grateful for your help.

Technical Problem of Testing the Stanford40 Dataset

Hi, everyone. I tested on the PASCAL VOC 2012 dataset and RstarCNN works great! But there's a little problem when testing on Stanford40.

  • In the ./tools/test_net.py :
    (line 13)from fast_rcnn.test import test_net --> from fast_rcnn.test_stanford40 import test_net
    (line 16)import datasets.stanford40 was added.
    (line 78)imdb = datasets.pascal_voc('val','2012')--> imdb = datasets.stanford40('test')
  • In the ./lib/datasets/stanford40.py :
    (line 295)datasets.pascal_voc('trainval', '2012')--> d = datasets.stanford40('test')

After that, I typed
./tools/test_net.py --gpu 0 --def models/VGG16_RstarCNN_stanford40/test.prototxt --net ./trained_models/stanford40_rstarcnn_train.caffemodel into the terminal, and finally:
Network initialization done and Memory required for data: 117424480 appeared.
(This is satisfying output)

Then comes the problem, the terminal says:
File "./tools/test_net.py", line 78, in <module>
imdb = datasets.stanford40('test')
TypeError: 'module' object is not callable

I tried to figure out the problem myself, so I added the following code to ./tools/test_net.py:
print(type(datasets.pascal_voc)), which prints: <type 'type'>
print(type(datasets.stanford40)), which prints: <type 'module'>

pascal_voc.py and stanford40.py in ./lib/datasets/ are written in almost the same manner, so I don't know what causes the problem. If I missed something, please correct me.
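A plausible cause, assuming lib/datasets/__init__.py follows the fast-rcnn layout: the package imports the pascal_voc class into the datasets namespace (so datasets.pascal_voc is the class) but never does the same for stanford40, so datasets.stanford40 is the module, which is exactly what the two print statements show. A hedged fix:

    # Either reference the class inside the module:
    imdb = datasets.stanford40.stanford40('test')

    # ...or mirror what lib/datasets/__init__.py (assumed layout) does for
    # pascal_voc, by adding:
    # from .stanford40 import stanford40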

Question about the 'other' class in PASCAL VOC

@gkioxari, I have a question about the processing of the 'other' class in PASCAL VOC:

  1. In this line, the 'other' class is the last element in the class list.
  2. While in this line, I find that it ignores the class at index 0.
    Should it ignore the 'other' class? But according to 1, the index of the 'other' class is not 0.
    Any comments?

Loss jumps up and down

Hi Georgia and all, I'm training RstarCNN on my dataset and the loss jumps up and down (see below). I have investigated this for quite a long time but cannot figure it out. Any suggestions?

Running parameters:

  1. System info: K80, Ubuntu 14.04, CUDA 7.5
  2. using the default configuration in lib/fast_rcnn/config.py
  3. batch size: 256
  4. I have shuffled the training and testing sample list files to ensure that samples with the same true label are distributed randomly.

Solver prototxt file:

train_net: "models/VGG16_RstarCNN/ucf101/train.prototxt"
base_lr: 0.00001
lr_policy: "step"
gamma: 0.1
stepsize: 10000
display: 100
average_loss: 100
max_iter: 100000
# iter_size: 1
momentum: 0.9
weight_decay: 0.0005
# We disable standard caffe solver snapshotting and implement our own snapshot
# function
snapshot: 0
# We still use the snapshot prefix, though
snapshot_prefix: "random_train_ucf101_lr_000001_10000_with_annotation"
#debug_info: true

Output logs(a small part of snapshot since output is similar):

I1011 06:34:31.291971  5398 solver.cpp:242] Iteration 111700, loss = 1.02373                                                                                                       [1235/1235]
I1011 06:34:31.292038  5398 solver.cpp:258]     Train net output #0: loss_bbox = 0.823978 (* 1 = 0.823978 loss)
I1011 06:34:31.292053  5398 solver.cpp:258]     Train net output #1: loss_cls = 0.199757 (* 1 = 0.199757 loss)
I1011 06:34:31.292067  5398 solver.cpp:571] Iteration 111700, lr = 1e-16
I1011 06:34:58.248217  5398 solver.cpp:242] Iteration 111800, loss = 4.14667
I1011 06:34:58.248303  5398 solver.cpp:258]     Train net output #0: loss_bbox = 2.60533 (* 1 = 2.60533 loss)
I1011 06:34:58.248317  5398 solver.cpp:258]     Train net output #1: loss_cls = 1.54133 (* 1 = 1.54133 loss)
I1011 06:34:58.248332  5398 solver.cpp:571] Iteration 111800, lr = 1e-16
I1011 06:35:23.487130  5398 solver.cpp:242] Iteration 111900, loss = 0.180264
I1011 06:35:23.487207  5398 solver.cpp:258]     Train net output #0: loss_bbox = 0.163791 (* 1 = 0.163791 loss)
I1011 06:35:23.487224  5398 solver.cpp:258]     Train net output #1: loss_cls = 0.0164736 (* 1 = 0.0164736 loss)
I1011 06:35:23.487246  5398 solver.cpp:571] Iteration 111900, lr = 1e-16
speed: 0.302s / iter
I1011 06:35:48.810129  5398 solver.cpp:242] Iteration 112000, loss = 2.9106
I1011 06:35:48.810212  5398 solver.cpp:258]     Train net output #0: loss_bbox = 0.931796 (* 1 = 0.931796 loss)
I1011 06:35:48.810227  5398 solver.cpp:258]     Train net output #1: loss_cls = 1.9788 (* 1 = 1.9788 loss)
I1011 06:35:48.810242  5398 solver.cpp:571] Iteration 112000, lr = 1e-16
I1011 06:36:16.708410  5398 solver.cpp:242] Iteration 112100, loss = 1.21812
I1011 06:36:16.708484  5398 solver.cpp:258]     Train net output #0: loss_bbox = 0.906153 (* 1 = 0.906153 loss)
I1011 06:36:16.708498  5398 solver.cpp:258]     Train net output #1: loss_cls = 0.311964 (* 1 = 0.311964 loss)
I1011 06:36:16.708513  5398 solver.cpp:571] Iteration 112100, lr = 1e-16
I1011 06:36:41.948743  5398 solver.cpp:242] Iteration 112200, loss = 0.705015
I1011 06:36:41.948823  5398 solver.cpp:258]     Train net output #0: loss_bbox = 0.674399 (* 1 = 0.674399 loss)
I1011 06:36:41.948838  5398 solver.cpp:258]     Train net output #1: loss_cls = 0.0306158 (* 1 = 0.0306158 loss)
I1011 06:36:41.948853  5398 solver.cpp:571] Iteration 112200, lr = 1e-16
I1011 06:37:07.298220  5398 solver.cpp:242] Iteration 112300, loss = 2.5305
I1011 06:37:07.298300  5398 solver.cpp:258]     Train net output #0: loss_bbox = 2.05376 (* 1 = 2.05376 loss)
I1011 06:37:07.298319  5398 solver.cpp:258]     Train net output #1: loss_cls = 0.476741 (* 1 = 0.476741 loss)
I1011 06:37:07.298336  5398 solver.cpp:571] Iteration 112300, lr = 1e-16
I1011 06:37:32.898839  5398 solver.cpp:242] Iteration 112400, loss = 1.79959
I1011 06:37:32.898924  5398 solver.cpp:258]     Train net output #0: loss_bbox = 0.0928969 (* 1 = 0.0928969 loss)
I1011 06:37:32.898941  5398 solver.cpp:258]     Train net output #1: loss_cls = 1.70669 (* 1 = 1.70669 loss)
I1011 06:37:32.898955  5398 solver.cpp:571] Iteration 112400, lr = 1e-16
I1011 06:37:58.258740  5398 solver.cpp:242] Iteration 112500, loss = 0.0715626
I1011 06:37:58.258813  5398 solver.cpp:258]     Train net output #0: loss_bbox = 0.0594433 (* 1 = 0.0594433 loss)
I1011 06:37:58.258831  5398 solver.cpp:258]     Train net output #1: loss_cls = 0.0121193 (* 1 = 0.0121193 loss)
I1011 06:37:58.258846  5398 solver.cpp:571] Iteration 112500, lr = 1e-16
I1011 06:38:26.558539  5398 solver.cpp:242] Iteration 112600, loss = 0.562871
I1011 06:38:26.558622  5398 solver.cpp:258]     Train net output #0: loss_bbox = 0.562777 (* 1 = 0.562777 loss)
I1011 06:38:26.558637  5398 solver.cpp:258]     Train net output #1: loss_cls = 9.33708e-05 (* 1 = 9.33708e-05 loss)
I1011 06:38:26.558652  5398 solver.cpp:571] Iteration 112600, lr = 1e-16
I1011 06:38:52.470046  5398 solver.cpp:242] Iteration 112700, loss = 1.65518
I1011 06:38:52.470135  5398 solver.cpp:258]     Train net output #0: loss_bbox = 1.07787 (* 1 = 1.07787 loss)
I1011 06:38:52.470152  5398 solver.cpp:258]     Train net output #1: loss_cls = 0.577311 (* 1 = 0.577311 loss)
I1011 06:38:52.470170  5398 solver.cpp:571] Iteration 112700, lr = 1e-16
I1011 06:39:17.642257  5398 solver.cpp:242] Iteration 112800, loss = 0.535988
I1011 06:39:17.642333  5398 solver.cpp:258]     Train net output #0: loss_bbox = 0.365876 (* 1 = 0.365876 loss)
I1011 06:39:17.642346  5398 solver.cpp:258]     Train net output #1: loss_cls = 0.170112 (* 1 = 0.170112 loss)

Python error: corrupted size vs. prev_size... when loading the caffemodel

hi!
The issue I met is that when loading the layer conv3_1, the terminal captured the following error:

......
I0606 15:25:45.454169 15772 net.cpp:477] pool2 <- conv2_2
I0606 15:25:45.454175 15772 net.cpp:433] pool2 -> pool2
I0606 15:25:45.454303 15772 net.cpp:155] Setting up pool2
I0606 15:25:45.454313 15772 net.cpp:163] Top shape: 1 128 56 56 (401408)
I0606 15:25:45.454316 15772 layer_factory.hpp:76] Creating layer conv3_1
I0606 15:25:45.454322 15772 net.cpp:110] Creating Layer conv3_1
I0606 15:25:45.454326 15772 net.cpp:477] conv3_1 <- pool2
I0606 15:25:45.454331 15772 net.cpp:433] conv3_1 -> conv3_1
I0606 15:25:45.454519 15772 net.cpp:155] Setting up conv3_1
I0606 15:25:45.454526 15772 net.cpp:163] Top shape: 1 256 56 56 (802816)
*** Error in `python': corrupted size vs. prev_size: 0x00007f28395179e0 ***

Wishing for your answer! Thank you very much!

Do I need OpenCV to run and make?

I had the following error. Is this because I don't have OpenCV?

src/caffe/layers/cudnn_conv_layer.cu(125): error: argument of type "float *" is incompatible with parameter of type "size_t"
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(125): error: too few arguments in function call
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(140): error: argument of type "const void *" is incompatible with parameter of type "cudnnConvolutionBwdDataAlgo_t"
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(140): error: argument of type "float *" is incompatible with parameter of type "size_t"
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(140): error: too few arguments in function call
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=float]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(81): error: argument of type "cudnnAddMode_t" is incompatible with parameter of type "const void *"
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double
]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(81): error: argument of type "const void *" is incompatible with parameter of type "cudnnTensorDescriptor_t"
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double
]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(81): error: argument of type "const void *" is incompatible with parameter of type "cudnnTensorDescriptor_t"
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double
]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(81): error: too many arguments in function call
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double
]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(125): error: argument of type "const void *" is incompatible with parameter of type "cudnnConvolutionBwdFilterAlgo_t"
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(125): error: argument of type "double *" is incompatible with parameter of type "size_t"
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(125): error: too few arguments in function call
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(140): error: argument of type "const void *" is incompatible with parameter of type "cudnnConvolutionBwdDataAlgo_t"
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(140): error: argument of type "double *" is incompatible with parameter of type "size_t"
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]" 
(157): here

src/caffe/layers/cudnn_conv_layer.cu(140): error: too few arguments in function call
          detected during instantiation of "void caffe::CuDNNConvolutionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *,
 std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]" 
(157): here

20 errors detected in the compilation of "/tmp/tmpxft_0000426b_00000000-16_cudnn_conv_layer.compute_50.cpp1.ii".

hi

Can you tell me explicitly how to solve this? Thanks.
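These errors all come from src/caffe/layers/cudnn_conv_layer.cu, which suggests a cuDNN API mismatch (the bundled Caffe predates the installed cuDNN version) rather than a missing OpenCV. A commonly suggested workaround, offered here as an assumption rather than a verified fix for this fork, is to build without cuDNN:

    # In Makefile.config, comment out the cuDNN switch, then rebuild:
    # USE_CUDNN := 1

    make clean
    make -j8 && make pycaffe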

Some confusion about RstarCNN

Hi,
I want to know: in train.prototxt (under models/VGG16_RstarCNN/), what is the meaning of "bbox_targets" and "bbox_loss_weights" in the first data layer?
Thanks
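Not confirmed against this repo, but in the fast-rcnn lineage these two blobs have a standard layout: bbox_targets holds the per-class box-regression targets and bbox_loss_weights masks which entries contribute to the SmoothL1 loss. A minimal sketch of that layout, assuming the fast-rcnn convention:

    # Sketch (assumed fast-rcnn convention): for a ROI with ground-truth
    # class k, only the 4 slots belonging to class k are filled and weighted.
    import numpy as np

    num_classes = 11
    bbox_targets = np.zeros((1, 4 * num_classes), dtype=np.float32)
    bbox_loss_weights = np.zeros_like(bbox_targets)

    k = 3                                                # example class id
    bbox_targets[0, 4*k:4*k+4] = [0.1, -0.2, 0.05, 0.0]  # (dx, dy, dw, dh)
    bbox_loss_weights[0, 4*k:4*k+4] = 1.0                # only these enter the loss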

Got lower AP using same model?

Recently, I ran the test code using your Caffe models on two datasets: the PASCAL VOC 2012 validation set and the Stanford40 test set for action recognition.
The APs I got on both datasets are lower than the APs reported in your paper, Contextual Action Recognition with R*CNN.
I used the same images, the same selective search ROIs, the ground-truth ROIs and the Caffe models you provide.
Are the models you provide on GitHub the same as the models you used in the paper?
What AP do you get using the publicly provided models? :)

MIL layer

Hi @gkioxari, I'm trying to rewrite the code in PyTorch, but I'm having difficulties understanding the MIL layer.

What is the intuition behind it?

As far as I understand it: get the class scores from the context_fc7 output (4096 -> 11), then, over the n secondary ROIs, take the per-class maximum, e.g.

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
=
[10, 9, 8, 7, 6, 6, 7, 8, 9, 10]
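Since the question is about a PyTorch port: the example above is exactly a per-class max over the secondary-region scores. A sketch (not the authors' code) of the fusion for one primary region:

    # Sketch: MIL-style fusion in PyTorch. sec_scores is (num_secondary_rois,
    # num_classes) from the context stream; pri_scores is (num_classes,).
    import torch

    pri_scores = torch.randn(11)
    sec_scores = torch.randn(10, 11)

    mil_scores, best_rois = sec_scores.max(dim=0)  # strongest secondary per class
    final_scores = pri_scores + mil_scores         # summed scores feed the softmax

The intuition from the paper is that, for each class, the model is free to pick whichever secondary region provides the strongest contextual evidence; in the backward pass the max routes the gradient only to that chosen region.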

Facing issues with compiling and running the code

Recently I've been trying to compile and run your code.

I encountered several problems while experimenting with multiple Ubuntu (14.04 and 16.04) and CUDA versions (7, 7.5 and 8). I also tried to run the code with two different graphics cards (a GeForce GTX 960 with 4 GB of memory and a GTX 1080 with 12 GB of memory).

Although I managed to compile the project successfully at some point, I encountered the following runtime error using the GTX 1080 on Ubuntu 14.04 with CUDA 7:

F1103 12:03:05.402940 10457 syncedmem.hpp:19] Check failed: error == cudaSuccess (30 vs. 0) unknown error

Since I suspect that the main issue here is compatibility with other library versions, could you please share the details of the specific system setup you used to develop this code?

Got error while using test.py

While I was trying to run test.py, I got the following error:

Traceback (most recent call last):
  File "test.py", line 15, in <module>
    from fast_rcnn.config import cfg, get_output_dir
ImportError: No module named fast_rcnn.config

Is this only my problem?

Also, it would be really helpful to have a description of how to visualize the output on images.
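Probably not only your problem: assuming this repo follows the fast-rcnn layout, the scripts under tools/ work because they first import an _init_paths module that puts lib/ on sys.path; running lib/fast_rcnn/test.py directly skips that step. A hedged sketch of the same trick:

    # Sketch: make the fast_rcnn package importable before anything else,
    # mirroring the _init_paths idiom assumed from fast-rcnn.
    import os.path as osp
    import sys

    lib_path = osp.join(osp.dirname(__file__), '..', 'lib')  # adjust to your layout
    sys.path.insert(0, lib_path)

    from fast_rcnn.config import cfg, get_output_dir          # now resolves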

Can the primary regions in an image be automatically generated?

Hello, I found that the primary regions you use in the paper Contextual Action Recognition with R*CNN come from annotated data. Is there a method that can automatically generate the primary regions in an input image? If not, how can we recognize the actions in an input image without annotation information?

About the reference model

Hi, dear gkioxari, your network setting is based on AlexNet; however, you used VGG16 as your reference model. Have you tried fine-tuning an AlexNet caffemodel?

In building RRC caffe, 'make all' error

user@BLTSP03119:~/Documents/rrc_detection$ make all
PROTOC src/caffe/proto/caffe.proto
CXX .build_release/src/caffe/proto/caffe.pb.cc
CXX src/caffe/syncedmem.cpp
CXX src/caffe/data_reader.cpp
CXX src/caffe/layer_factory.cpp
CXX src/caffe/blob.cpp
CXX src/caffe/solver.cpp
CXX src/caffe/internal_thread.cpp
CXX src/caffe/net.cpp
CXX src/caffe/util/blocking_queue.cpp
CXX src/caffe/util/upgrade_proto.cpp
CXX src/caffe/util/hdf5.cpp
CXX src/caffe/util/db_leveldb.cpp
CXX src/caffe/util/cudnn.cpp
CXX src/caffe/util/insert_splits.cpp
CXX src/caffe/util/im_transforms.cpp
CXX src/caffe/util/io.cpp
CXX src/caffe/util/db_lmdb.cpp
CXX src/caffe/util/im2col.cpp
CXX src/caffe/util/bbox_util.cpp
CXX src/caffe/util/db.cpp
CXX src/caffe/util/math_functions.cpp
CXX src/caffe/util/sampler.cpp
CXX src/caffe/util/benchmark.cpp
CXX src/caffe/util/signal_handler.cpp
CXX src/caffe/common.cpp
CXX src/caffe/layers/prelu_layer.cpp
CXX src/caffe/layers/cudnn_conv_layer.cpp
CXX src/caffe/layers/accuracy_layer.cpp
CXX src/caffe/layers/hdf5_data_layer.cpp
CXX src/caffe/layers/power_layer.cpp
CXX src/caffe/layers/reduction_layer.cpp
CXX src/caffe/layers/softmax_loss_layer.cpp
CXX src/caffe/layers/multibox_loss_layer.cpp
In file included from ./include/caffe/common.hpp:19:0,
from ./include/caffe/blob.hpp:8,
from ./include/caffe/layers/multibox_loss_layer.hpp:8,
from src/caffe/layers/multibox_loss_layer.cpp:6:
./include/caffe/util/device_alternate.hpp:15:36: error: no ‘void caffe::MultiBoxLossLayer<Dtype>::Forward_gpu(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<caffe::Blob<Dtype>*>&)’ member function declared in class ‘caffe::MultiBoxLossLayer<Dtype>’
const vector<Blob<Dtype>*>& top) { NO_GPU; }
^
src/caffe/layers/multibox_loss_layer.cpp:574:1: note: in expansion of macro ‘STUB_GPU’
STUB_GPU(MultiBoxLossLayer);
^
./include/caffe/util/device_alternate.hpp:19:39: error: no ‘void caffe::MultiBoxLossLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<bool>&, const std::vector<caffe::Blob<Dtype>*>&)’ member function declared in class ‘caffe::MultiBoxLossLayer<Dtype>’
const vector<Blob<Dtype>*>& bottom) { NO_GPU; }
^
src/caffe/layers/multibox_loss_layer.cpp:574:1: note: in expansion of macro ‘STUB_GPU’
STUB_GPU(MultiBoxLossLayer);
^
make: *** [.build_release/src/caffe/layers/multibox_loss_layer.o] Error 1


I am building in CPU mode: I set CPU_ONLY = 1 and USE_OPENCV = 1 in Makefile.config.

libprotobuf ERROR google/protobuf/text_format.cc:245

When I test an R*CNN classifier, I meet with a problem:
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 401:21: Message type "caffe.LayerParameter" has no field named "roi_pooling_param".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1113 15:08:06.049188 13991 upgrade_proto.cpp:928] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: models/VGG16_RstarCNN/test.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)

I tested with the MNIST dataset and no error occurred.
I don't have any clue how to solve this problem.

Is the source code for the MPII dataset available?

I read the paper Contextual Action Recognition with R*CNN and ran the test code on the PASCAL VOC and Stanford40 datasets, but I cannot find the code to run and test on MPII.
Are the test and train code for the MPII dataset available? :)

Question about the 'background' class in PASCAL VOC

@gkioxari
In this line, it ignores index 0. I think that if we use pascal_voc_2012 for training, the class list should be ('jumping', 'phoning', 'playinginstrument', 'reading', 'ridingbike', 'ridinghorse', 'running', 'takingphoto', 'usingcomputer', 'walking', 'other').
So num_classes in this line is 11. If it ignores index 0, that means ignoring the 'jumping' class when applying mean subtraction and stds division. I think column 0 of the variable targets holds the class id: 0 for jumping, 1 for phoning, ..., 10 for other. Does 0 also mean background? I find that both background and jumping use 0 for the class index in column 0 of targets. Is that right?
I find that this line creates a zero array, so column 0 of targets is 0 either when the ground-truth label is 0 or when the IoU is below the threshold; after the method returns, we cannot tell which one the 0 means. Should I change it from

targets = np.zeros((rois.shape[0], 5), dtype=np.float32)

to

targets = np.zeros((rois.shape[0], 5), dtype=np.float32)
targets[:, 0] = -1

Am I missing something important?

Would you give some detailed explanations? Thank you very much~

test image

Thanks for sharing your work.
Could you tell me, after running train_net and test_net, how to draw figures like "Top predictions on the PASCAL VOC Action test set"?
Thank you.
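Not the authors' plotting code, but a minimal matplotlib sketch of such a figure, assuming you already have the image path and a top-scoring detection as (x1, y1, x2, y2) with a class name and score (all values below are hypothetical):

    # Sketch: draw one top prediction on an image, in the style of the
    # paper's qualitative figures.
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches
    import matplotlib.image as mpimg

    im = mpimg.imread('example.jpg')           # hypothetical image
    x1, y1, x2, y2 = 48, 30, 220, 380          # hypothetical box
    cls, score = 'ridinghorse', 0.92           # hypothetical output

    fig, ax = plt.subplots()
    ax.imshow(im)
    ax.add_patch(patches.Rectangle((x1, y1), x2 - x1, y2 - y1,
                                   fill=False, edgecolor='red', linewidth=2))
    ax.text(x1, y1 - 5, '%s: %.2f' % (cls, score),
            color='white', backgroundcolor='red', fontsize=10)
    ax.axis('off')
    plt.show()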

Iteration and test problem

Dear gkioxari,
Thanks for your great work.
However, when I try, I find that 'Iteration' in the 'solver.prototxt' file doesn't take effect, and neither does it in 'train.py'.
Another problem:
./tools/test_net.py --gpu 0 --def models/VGG16_RstarCNN/test.prototxt --net output/default/voc_2012_trainval/vgg16_fast_rstarcnn_joint_iter_40000.caffemodel
IOError: [Errno 2] No such file or directory: '/home/user1/RstarCNN/tools/../lib/datasets/../../data/VOCdevkit2012/results/VOC2012/Action/comp10_action_val_jumping.txt'
Actually, there is no 'results' subdirectory in 'VOCdevkit2012' (the same problem as in the IOError issue above; the directory-creation sketch there should apply here too).
Could you help me? Thank you very much.
