jshtok / repmet
Few-shot detection for visual categories
License: Apache License 2.0
Hi @jshtok,
Would you mind showing us how to obtain test results for PASCAL VOC? In your instructions for creating the benchmark, we found that you only describe testing on the 214 classes of ImageNet-LOC with the end-to-end model, because voc_inloc_roidb.pkl contains only images from ImageNet-LOC and none from PASCAL VOC. Do I need to create an equivalent file for PASCAL? If you have one, could you upload the version you used for your paper to your GitHub?
Dear jshtok,
These days I am working on reproducing the detection results. I am puzzled: I can find where episodes are generated during testing, but I cannot find episode generation during end-to-end training. Could you please point out the episode-generation code used during training, or confirm that there is no episodic training? If the latter, could you describe the training policy? Best thanks!
The Google Drive link to download the checkpoint doesn't work anymore. Could you fix the link? Or, even better, save the checkpoint here with Git LFS?
Hi, @jshtok,
In the paper the foreground probability and background probability are computed as follows,
In the code, however, you seem to output a different probability during training, by applying a softmax activation after scaling the above probability by cfg.network.SOFTMAX_MUL:
RepMet/fpn/symbols/resnet_v1_101_fpn_dcn_rcnn_oneshot_v3.py
Lines 1159 to 1160 in d5b13e0
RepMet/fpn/symbols/resnet_v1_101_fpn_dcn_rcnn_oneshot_v3.py
Lines 1253 to 1261 in d5b13e0
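For reference, the two quantities being compared can be sketched in a few lines. This is an illustrative sketch, not the repo's code: the per-class foreground probability follows the paper's max-over-representatives form, and SOFTMAX_MUL, the shapes, and sigma are stand-in assumptions.

```python
import numpy as np

SOFTMAX_MUL = 10.0  # stand-in for cfg.network.SOFTMAX_MUL

def fg_probs(embedding, reps, sigma=0.5):
    """Max over each class's representatives of exp(-||e - r||^2 / (2 sigma^2))."""
    d2 = ((reps - embedding) ** 2).sum(axis=-1)          # (classes, reps_per_class)
    return np.exp(-d2 / (2.0 * sigma ** 2)).max(axis=1)  # (classes,)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

embedding = np.zeros(4)
reps = np.random.RandomState(0).randn(3, 2, 4)  # 3 classes, 2 reps each, dim 4
p = fg_probs(embedding, reps)       # paper-style probabilities, each in (0, 1]
scores = softmax(SOFTMAX_MUL * p)   # code-style: softmax over scaled scores
```

Note the difference: the paper's p values are independent per class and need not sum to one, while the softmax output is a normalized distribution over classes.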
Hi @jshtok, thanks for sharing your work!
One question really confuses me:
How can the ROI proposal stage still work well when shifted to objects from new categories in the few-shot learning phase?
In standard Faster R-CNN, the ROI proposal layer generates proposals by classifying foreground vs. background; when a new category appears (with high visual difference), I would expect it to be classified as background. And in your code, the ROI proposal layer seems to be pretty much the same as in standard Faster R-CNN (maybe you can tell me where the difference is).
In the paper, you start directly from the pooled feature vector, as in the picture below.
So how can the network generate good proposals for new categories in few-shot learning without fine-tuning?
thanks!
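For context on the question above: one reason proposals can transfer to new categories is that the RPN's objectness head is binary (foreground vs. background) and shared across all categories. A toy sketch of such a class-agnostic head, with all names and shapes purely illustrative:

```python
import numpy as np

def rpn_objectness(anchor_feats, w, b):
    """Binary fore/background softmax per anchor: the head never sees class
    labels, so its notion of 'objectness' is category-agnostic by construction."""
    logits = anchor_feats @ w + b                          # (anchors, 2)
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)
    return probs[:, 1]                                     # P(foreground) per anchor

rng = np.random.RandomState(0)
obj_scores = rpn_objectness(rng.randn(5, 8), rng.randn(8, 2), rng.randn(2))
```

Whether this generalizes in practice still depends on how visually similar the novel objects are to the base-class training data, which is exactly what the question raises.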
Hi, @jshtok:
Thanks for your code. I have a problem.
When I input: python fpn/few_shot_benchmark.py --test_name=RepMet_inloc --Nshot=1 --Nway=5 --Nquery_cat=10 --Nepisodes=500 --do_finetune=1 --num_finetune_epochs=5 --lr=5e-4
It reports an error:
I found this happens when balance_classes() is done in loader.py.
Could you explain this to me? I am grateful.
Thanks for your excellent work. I am very interested in it, and I have some questions about the detection task.
Unlike the classification task, which has only one embedded feature vector, the detection task generates many ROIs. In the paper, you said you replace the representatives R in the DML subnet with the embedding vectors computed from the ROIs. For one-shot learning, is the representative R of a category generated by averaging the ROI features in that category's support set?
Another question: there is only one instance per image in the ImageNet-LOC dataset, yet Figure 1 shows more than one instance. I wonder how the sub-net classifies many instances of several categories in a single image, since there are so many ROIs of different categories after the RPN. Is the model also trained in the N-way k-shot episodic way? If so, and one image contains more than one category, how do you choose the support set and query set?
Hope to hear from you soon!
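One plausible reading of the k-shot representative construction asked about above can be stated in a couple of lines. This is a hedged sketch; the function name and shapes are illustrative, not taken from the repo:

```python
import numpy as np

def class_representative(support_embeddings):
    """Representative R of a novel class = mean of the DML embeddings of its
    support ROIs (one plausible reading of the paper's description)."""
    return np.asarray(support_embeddings).mean(axis=0)

support = [[1.0, 2.0], [3.0, 4.0]]   # two 2-D support ROI embeddings
rep = class_representative(support)  # -> array([2.0, 3.0])
```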
Hi, jshtok
I tried to train your model, and here is a small problem I found:
Traceback (most recent call last):
File "./experiments/fpn_end2end_train_test.py", line 31, in
train_end2end.main()
File "./experiments/../fpn/train_end2end.py", line 292, in main
config.TRAIN.begin_epoch, config.TRAIN.end_epoch, config.TRAIN.lr, config.TRAIN.lr_step)
File "./experiments/../fpn/train_end2end.py", line 120, in train_net
num_ex_per_class = num_ex_per_class)
File "./experiments/../fpn/../lib/utils/load_data.py", line 10, in load_gt_roidb
per_category_epoch_max=per_category_epoch_max,classes_list_fname=classes_list_fname,num_ex_per_class=num_ex_per_class)
File "./experiments/../fpn/../lib/dataset/imagenet.py", line 58, in init
synsets = sio.loadmat(os.path.join(self.devkit_path, 'data', 'meta_'+base_modifier+'.mat'))
File "/home/cx/.conda/envs/RepMet/lib/python2.7/site-packages/scipy/io/matlab/mio.py", line 207, in loadmat
MR, file_opened = mat_reader_factory(file_name, appendmat, **kwargs)
File "/home/cx/.conda/envs/RepMet/lib/python2.7/site-packages/scipy/io/matlab/mio.py", line 62, in mat_reader_factory
byte_stream, file_opened = _open_file(file_name, appendmat)
File "/home/cx/.conda/envs/RepMet/lib/python2.7/site-packages/scipy/io/matlab/mio.py", line 37, in _open_file
return open(file_like, 'rb'), True
IOError: [Errno 2] No such file or directory: './data/imagenet/ILSVRC/devkit/data/meta_clsloc.mat'
I am looking forward to your reply.
Hi @jshtok, I find there is only testing code in your repository. Could you please add instructions on how to train your model? Thanks.
Hi, @jshtok, @leokarlin
I noticed that in the CVPR paper you mentioned the pretrained model used for few-shot detection fine-tuning on ImageNet was pretrained on COCO. I am curious how you trained a DCN-FPN detection model from scratch without GN or SyncBN. Could you please be so kind as to upload your log from training the DCN-FPN model on COCO from scratch? Thanks.
Hi @jshtok,
I am sorry for bothering you again. I noticed that the backbone for few-shot detection in RepMet was trained on COCO, so I was wondering what I need to do to train RepMet and evaluate it on COCO. Could you show me a way to do so? And do I need to train on COCO and ImageNet-LOC together, to leverage knowledge from ImageNet-LOC as you did for PASCAL VOC, or should I train RepMet on COCO alone, without combining it with ImageNet-LOC?
Thank you very much!
Sincerely,
Duynn
Hi, @jshtok ,
When I tried to reproduce the VOC experiment, I used the new 'resnet_v1_101_voc0712_trainval_fpn_dcn_oneshot_end2end_ohem_8_orig.yaml' file, changed the dataset to VOC, and ran the 'python experiment/fpn_test.py' command. I find that some files are missing: I think '/data/cache/voc_val_partial_gt_roidb.pkl' and 'VOCdevkit/VOCval/ImageSets/Main/partial.txt' are missing, and I don't know whether other files may be missing as well. Can you provide them and help me solve this problem? Thanks very much!
My email is [email protected]. Hope to hear from you soon.
Best regards,
Yukuan Yang
Hi @jshtok,
I am sorry for bothering you again.
I want to get the embedding features (the output of the DML embedding module) for specified objects. Following your code, I see that after RepMet runs inference on an image, it outputs 2000 proposals and the 2000 embedding features corresponding to those proposals. My question is where in your code I can feed RepMet the ground-truth region so that I get the embedding features for that region instead of for the 2000 proposal regions.
I hope to hear from you soon! Thank you very much!
Best regards,
Duynn
Hi, @jshtok
I find in the paper that there are only a cross-entropy loss and an embedding loss for few-shot detection (for the classification branch),
but in your code there is another loss called ADDITIONAL_LINEAR_CLS_LOSS.
RepMet/fpn/symbols/resnet_v1_101_fpn_dcn_rcnn_oneshot_v3.py
Lines 1167 to 1168 in d5b13e0
RepMet/fpn/symbols/resnet_v1_101_fpn_dcn_rcnn_oneshot_v3.py
Lines 1205 to 1206 in d5b13e0
Dear authors,
Thank you for your open-source code; I am very interested in your work. However, could you please share how much memory is needed to train your model to get the results in your CVPR paper, and how many GPUs you used for training, especially for COCO? I have had problems training models following your instructions; by the way, my GPU has 12 GB.
Dear @jshtok,
Would you mind sharing your setup for training on a custom dataset? I replaced Pascal VOC with my own dataset, following the setup for Pascal and ImageNet-LOC. During training, I get a value error like this one:
Do you know why this is the case? Could you please help me out? By the way, in the yaml file there is a line named "max_num_extra_classes"; must the value of this parameter be identical to NUM_CLASSES or not?
Thank you so much!
Hi Shtok, sorry to bother you again.
When I run the 5-way 1-shot code, the experimental results are close to what the paper reports. However, when I run the code with the command 'python fpn/few_shot_benchmark.py --test_name=RepMet_inloc --Nshot=5 --Nway=5 --Nquery_cat=10 --Nepisodes=500' (5-way 5-shot), the AP is much lower (48.3%) than the paper reports (68.8%). I wonder why this happens. Can you help me solve this problem? Thanks very much!
Sincerely,
Yukuan Yang
Hi Dr. Shtok, thanks for your work! Would you mind adding the missing disp_dets, or explaining what it does?
Hi Joseph,
I'm a bit confused about the experimental protocol for the ImageNet-based benchmark used in your work and would be very grateful for clarification.
Specifically, when detecting a class (in some episode) do you detect this class on the query images corresponding only to the same class (option 1) or on all query images of the episode (option 2)?
The RepMet paper (the second paragraph of Section 5.2) makes me believe in option 2. The file data/Imagenet_LOC/episodes/epi_inloc_in_domain_1_5_10_500.pkl (1-shot 5-way setting) that you kindly provided supports the same conclusion, as all the query images of the episode are glued together into one list.
However, the code seems to support option 1.
The function test_model from few_shot_benchmark seems to detect a class only on the images corresponding to the same category, query_images[cat]:
RepMet/fpn/few_shot_benchmark.py
Lines 668 to 693 in 9bdc3f2
The same seems to hold for output/RepMet_inloc_1shot_5way_10qpc_500epi_episodes.npz, which has all the query images separated by category.
Could you please clarify which protocol was used to obtain the results of the paper?
Best,
Anton
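To make the two protocols from the question above concrete, here is a minimal sketch. The detector, the episode structure, and all names are hypothetical, not taken from the repo:

```python
def evaluate_option1(detect, query_images):
    """Option 1: each class is detected only on its own query images."""
    return {cat: [detect(img, cat) for img in imgs]
            for cat, imgs in query_images.items()}

def evaluate_option2(detect, query_images):
    """Option 2: each class is detected on all query images of the episode."""
    pooled = [img for imgs in query_images.values() for img in imgs]
    return {cat: [detect(img, cat) for img in pooled] for cat in query_images}

# Dummy detector: just records which (image, class) pairs were evaluated.
detect = lambda img, cat: (img, cat)
queries = {'cat_a': ['img1'], 'cat_b': ['img2', 'img3']}
r1 = evaluate_option1(detect, queries)  # 'cat_a' evaluated on 1 image
r2 = evaluate_option2(detect, queries)  # 'cat_a' evaluated on all 3 images
```

The distinction matters for the reported AP: option 2 exposes each class to cross-category false positives, while option 1 does not.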
Hi jshtok,
Line 272 reads info_str = imdb.evaluate_detections(detections=all_boxes, logger=logger), and line 345 reads info_str = imdb.evaluate_detections(all_boxes, logger), but the function evaluate_detections (in imagenet.py or pascal_voc.py) accepts only one parameter (detections), while two are given.
This seems to trigger errors when running fpn/test.py!
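If it helps, one minimal way to reconcile the two call sites is an optional logger parameter. A sketch only: the class name and body are hypothetical stand-ins for the imdb classes, with the method name taken from the report above:

```python
class ImdbSketch(object):
    """Stand-in for the imdb classes in imagenet.py / pascal_voc.py."""

    def evaluate_detections(self, detections, logger=None):
        # Accepting an optional logger makes both call sites valid:
        #   imdb.evaluate_detections(detections=all_boxes, logger=logger)
        #   imdb.evaluate_detections(all_boxes, logger)
        if logger is not None:
            logger.info('evaluating %d class lists' % len(detections))
        return 'mAP info placeholder'  # real code would return the eval summary

imdb = ImdbSketch()
out1 = imdb.evaluate_detections(detections=[[], []], logger=None)
out2 = imdb.evaluate_detections([[], []], None)
```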
Hi @jshtok, thanks for the great paper. I just read it and there is something I did not quite understand.
So, basically, when training starts, the representatives of each class are just the random output of a fully connected layer, and over time they are learned in such a way that the representative closest to the input belongs to the correct class.
This method, paired with the two losses, ensures that the DML module learns to map examples of the same class closer together.
When testing on unseen classes, the representatives are thrown away and, in their place, we use the embeddings (outputs of the DML module) of a few examples of each class.
Did I understand that correctly?
My question is: why didn't you use actual examples during training? Couldn't learning the representatives and the distance metric together hurt performance on unseen examples?
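As I read it, the test-time swap described in the question can be sketched as follows. All names are illustrative, and the embedding function is the identity here purely to keep the toy example runnable:

```python
import numpy as np

def build_episode_reps(dml_embed, support):
    """Discard the learned representatives; build each novel class's
    representative from the DML embeddings of its support examples (their mean)."""
    return {cat: np.mean([dml_embed(x) for x in xs], axis=0)
            for cat, xs in support.items()}

def nearest_rep_class(dml_embed, reps, query):
    """Classify by distance to the closest class representative."""
    e = dml_embed(query)
    return min(reps, key=lambda cat: np.linalg.norm(e - reps[cat]))

identity = np.asarray  # toy stand-in for the DML embedding module
reps = build_episode_reps(identity, {'a': [[0.0, 0.0]], 'b': [[10.0, 10.0]]})
pred = nearest_rep_class(identity, reps, [1.0, 1.0])  # -> 'a'
```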
not available from https://ibm.box.com/s/g7xcm0hkyec2rd4hixfbthke8lunsyrv?
Hello, I am looking forward to your code. When will you release it?
I read through a few of the closed and open issues and I am observing an issue similar to #9.
Setup
I am trying to work through the examples listed in the README with the ImageNet data (I followed the link to download), set up the paths accordingly, and have not changed anything aside from renaming the file paths in the pickle files downloaded from Google Drive (voc_inloc_roidb.pkl and voc_inloc_gt_roidb_.pkl). Aside from renaming paths, I am using the versions of few_shot_benchmark.py and the config file that are currently on the master branch.
Questions
I am using the config file /experiments/cfgs/resnet_v1_101_voc0712_trainval_fpn_dcn_oneshot_end2end_ohem_8.yaml. Is balance_classes supposed to be set to false when fine-tuning with episodic data (it is set to true in the config)?
Issue
I am encountering the out-of-index error that was observed in #9 and am confused by the discussion on that thread. Here is what I have tried to run:
python fpn/few_shot_benchmark.py \
--test_name=RepMet_inloc \
--Nshot=1 --Nway=5 --Nquery_cat=10 --Nepisodes=500 \
--do_finetune=1 --num_finetune_epochs=5 --lr=5e-4
Namely, I am not sure I understand this comment that @jshtok made (or whether it is relevant to the solution):
You mentioned (in the email version of this ticket) you have the setting
cfg.dataset.NUM_CLASSES=127
but in the .yaml configuration file it is set
dataset:
NUM_CLASSES: 122
so the NUM_CLASSES should be 122. I don't expect this error to happen if the NUM_CLASSES is correct, please check if this is the case.
NUM_CLASSES is changed from 122 (from the YAML file) to 127 when add_reps_to_model is called, and new_cats_to_beginning is hard-coded to False, so unless these parameters are intended to be set to different values, it does not surprise me that NUM_CLASSES = 127 (122 + Nway) here.
RepMet/fpn/few_shot_benchmark.py
Lines 661 to 663 in 9bdc3f2
Here is what I have noticed when trying to debug this issue:
balance_classes is set to True in the configuration yaml file. We enter the balance_classes method in the PyramidAnchorIterator class. This ends up excluding all the examples within my first batch, so self.size becomes 0 (Lines 268 to 309 in 9bdc3f2).
Since self.size is now 0, self.cur_to is also 0, resulting in a length-0 slice of the roidb (Lines 424 to 426 in 9bdc3f2).
With an empty roidb, index 0 is invalid, which results in the error raised in #9 (Lines 443 to 444 in 9bdc3f2).
The balance_classes parameter seems redundant, so I have also tried setting it to False, which avoids the index issue... but other issues arise:
NaN values in _filter_boxes !
Error in CustomOp.forward: Traceback (most recent call last):
File "/home/user/.local/lib/python2.7/site-packages/mxnet/operator.py", line 789, in forward_entry
aux=tensors[4])
File "fpn/operator_py/proposal_target.py", line 57, in forward
assert np.all(all_rois[:, 0] == 0), 'Only single item batches are supported'
AssertionError: Only single item batches are supported
(I emailed @jshtok briefly about this a few weeks ago, as I was seeing similar issues when trying to fine-tune on my own data. I did not resolve the issue and figured I'd try to get this running on ImageNet first, and I am running into similar issues.)
Any ideas on how to fix this?
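Not a fix for the issue above, but a defensive check of the kind that would make this failure mode obvious early. The helper is hypothetical, not from the repo:

```python
def check_roidb_not_empty(roidb, balance_classes):
    """Fail fast if class balancing (or any filter) removed every example,
    instead of letting an index-out-of-range surface later in the iterator."""
    if len(roidb) == 0:
        raise ValueError(
            'roidb is empty after filtering (balance_classes=%s); '
            'check NUM_CLASSES and the class list in the config' % balance_classes)
    return True

ok = check_roidb_not_empty([{'image': 'img1'}], balance_classes=True)
```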
Hello jshtok,
When I try to reproduce the work, the model seems to learn well, but the test results are not as good as those reported in the paper. So I want to know whether you have a training log file. I also want to know whether I should change some parameters, such as lr, when using multiple GPUs.
Hope to hear from you soon!
Sincerely,
Yukuan Yang
Hi @jshtok,
As described in your paper, RepMet only provides a way to test on novel classes via episodes; after each episode, new novel classes replace the old ones. I therefore wonder whether there is any way to test RepMet on benchmarks such as PASCAL VOC or COCO as regular detection over both base and novel classes, because I have two problems:
(1) Assume I train RepMet on PASCAL VOC with 15 base classes and the remaining 5 as novel classes, or on COCO with the 20 classes overlapping PASCAL as novel classes and the rest as base classes. The problem is how to test RepMet on each image of the test set: in ImageNet-LOC there is only one category (with one or more instances) per image, but in COCO or PASCAL an image may contain many categories, both base and novel.
(2) Could you please show me how to reproduce the result on the base classes of ImageNet-LOC in Table 4 (I have tried, but mAP was just 0%), and the results on the 100 classes of ImageNet-LOC with no episode fine-tuning in Table 3?
I look forward to hearing from you. Hope you can help me out!
Best regards,
Duynn
Hi Dr. Shtok, would you mind giving some pointers on what these packages are? Much appreciated.
Hi @jshtok, thanks for sharing the project! A quick question about other applications: for the logo-detection use case, would you mind sharing the specs of your dataset (how many annotated images across the 206 classes?) and the corresponding results? Do you know what kinds of applications/datasets are best suited for RepMet, and which are not? Thanks so much!
Dear @jshtok ,
It seems your code only prints the accuracy to the console. Would you mind supporting saving detection results, such as pickle files, the way Detectron or SNIPER (which is also MXNet-based) do?
And how can I get mAP on Pascal VOC and COCO?
Thank you so much!
Dear Joseph, thanks for sharing the code of your interesting work. I'm trying to get results on my own dataset. After preparing the dataset in your format, I got the following error:
Box #0 - no overlapping detection boxes found
Box #1 - no overlapping detection boxes found
Box #2 - no overlapping detection boxes found
....
I guess this might be because my dataset (aerial images) is so different from the Pascal VOC dataset that I need to retrain the network. I checked the files you provided and found one named 'fpn_pascal_imagenet-0015.params'. Are these the parameters of the pre-trained FPN-DCN network? If I understand correctly, in my case I need to retrain the FPN-DCN model to produce a new parameter file like this one; please help me confirm this. Also, would you please provide the code for generating these parameters?
Thanks for your attention.
How do I test on a single image?
Hi @jshtok,
According to a training issue, I found you mentioned that RepMet uses 200 samples per class for the first 101 classes of ImageNet-LOC and the 20 classes of PASCAL, and that there is a parameter in config.py to control this. Is that right? My question is whether these 200 samples per class are a fixed set of images chosen from the full ImageNet-LOC and PASCAL pools and reused in every epoch, or whether 200 samples are picked randomly from the full pool in each epoch.
Hope to hear your answer soon!
Best regards,
Duynn
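The two sampling policies asked about above can be stated as code. This is illustrative only; the repo's actual behavior is exactly what the question tries to pin down, and the function name is made up:

```python
import random

def sample_per_class(pool, n=200, fixed=True, seed=0):
    """fixed=True: the same n images every epoch (a deterministic subset).
    fixed=False: a fresh random draw of n images each epoch."""
    rng = random.Random(seed) if fixed else random.Random()
    return rng.sample(pool, min(n, len(pool)))

pool = list(range(1000))           # toy stand-in for a class's image list
epoch1 = sample_per_class(pool, n=5, fixed=True)
epoch2 = sample_per_class(pool, n=5, fixed=True)  # identical to epoch1
```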