
jshtok / repmet

Few-shot detection for visual categories

License: Apache License 2.0

Python 8.41% Makefile 0.01% C 0.06% Cuda 1.94% C++ 0.01% Jupyter Notebook 89.47% Cython 0.11%

repmet's People

Contributors: josephs-cvar, jshtok

repmet's Issues

PASCAL VOC result?

Hi @jshtok,

Would you mind showing us how you obtained the test results for PASCAL VOC? In your instructions for creating the benchmark, we found that you only describe testing on the 214 classes of ImageNetLOC with the end-to-end model, since voc_inloc_roidb.pkl contains images from ImageNetLOC but none from PASCAL VOC. Do I need to create the same kind of file for PASCAL? If you have one, could you upload the one you used for your paper to your GitHub?

Is episode training used during training?

Dear jshtok,
These days I am working on reproducing the detection results. I am puzzled: I can see that episode generation is done during testing, but I cannot find it in the end-to-end training. Could you please point me to the episode-generation code used during training, or is there simply no episodic training? If the latter, could you describe the training policy? Best thanks!

Drive link broken

The Google Drive link for downloading the checkpoint doesn't work anymore. Could you fix the link? Or, even better, store the checkpoint in this repository with Git LFS?

about output probability

Hi, @jshtok,

In the paper, the foreground and background probabilities are computed as follows:

[equation images not preserved; roughly, the class probability is the maximum over the per-mode scores exp(-d^2 / (2*sigma^2)), and the background probability is one minus the maximum of these scores over all classes and modes]

In the code, however, you seem to output another probability by applying a softmax activation after scaling the above probability by cfg.network.SOFTMAX_MUL during training:

if cfg.network.SOFTMAX_ENABLED:
    cls_score = mx.sym.broadcast_mul(self.get_constant_symbol(cfg.network.SOFTMAX_MUL), cls_score)

In testing, you output two sets of probabilities: the original one from the paper (cls_score_orig), and one scaled by cfg.network.SOFTMAX_MUL (cls_prob):
if cfg.network.SOFTMAX_ENABLED:
    cls_score_orig = mx.sym.broadcast_div(cls_score_orig, self.get_constant_symbol(cfg.network.SOFTMAX_MUL))
    cls_score_orig = mx.sym.Reshape(data=cls_score_orig, shape=(cfg.TEST.BATCH_IMAGES, -1, num_classes))
group = mx.sym.Group([rois,
                      cls_prob,
                      bbox_pred,
                      mx.sym.identity(batch_embed, name='psp_final_embed'),
                      mx.sym.identity(cls_score_orig, name='cls_score')])

So what probability do you use for the final output detection results? Thanks.
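(For reference, here is a minimal numpy sketch of the scaling effect I mean; it is my own illustration, not repository code, and the sigma and SOFTMAX_MUL values are made up. The constant acts like an inverse temperature: it sharpens the softmax but does not change which class has the highest score.)

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# distance-based class scores in the spirit of the paper: exp(-d^2 / (2*sigma^2))
dists = np.array([0.4, 1.2, 2.0])   # embedding-to-representative distances (made up)
sigma = 0.5                          # hypothetical sigma
cls_score = np.exp(-dists**2 / (2 * sigma**2))

SOFTMAX_MUL = 10.0                   # hypothetical scaling constant

print(softmax(cls_score))                # softmax over the raw scores
print(softmax(SOFTMAX_MUL * cls_score))  # sharper distribution, same argmax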

How does the ROI proposal work when shifting to new classes in few-shot learning?

Hi @jshtok, thanks for sharing your work!
One question really confuses me:
how can the ROI proposal layer still work well when shifting to objects from new categories in the few-shot learning phase?
In normal Faster R-CNN, the ROI proposal layer generates proposals by classifying foreground/background; when a new category arrives (with a large visual difference), I think it may be classified as background. And in your code, the ROI proposal layer seems to be much the same as in normal Faster R-CNN (maybe you can tell me where the difference is).

In the paper, you start directly from the pooled feature vector, as in the picture below.
[screenshot not preserved]

So how can the network generate good proposals for new categories in few-shot learning without fine-tuning?

thanks!

A problem with testing with fine-tuning

Hi @jshtok,
Thanks for your code. I have a problem.
When I run: python fpn/few_shot_benchmark.py --test_name=RepMet_inloc --Nshot=1 --Nway=5 --Nquery_cat=10 --Nepisodes=500 --do_finetune=1 --num_finetune_epochs=5 --lr=5e-4
it reports an error:
[error screenshot not preserved]
I traced this to the point where balance_classes() finishes in loader.py:
[screenshot not preserved]

Could you explain this to me? I'd be grateful.

How is the model trained and tested on images with many instances of different categories?

Thanks for your excellent work. I am very interested in the work and I have some questions about the detection task.
Unlike the classification task, which has only one embedded feature vector, the detection task generates many ROIs. In the paper, you said you replaced the representatives R in the DML subnet with the embedding vectors computed from the ROIs. For one-shot learning, is the representative R of a category generated by averaging the ROI features in the support set of that category?
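(If so, a minimal numpy sketch of the averaging I have in mind; this is my reading, not code from the repository:)

import numpy as np

def build_representatives(support_embeds, support_labels, categories):
    # support_embeds: (N, D) DML embeddings of the support ROIs
    # support_labels: (N,) category id of each support ROI
    # one representative per category: the mean of its support embeddings
    return {cat: support_embeds[support_labels == cat].mean(axis=0)
            for cat in categories}

# toy usage: 4 support embeddings of dimension 3, two categories
embeds = np.random.rand(4, 3)
labels = np.array([1, 1, 2, 2])
reps = build_representatives(embeds, labels, categories=[1, 2])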

Another question: there is only one instance per image in the ImageNet-LOC dataset, yet Figure 1 shows more than one instance. I wonder how the subnet classifies many instances of several categories in a single image, given that there are so many ROIs of different categories after the RPN. Is the model also trained in the N-way k-shot episodic fashion? If so, and one image contains more than one category, how are the support and query sets chosen?

Hope to hear from you soon!

Hi, jshtok

Hi jshtok,
I tried to train your model, and here is a little problem I found:

Traceback (most recent call last):
  File "./experiments/fpn_end2end_train_test.py", line 31, in <module>
    train_end2end.main()
  File "./experiments/../fpn/train_end2end.py", line 292, in main
    config.TRAIN.begin_epoch, config.TRAIN.end_epoch, config.TRAIN.lr, config.TRAIN.lr_step)
  File "./experiments/../fpn/train_end2end.py", line 120, in train_net
    num_ex_per_class = num_ex_per_class)
  File "./experiments/../fpn/../lib/utils/load_data.py", line 10, in load_gt_roidb
    per_category_epoch_max=per_category_epoch_max,classes_list_fname=classes_list_fname,num_ex_per_class=num_ex_per_class)
  File "./experiments/../fpn/../lib/dataset/imagenet.py", line 58, in __init__
    synsets = sio.loadmat(os.path.join(self.devkit_path, 'data', 'meta_'+base_modifier+'.mat'))
  File "/home/cx/.conda/envs/RepMet/lib/python2.7/site-packages/scipy/io/matlab/mio.py", line 207, in loadmat
    MR, file_opened = mat_reader_factory(file_name, appendmat, **kwargs)
  File "/home/cx/.conda/envs/RepMet/lib/python2.7/site-packages/scipy/io/matlab/mio.py", line 62, in mat_reader_factory
    byte_stream, file_opened = _open_file(file_name, appendmat)
  File "/home/cx/.conda/envs/RepMet/lib/python2.7/site-packages/scipy/io/matlab/mio.py", line 37, in _open_file
    return open(file_like, 'rb'), True
IOError: [Errno 2] No such file or directory: './data/imagenet/ILSVRC/devkit/data/meta_clsloc.mat'

I am looking forward to your reply.

training code

Hi @jshtok, I find there is only testing code in your repository. Could you please add usage instructions for training your model? Thanks.

About COCO pretrain

Hi, @jshtok, @leokarlin

I noticed that in the CVPR paper you mention that the pretrained model used for few-shot detection fine-tuning on ImageNet was pretrained on COCO. I am curious how you trained a DCN-FPN detection model from scratch without GN or SyncBN. Could you please upload your log from training the DCN-FPN model on COCO from scratch? Thanks.

How to train and evaluate RepMet on COCO dataset?

Hi @jshtok,

I am sorry for bothering you again. I noticed that the backbone for few-shot detection in RepMet was trained on COCO, so I was wondering what I need to do to train RepMet and evaluate it on COCO. Could you show me a way to do so? Do I need to train on COCO and ImageNetLOC together to leverage the knowledge from ImageNetLOC, as you did for PASCAL VOC, or should I just train RepMet on COCO without combining it with ImageNetLOC?

Thank you very much!
Sincerely,
Duynn

Are there some missing files in VOC experiment?

Hi @jshtok,
When trying to reproduce the VOC experiment, I used the new 'resnet_v1_101_voc0712_trainval_fpn_dcn_oneshot_end2end_ohem_8_orig.yaml' file, changed the dataset to VOC, and ran the 'python experiment/fpn_test.py' command. I find that some files are missing; could you provide them? I think '/data/cache/voc_val_partial_gt_roidb.pkl' and 'VOCdevkit/VOCval/ImageSets/Main/partial.txt' are missing, and I don't know whether there are others. Can you help me solve this problem? Thanks very much!
My email is [email protected]. Hope to hear from you soon.
Best regards,
Yukuan Yang

How to use RepMet as feature extractor?

Hi @jshtok,

I am sorry for bothering you again.

I want to get the embedding features (the output of the DML embedding module) for specific objects. Following your code, I see that after RepMet runs inference on an image, it outputs 2000 proposals and the 2000 corresponding embedding features. My question is where in your code I can give RepMet the ground-truth area, so that I get the embedding features for it instead of for the 2000 proposal areas.
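To frame the question, here is a hedged sketch of the general idea I am after (illustrative only, not the repository's actual entry point): build the rois array from the ground-truth boxes instead of the RPN proposals, and pool features for just those regions. MXNet's ROI format is (batch_index, x1, y1, x2, y2) in image coordinates.

import mxnet as mx
import numpy as np

gt_boxes = np.array([[48, 32, 320, 240]], dtype=np.float32)  # x1, y1, x2, y2 (made up)
rois = np.hstack([np.zeros((len(gt_boxes), 1), np.float32), gt_boxes])

feat = mx.nd.random.uniform(shape=(1, 256, 38, 50))          # stand-in feature map
pooled = mx.nd.ROIPooling(data=feat, rois=mx.nd.array(rois),
                          pooled_size=(7, 7), spatial_scale=1.0 / 16)
# 'pooled' would then go through the DML embedding layers to produce one
# embedding per ground-truth box instead of one per proposal.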

I hope to hear from you soon! Thank you very much!
Best regards,
Duynn

about loss

Hi, @jshtok

I see in the paper that there are only a cross-entropy loss and an embedding loss for few-shot detection (for the classification branch):
[equation image not preserved]
but in your code there is another loss called ADDITIONAL_LINEAR_CLS_LOSS:

if cfg.network.ADDITIONAL_LINEAR_CLS_LOSS:
    cls_score_lin = mx.symbol.FullyConnected(name='cls_score_lin', data=fc_new_2_relu, num_hidden=num_classes)

if cfg.network.ADDITIONAL_LINEAR_CLS_LOSS:
    cls_prob_lin = mx.sym.SoftmaxOutput(name='cls_prob_lin', data=cls_score_lin, label=labels_ohem, normalization='valid', use_ignore=True, ignore_label=-1)

What is this loss used for? Thanks.

How much memory for GPU to train?

Dear authors,
Thank you for your open-source code; I am very interested in your work. Could you please share how much GPU memory is needed to train your model to reproduce the results in your CVPR paper, and how many GPUs you used, especially for COCO training? I have problems training models following your instructions; my GPU has 12 GB.

How to train on one's own dataset?

Dear @jshtok,

Would you mind sharing your setup for training on one's own dataset? I replaced PASCAL VOC with my own dataset, following the setup for PASCAL and ImageNetLOC. During training, I get a ValueError like this one:
[error screenshot not preserved]
Do you know why this is the case? Could you please help me out? By the way, in the yaml file there is a line named "max_num_extra_classes"; must the value of this parameter be identical to NUM_CLASSES or not?
Thank you so much!

Is there any difference between 5-way 1-shot and 5-way 5-shot in the code?

Hi, Shtok, sorry to bother you again.

When I run the 5-way 1-shot code, the experimental results are close to what the paper reports. However, when I run the code with the command 'python fpn/few_shot_benchmark.py --test_name=RepMet_inloc --Nshot=5 --Nway=5 --Nquery_cat=10 --Nepisodes=500' (5-way 5-shot), the AP is much lower (48.3%) than what the paper reports (68.8%). I wonder why this happens. Can you help me solve this problem? Thanks very much!

Sincerely,
Yukuan Yang

What is the experimental protocol for evaluating on ImageNet-based benchmark?

Hi Joseph,
I'm a bit confused about the experimental protocol for the ImageNet-based benchmark used in your work and would be very grateful for clarification.
Specifically, when detecting a class (in some episode) do you detect this class on the query images corresponding only to the same class (option 1) or on all query images of the episode (option 2)?

The RepMet paper (the second paragraph of Section 5.2) makes me believe in option 2. The file data/Imagenet_LOC/episodes/epi_inloc_in_domain_1_5_10_500.pkl (1-shot 5-way setting) that you kindly provided supports the same conclusion, as all the query images of an episode are glued together into one list.

However, the code seems to support option 1.
Function test_model from few_shot_benchmark seems to detect a class only on the images corresponding to the same category query_images[cat]:

def test_model(perf_stats, epi_cats, query_images, cat_indices, roidb, d, sym_ext, arg_params, aux_params, epi_root, epi_num, epi_cats_names, display=0):
    for cat in epi_cats:
        if args.nqc > 0 and args.nqc < len(query_images[cat]):
            q_images = query_images[cat][0:args.nqc]  # random.sample(query_images[cat], args.nqc)
        else:
            q_images = query_images[cat]
        for nImg in q_images:
            img_fname = roidb[nImg]['image']
            gt_classes = roidb[nImg]['gt_classes']
            if display == 1:
                img_cats = [epi_cats_names[i] for i in range(args.Nway) if epi_cats[i] in gt_classes]
                q_dets = run_detection(sym_ext, arg_params, aux_params, img_fname, img_cats[0], cat_indices, epi_cats_names, epi_root, nImg, epi_num)
            else:
                q_dets = run_detection(sym_ext, arg_params, aux_params, img_fname, '', cat_indices, [], epi_root, nImg, epi_num)
            gt_boxes = np.copy(roidb[nImg]['boxes'])
            # legacy from Pascal dataset
            gt_boxes_test = []
            gt_classes_test = []
            for gt_box, gt_class in zip(gt_boxes, gt_classes):
                if gt_class in epi_cats:
                    gt_boxes_test += [gt_box]
                    gt_classes_test += [gt_class]
            gt_classes_test = np.asarray(gt_classes_test)
            gt_boxes_test = np.asarray(gt_boxes_test)
            d = perf_stats.comp_epi_stats_m(d, q_dets, gt_boxes_test, gt_classes_test, epi_cats, args.ovthresh)
    return d

The code uses another file from your package:
output/RepMet_inloc_1shot_5way_10qpc_500epi_episodes.npz, which has all the query images separated by category.

Could you please clarify which protocol was used to obtain the results of the paper?

Best,
Anton

Is there a mistake in "fpn/core/tester.py"?

Hi jshtok,

In line 272, info_str = imdb.evaluate_detections(detections=all_boxes, logger=logger), and in line 345, info_str = imdb.evaluate_detections(all_boxes, logger), the function evaluate_detections (in imagenet.py or pascal_voc.py) has only one parameter (detections), but two arguments are passed.

This seems to trigger errors when running "fpn/test.py"!
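If it helps, one minimal way to reconcile the call sites with the dataset classes would be to give evaluate_detections an optional logger keyword in imagenet.py / pascal_voc.py (a sketch of the idea, not necessarily the maintainers' intended fix):

def evaluate_detections(self, detections, logger=None):
    # accept (and optionally use) the logger that fpn/core/tester.py passes in
    if logger is not None:
        logger.info('evaluating detections for %d classes' % len(detections))
    # ... existing evaluation code unchanged ...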

Question about the learned representative

Hi @jshtok, thanks for the great paper. I just read it and there is something I did not quite understand.
So, basically, when training starts, the representatives of each class are just the random output of a fully connected layer, and over time they are learned in such a way that the one closest to the input belongs to the correct class.

This method, paired with the two losses, ensures that the DML module learns to map examples of the same class closer together.

While testing on unseen classes, the representatives are thrown away and in their place, we use the embeddings (outputs of the DML module) of some examples for each class.

Did I understand it correctly?
My question is: why didn't you use actual examples during training? Couldn't learning the representatives and the distance metric together hurt the performance on unseen examples?

finetune with few shot classes

[table image not preserved]
The right part of Tab. 3 shows the results of episode fine-tuning, and the paper says:
[quoted passage image not preserved]

Can you explain how you actually fine-tune with the episodic data? I am not sure what the last layer of the network is. Thanks.

Cannot fine-tune with ImageNet

I read through a few of the closed and open issues and I am observing an issue similar to #9.

Setup
I am trying to work through the examples listed in the README with the ImageNet data (I followed the link to download it), set up the paths accordingly, and have not changed anything aside from renaming the filepaths in the pickle files downloaded from Google Drive (voc_inloc_roidb.pkl and voc_inloc_gt_roidb_.pkl). Aside from renaming paths, I am using the version of few_shot_benchmark.py and the config file currently on the master branch.

Questions

  • What was the solution for issue #9?
  • Is the fine-tuning example intended to work with the settings currently in the /experiments/cfgs/resnet_v1_101_voc0712_trainval_fpn_dcn_oneshot_end2end_ohem_8.yaml?
  • Specifically, is balance_classes supposed to be set to false when fine-tuning with episodic data (it is set to true in the config)?

Issue
I am encountering the out-of-index error that was observed in #9 and am confused by the discussion on that thread. Here is what I have tried to run:

python fpn/few_shot_benchmark.py \
    --test_name=RepMet_inloc \
    --Nshot=1 --Nway=5 --Nquery_cat=10 --Nepisodes=500 \
    --do_finetune=1 --num_finetune_epochs=5 --lr=5e-4

Namely, I am not sure I understand this comment that @jshtok made (or if it is relevant to the solution):

You mentioned (in the email version of this ticket) you have the setting
cfg.dataset.NUM_CLASSES=127
but in the .yaml configuration file it is set

dataset:
  NUM_CLASSES: 122

so the NUM_CLASSES should be 122. I don't expect this error to happen if the NUM_CLASSES is correct, please check if this is the case.

NUM_CLASSES is changed from 122 (the YAML value) to 127 when add_reps_to_model is called, and new_cats_to_beginning is hard-coded to False, so unless these parameters are intended to be set to different values, it does not surprise me that NUM_CLASSES = 127 (122 + Nway) here.

arg_params, new_reps, num_classes = add_reps_to_model(arg_params, tot_embeds,from_start=new_cats_to_beginning)
config.dataset.NUM_CLASSES = num_classes

Here is what I have noticed when trying to debug this issue:

  • It looks like the issue comes about when balance_classes is set to True in the configuration yaml file. We enter the balance_classes method in the PyramidAnchorIterator class, which ends up excluding all the examples within my first batch, resulting in self.size being 0

    RepMet/fpn/core/loader.py

    Lines 268 to 309 in 9bdc3f2

    def balance_classes(self):
        num_ex_per_class = self.cfg.dataset.num_ex_per_class
        cnts = np.zeros((10000))
        sel_set=[]
        sel_set_cats=[]
        if config.dataset.cls_filter_files is not None:
            fls = config.dataset.cls_filter_files.split(':')
            with open(fls[0],'rb') as f:
                cls2id_map = cPickle.load(f)
            with open(fls[1]) as f:
                classes2use = [x.strip().lower() for x in f.readlines()]
                #classes2use = [x.strip() for x in f.readlines()]
            clsIds2use = set()
            for cls in classes2use:
                clsIds2use.add(cls2id_map[cls])
            self.cfg.dataset.clsIds2use = clsIds2use.copy()
            self.cfg.dataset.clsIds2use.add(0)
        for ix, cur in enumerate(self.index):
            roi = self.roidb[cur]
            cats = roi['gt_classes'] - 1 # minus 1 for excluding BG
            if config.dataset.cls_filter_files is not None:
                cats = np.array([x for x in cats if (x+1) in clsIds2use])
            # else:
            #     cats = cats[cats < (self.cfg.dataset.NUM_CLASSES-1)]
            if not cats.size:
                continue
            ix = np.argmin(cnts[cats])
            if cnts[cats[ix]] < num_ex_per_class:
                cnts[cats[ix]] += 1
            else:
                continue #not adding more examples, each epoch runs in random order of this
            sel_set.append(cur)
            sel_set_cats.append(cats)
        sel_set=np.array(sel_set)
        p = np.random.permutation(np.arange(len(sel_set)))
        sel_set = sel_set[p]
        self.index = sel_set
        self.size = len(self.index)
        print('total size {0}'.format(self.size))
    • Since self.size is now 0, cur_to is also 0, resulting in a length-0 slice of the roidb

      RepMet/fpn/core/loader.py

      Lines 424 to 426 in 9bdc3f2

      cur_from = self.cur
      cur_to = min(cur_from + self.batch_size, self.size)
      roidb = [self.roidb[self.index[i]] for i in range(cur_from, cur_to)]
  • When we get further down to try to index roidb, index 0 ends up being invalid because the list is empty, which results in the error raised in #9

    RepMet/fpn/core/loader.py

    Lines 443 to 444 in 9bdc3f2

    for idx, islice in enumerate(slices):
        iroidb = [roidb[i] for i in range(islice.start, islice.stop)]
  • Since we are fine-tuning with episodic data, the balance_classes parameter seems redundant, so I have also tried setting it to False, which avoids the index issue... but other issues arise:

NaN values in _filter_boxes !
Error in CustomOp.forward: Traceback (most recent call last):
  File "/home/user/.local/lib/python2.7/site-packages/mxnet/operator.py", line 789, in forward_entry
    aux=tensors[4])
  File "fpn/operator_py/proposal_target.py", line 57, in forward
    assert np.all(all_rois[:, 0] == 0), 'Only single item batches are supported'
AssertionError: Only single item batches are supported

(I emailed @jshtok briefly about this a few weeks ago as I was seeing similar issues when trying to fine-tune on my own data. I did not resolve the issue and figured I'd try to get this up and running on ImageNet first and am running into similar issues).

Any ideas on how to fix this?

How to test RepMet on COCO or Pascal as regular detection?

Hi @jshtok,

As in your paper, RepMet only provides a way to test on novel classes in episodes; after each episode, new novel classes replace the old ones. So I wonder whether there is any way to test RepMet on benchmarks such as PASCAL VOC or COCO as regular detection over both base and novel classes. I have two questions:
(1) Suppose I train RepMet on PASCAL VOC with 15 base classes and the remaining 5 as novel classes, or on COCO with the 20 classes that overlap PASCAL as novel classes and the rest as base classes. The problem is how to test RepMet on each image of the test set: in ImageNetLOC there is only one category (with one or more instances) per image, but in COCO or PASCAL an image may contain many categories, both base and novel.

(2) Could you please show me how to obtain the result for the base classes of ImageNetLOC in Table 4 (I have tried, but mAP came out at 0%) and the results for the 100 classes of ImageNet-LOC with no episode fine-tuning in Table 3?

I look forward to hearing from you. Hope you can help me out!

Best regards,
Duynn

application - logo dataset and result?

Hi @jshtok, thanks for sharing the project! A quick question about other applications: for the logo detection use case, do you mind sharing the specs of your dataset (how many annotated images across the 206 classes?) and the corresponding results? Do you know what kinds of applications/datasets are best suited for RepMet and which are not? Thanks so much!

Detection results?

Dear @jshtok,

It seems your code only prints the accuracy to the console. Would you mind supporting saving of the detection results, e.g., as pickle files, like Detectron or SNIPER (which is also MXNet-based)?
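For what it's worth, a minimal sketch of what I mean, assuming the tester's all_boxes follows the usual all_boxes[class_idx][image_idx] -> (N, 5) array convention of Faster R-CNN-style code (the names here are illustrative, not from the repository; Python 2, matching the repo environment):

import cPickle
import numpy as np

# stand-in for the tester's output: all_boxes[class_idx][image_idx] -> (N, 5)
all_boxes = [[np.zeros((0, 5), dtype=np.float32)]]

with open('detections.pkl', 'wb') as f:
    cPickle.dump(all_boxes, f, cPickle.HIGHEST_PROTOCOL)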

And how can I get mAP on PASCAL VOC and COCO?

Thank you so much!

Training on my own dataset and questions about fpn_pascal_imagenet-0015.params

Dear Joseph, thanks for sharing the code of your interesting work. I'm trying to get results on my own dataset. After preparing the dataset in your format, I got the following error:
Box #0 - no overlapping detection boxes found
Box #1 - no overlapping detection boxes found
Box #2 - no overlapping detection boxes found
....
I guess this might be because my dataset (aerial images) is very different from PASCAL VOC and I need to retrain the network. Among the files you provided I found one named 'fpn_pascal_imagenet-0015.params'. Does it contain the parameters of the pre-trained FPN-DCN network? If I understand correctly, in my case I need to retrain the FPN-DCN model to get a new parameter file like this one; please help me confirm this. Also, would you please provide the code for generating these parameters?

Thanks for your attention.

Testing

How do I test on a single image?

How many samples does RepMet use in training phase?

Hi @jshtok,

In a training issue, I found that you mentioned RepMet uses 200 samples per class for the first 101 classes of ImageNetLOC and the 20 classes of PASCAL, and that there is a parameter in config.py to control this. Is that right? My question is whether these 200 samples per class are fixed images chosen from the full set of ImageNetLOC and PASCAL images for every epoch, or whether 200 samples are picked randomly from the full set in each epoch.

Hope to hear your answer soon!
Best regards,
Duynn
