
jamie725 / multimodal-object-detection-via-probabilistic-ensembling

132 stars · 2 watchers · 19 forks · 2.23 MB

License: Apache License 2.0

Python 89.17% C++ 3.35% Cuda 6.22% Shell 0.61% Dockerfile 0.09% Cython 0.56%

multimodal-object-detection-via-probabilistic-ensembling's Introduction

Multimodal Object Detection via Probabilistic Ensembling

ECCV 2022 Oral presentation

[project page] [code] [video demo] [paper] [models] [results]

The results of ProbEn are released! (KAIST / FLIR)

Authors: Yi-Ting Chen*, Jinghao Shi*, Zelin Ye*, Christoph Mertz, Deva Ramanan#, Shu Kong#


For installation, please check INSTALL.md.

Usage

We provide training, testing, and visualization code for thermal-only, early-fusion, middle-fusion, and Bayesian-fusion detectors. To switch between fusion methods, change the corresponding setting in the code.

Training:

python demo/FLIR/demo_train_FLIR.py

Test mAP:

python demo/FLIR/demo_mAP_FLIR.py

Visualize predicted boxes:

python demo/FLIR/demo_draw_FLIR.py    

Probabilistic Ensembling:

First, save the predictions from the different models using demo_FLIR_save_predictions.py:

# Example thermal only
python demo/FLIR/demo_FLIR_save_predictions.py --dataset_path /home/jamie/Desktop/Datasets/FLIR/val --fusion_method thermal_only --model_path trained_models/FLIR/models/thermal_only/out_model_thermal_only.pth

# Example early fusion
python demo/FLIR/demo_FLIR_save_predictions.py --dataset_path /home/jamie/Desktop/Datasets/FLIR/val --fusion_method early_fusion --model_path trained_models/FLIR/models/early_fusion/out_model_early_fusion.pth

# Example middle fusion
python demo/FLIR/demo_FLIR_save_predictions.py --dataset_path /home/jamie/Desktop/Datasets/FLIR/val --fusion_method middle_fusion --model_path trained_models/FLIR/models/middle_fusion/out_model_middle_fusion.pth

Then, load the saved predictions and fuse them with demo_probEn.py:

python demo/FLIR/demo_probEn.py --dataset_path /home/jamie/Desktop/Datasets/FLIR/val --prediction_path out/  --score_fusion max --box_fusion argmax
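
The --score_fusion flag selects how the matched detections' confidences are combined (max in the example above). The paper's ProbEn combines them with Bayesian fusion: under conditional independence, the fused posterior is the product of the per-detector posteriors divided by the prior, which for a uniform foreground/background prior reduces to an odds product. A minimal single-class sketch of that computation (our illustration, not the repo's exact implementation):

```python
import numpy as np

def bayesian_score_fusion(scores):
    """Fuse the confidence scores of one matched detection cluster.

    p(y | x_1..x_n) is proportional to prod_i p(y | x_i) / p(y)^(n-1);
    with a uniform prior over {foreground, background} this reduces to
    the odds product below.
    """
    scores = np.asarray(scores, dtype=np.float64)
    pos = np.prod(scores)         # evidence for the foreground class
    neg = np.prod(1.0 - scores)   # evidence for the background class
    return pos / (pos + neg)

# Two detectors that each report 0.8 reinforce each other:
print(bayesian_score_fusion([0.8, 0.8]))  # ~0.941
```

Note that agreeing detectors push the fused score above either input, which is what lets ProbEn outperform simply taking the max.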

For more example usage, please check the run.sh file.

If you find our model/method/dataset useful, please cite our work (arxiv manuscript):

@inproceedings{chen2022multimodal,
  title={Multimodal object detection via probabilistic ensembling},
  author={Chen, Yi-Ting and Shi, Jinghao and Ye, Zelin and Mertz, Christoph and Ramanan, Deva and Kong, Shu},
  booktitle={Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part IX},
  pages={139--158},
  year={2022},
  organization={Springer}
}

multimodal-object-detection-via-probabilistic-ensembling's People

Contributors

jamie725

multimodal-object-detection-via-probabilistic-ensembling's Issues

could you release the FLIR annotations you use in your paper?


Questions about mAP scores on the FLIR Dataset

📚 Documentation Issue

Thank you for your great work. Still, there are some points I am concerned about, especially on the FLIR dataset.

  1. It is mentioned next to Table 4 that 'our ProbEn increases AP from prior art 74.6% to 84.4%!'. However, the data in the tables indicate that the performance of ProbEn on FLIR is 83.76.
  2. It is mentioned that, on FLIR, 'Compared to the single-modal detector (Thermal), our learning-based early-fusion (EarlyFusion) and mid-fusion (MidFusion) produce better performance.' In Table 3, however, Early Fusion has 78.8 mAP while Thermal has 79.24 mAP. In Table 4, Early Fusion has higher mAP on each of the three categories yet lower mAP on 'all', which is confusing.
  3. With due respect, I'd like to point out that methods like CFR and GAFF are trained and tested on the FLIR_align dataset provided by the CFR paper rather than on the original FLIR. Although the original FLIR may be a harder dataset, it is not suitable to take the mAP scores of CFR and GAFF for direct comparison with ProbEn.

How is the KAIST sanitized train-set defined?

Looking at your paper and the papers it refers to, it is hard to understand how the KAIST sanitized train-set is defined. We created this script to check that we would get the 7,601 images expected for the sanitized KAIST train-set, but we obtained 9,148 images. Is there something we are missing?

Quote from your paper: “Because the original KAIST dataset contains noisy annotations, the literature introduces a cleaned version of the train/test sets: a sanitized train-set (7,601 examples) [35] and a cleaned test-set (2,252 examples) [38]. We also follow the literature [29] to evaluate under the “reasonable setting” for evaluation by ignoring annotated persons that are occluded (tagged by KAIST) or too small (<55 pixels). We follow this literature for fair comparison with recent methods.”

The data that we are using to check comes from https://soonminhwang.github.io/rgbt-ped-detection/.

Any feedback would be appreciated.

Training on custom dataset

Thanks for the excellent work.

In the paper, you say the RGB and thermal detectors must be trained with the same annotations, so the two detectors are trained on a paired dataset. How can we fuse detectors that were trained on different datasets? We only have self-collected thermal images without RGB counterparts.

For paired RGB and thermal cameras, are there any restrictions on FOV overlap or resolution?
Thanks.

Assertion Error in detectron2

In the detectron2 I've installed, there is assert self.input_format in ["RGB", "BGR"], self.input_format in detectron2/engine/defaults.py, which leads to a traceback. I'm wondering whether I installed the wrong version? Thanks.

Thermal model weights

🚀 Feature

Can you share a link to the weights of the model trained on thermal images ("/Bayesian_release/good_model/thermal_only/out_model_iter_15000.pth")?

Motivation & Examples

To reuse the trained model for other applications

How to use demo_probEn.py for the KAIST dataset?

Thank you for your hard work and contributions! I have successfully reproduced the results in Table 1 of your paper using your pre-trained models KAIST_rgb_only.pth and KAIST_thermal_only.pth.
However, when using demo/FLIR/demo_probEn.py to fuse the results of RGB and Thermal with ProbEn and s-avg, I obtained slightly different results:

MR_all = 9.48, MR_day = 11.34 , MR_night = 5.72

But your paper reports:

MR_all = 8.67, MR_day = 10.27, MR_night = 5.41

The only modification I made was that I used bayesian_fusion instead of bayesian_fusion_multiclass, and I passed match_score instead of match_prob, like:
final_score = bayesian_fusion(np.asarray(match_score))

Did I miss something? Since I set num_classes to 1 for KAIST, match_prob should be equivalent to match_score.
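
(A quick sanity check of that equivalence claim: with a single foreground class, fusing per-detector [foreground, background] posteriors by a multiclass product gives the same foreground score as the binary odds-product fusion. The helpers below are hypothetical stand-ins, not the repo's bayesian_fusion / bayesian_fusion_multiclass:)

```python
import numpy as np

def fuse_binary(scores):
    """Odds-product fusion of foreground scores (uniform prior assumed)."""
    pos = np.prod(scores)
    neg = np.prod(1.0 - np.asarray(scores))
    return pos / (pos + neg)

def fuse_multiclass(probs):
    """Product-of-posteriors fusion; probs has shape (n_detectors, n_classes)."""
    fused = np.prod(probs, axis=0)
    return fused / fused.sum()

scores = [0.9, 0.7]
probs = np.array([[0.9, 0.1], [0.7, 0.3]])  # rows: [foreground, background]
print(fuse_binary(scores))        # 0.9545...
print(fuse_multiclass(probs)[0])  # 0.9545... (identical)
```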

When will you release the codes to run on KAIST?

Hi, I am very interested in your work. I ran the current code on KAIST but cannot reach the LAMR reported in your paper. When will you release the code to run on KAIST? Could you share the trained KAIST model?

Has anyone trained on the aligned FLIR dataset? My results seem strange and I would like to discuss. Thanks!


demo_train_middle_fusion ERROR

1. When running convert_PIL_to_numpy(image, format), I get ValueError: conversion from L to BGRTTT not supported.
2. At trainer.model.backbone_2.weight = param_backbone_2, the model has no attribute backbone_2.
I have installed Detectron2 following your instructions; can you tell me what's wrong? Thank you.
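
(For context: "BGRTTT" appears to denote a six-channel input, i.e. the BGR frame concatenated with the thermal channel tiled three times, which PIL cannot produce from a mode-"L" image via a direct convert call. A sketch of how such a tensor could be assembled manually, assuming that channel layout; this is an illustration, not the repo's actual code path:)

```python
import numpy as np
from PIL import Image

def make_bgrttt(rgb_path, thermal_path):
    """Stack BGR + tiled thermal into an H x W x 6 array (assumed layout)."""
    bgr = np.asarray(Image.open(rgb_path).convert("RGB"))[:, :, ::-1]  # RGB -> BGR
    thermal = np.asarray(Image.open(thermal_path).convert("L"))        # 8-bit mode "L"
    ttt = np.repeat(thermal[:, :, None], 3, axis=2)                    # tile thermal 3x
    return np.concatenate([bgr, ttt], axis=2)                          # H x W x 6
```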

Significant performance differences between code reproduction and data shown in paper

Thanks for your great work; it helped me gain a deeper understanding of RGB-T fusion. But when reproducing this code I ran into some problems, and I would be grateful for your help. The details are as follows:
Based on demo_train_FLIR.py, when reproducing thermal_only the performance is similar to the paper. But when I switch the method to early_fusion or middle_fusion, there is a significant gap between the performance I obtain from training and testing and the performance given in the paper. What could lead to this situation, and what adjustments should be made when reproducing the code?
I only adjusted the "method" parameter; everything else is as provided by the project code.

Could you please provide the predicted files

@Jamie725 Thanks for your nice work.

Could you please provide the prediction files that contain the bounding boxes and scores of the two detectors? Perhaps these are the JSON files in demo_bayesian_fusion.py? If you provided them, this project would be very clean and easy to follow, since we would not need to install the detectors (e.g., Detectron2).

Where is the core code of ProbEn, such as Eq. (4) and Eq. (8)? Is it bayesian_prior_wt_score_box in demo_bayesian_fusion.py?

Thanks again for your wonderful paper!

README provided error!

GETTING_STARTED.md in the root directory provides incorrect test and training instructions. It can only run the built-in detectron2 detector and cannot perform fusion detection.

ERROR : No module named evalKAIST

First, thank you for releasing this code. I am interested in it and trying to get it running. When I run python demo/KAIST/demo_train_KAIST.py, I get the error ModuleNotFoundError: No module named 'evalKAIST'.
Could you help me fix it?

pretrained model require

Hello! I want to run save_predictions.py and demo_bayesian_fusion.py, but I couldn't find the pretrained weights out_model_iter_42000.pth. Where can I get them? Thanks.

code error!

When I run demo_train_middle_fusion.py, the input has 3 channels, which leads to the error 'The size of tensor a (3) must match the size of tensor b (6) at non-singleton dimension 0'.
Maybe there is something wrong with the code.

Cannot run demo/FLIR/demo_train_FLIR.py or any other demo scripts for training/eval (detectron2 installation issue)

❓ How to use this repo to reproduce the results (demo/train)

1. I have used the installation here, but I'm running into:
```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[1], line 2
      1 # import some common detectron2 utilities
----> 2 from detectron2.engine import DefaultPredictor, DefaultTrainer
      3 from detectron2.config import get_cfg
      4 from detectron2.data import DatasetCatalog, MetadataCatalog

File ~/orsted_ai/Multimodal-Object-Detection-via-Probabilistic-Ensembling/detectron2/engine/__init__.py:11
      6 __all__ = [k for k in globals().keys() if not k.startswith("_")]
      9 # prefer to let hooks and defaults live in separate namespaces (therefore not in __all__)
     10 # but still make them available here
---> 11 from .hooks import *
     12 from .defaults import *

File ~/orsted_ai/Multimodal-Object-Detection-via-Probabilistic-Ensembling/detectron2/engine/hooks.py:18
     15 from fvcore.nn.precise_bn import get_bn_modules, update_bn_stats
     17 import detectron2.utils.comm as comm
---> 18 from detectron2.evaluation.testing import flatten_results_dict
     19 from detectron2.utils.events import EventStorage, EventWriter
     21 from .train_loop import HookBase

File ~/orsted_ai/Multimodal-Object-Detection-via-Probabilistic-Ensembling/detectron2/evaluation/__init__.py:2
      1 # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
----> 2 from .cityscapes_evaluation import CityscapesEvaluator
      3 from .coco_evaluation import COCOEvaluator
      4 from .FLIR_evaluation import FLIREvaluator

File ~/orsted_ai/Multimodal-Object-Detection-via-Probabilistic-Ensembling/detectron2/evaluation/cityscapes_evaluation.py:10
      7 import torch
      8 from PIL import Image
---> 10 from detectron2.data import MetadataCatalog
     11 from detectron2.utils import comm
     13 from .evaluator import DatasetEvaluator

File ~/orsted_ai/Multimodal-Object-Detection-via-Probabilistic-Ensembling/detectron2/data/__init__.py:2
      1 # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
----> 2 from . import transforms  # isort:skip
      4 from .build import (
      5     build_detection_test_loader,
      6     build_detection_train_loader,
    (...)
      9     print_instances_class_histogram,
     10 )
     11 from .catalog import DatasetCatalog, MetadataCatalog

File ~/orsted_ai/Multimodal-Object-Detection-via-Probabilistic-Ensembling/detectron2/data/transforms/__init__.py:2
      1 # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
----> 2 from .transform import *
      3 from fvcore.transforms.transform import *
      4 from .transform_gen import *

File ~/orsted_ai/Multimodal-Object-Detection-via-Probabilistic-Ensembling/detectron2/data/transforms/transform.py:14
      9 import pdb
     11 __all__ = ["ExtentTransform", "ResizeTransform"]
---> 14 class ExtentTransform(Transform):
     15     """
     16     Extracts a subregion from the source image and scales it to the output size.
    (...)
     21     See: https://pillow.readthedocs.io/en/latest/PIL.html#PIL.ImageTransform.ExtentTransform
     22     """
     24     def __init__(self, src_rect, output_size, interp=Image.LINEAR, fill=0):

File ~/orsted_ai/Multimodal-Object-Detection-via-Probabilistic-Ensembling/detectron2/data/transforms/transform.py:24, in ExtentTransform()
     14 class ExtentTransform(Transform):
     15     """
     16     Extracts a subregion from the source image and scales it to the output size.
    (...)
     22     """
---> 24     def __init__(self, src_rect, output_size, interp=Image.LINEAR, fill=0):
     25         """
     26         Args:
     27             src_rect (x0, y0, x1, y1): src coordinates
    (...)
     30             fill: Fill color used when src_rect extends outside image
     31         """
     32         super().__init__()

AttributeError: module 'PIL.Image' has no attribute 'LINEAR'
```
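
(The failure itself is a Pillow version issue: Pillow 10 removed the long-deprecated Image.LINEAR alias that this detectron2 fork still references. Pinning Pillow below 10 should avoid it; alternatively, a quick compatibility shim, offered as a workaround sketch rather than an official fix:)

```python
# Restore the Image.LINEAR alias removed in Pillow 10 (it was BILINEAR).
import PIL.Image
if not hasattr(PIL.Image, "LINEAR"):
    PIL.Image.LINEAR = PIL.Image.Resampling.BILINEAR
```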

Are there any updated installation instructions?

Thanks!
HT

How to use pre-trained models for training on new datasets

@Jamie725 Thanks for your work. I am trying to run training on my own dataset for comparison. I would like to know which pre-trained model I should use to fine-tune the detector on my dataset: Detectron2_pretrained_model/model_final_f6e8b1.pkl or KAIST_xxxx_xxx.pth?

Thanks,
Li-Yun

demo

Where is the demo.py? Thanks!


Some questions about the code

Hi, while reading detectron2\modeling\meta_arch\rcnn.py, I noticed that the code extracts features from the thermal and visible images with the same backbone. Why not use backbone for one modality and backbone2 for the other?
One more question: I could not find the cfg.MODEL.PROPOSAL_GENERATOR.NAME parameter, which is referenced in detectron2\modeling\proposal_generator\build.py, anywhere in configs\Base-RCNN-FPN.yaml.
