CoSOD3K (CVPR2020)

'Taking a Deeper Look at Co-Salient Object Detection'


Abstract

Co-salient object detection (CoSOD) is a newly emerging and rapidly growing branch of salient object detection (SOD) that aims to detect the co-occurring salient objects in multiple images. However, existing CoSOD datasets often have a serious data bias: they assume that each group of images contains salient objects of similar visual appearance. This bias leads to idealized settings, so models trained on existing datasets may be less effective in real-life situations, where the similarity is usually semantic or conceptual. To tackle this issue, we first collect a new high-quality dataset, named CoSOD3k, which contains 3,316 images divided into 160 groups with multi-level annotations, i.e., category, bounding box, object, and instance levels. CoSOD3k makes a significant leap in terms of diversity, difficulty, and scalability, benefiting related vision tasks. In addition, we comprehensively summarize 34 cutting-edge algorithms, benchmark 19 of them over four existing CoSOD datasets (MSRC, iCoSeg, Image Pair, and CoSal2015) and our CoSOD3k, with a total of ∼61K images (the largest scale), and report a group-level performance analysis. Finally, we discuss the challenges and future work of CoSOD. We hope our study gives a strong boost to growth in the CoSOD community.

CoSOD Dataset Comparison

Figure 1: Different salient object detection (SOD) tasks. (a) Traditional SOD [78]. (b) Within image co-salient object detection (CoSOD) [93], where common salient objects are detected from a single image. (c) Existing CoSOD, where salient objects are detected according to a pair [52] or a group [85] of images with similar appearances. (d) The proposed CoSOD in the wild, which requires a large amount of semantic context, making it more challenging than existing CoSOD.

Figure 2: The 160 Objects from our CoSOD3k.

Statistics

Table 1: Statistics for size and number of instances/objects in existing datasets. '-' indicates that the dataset contains only object-level annotations, so the number of instances is one.

Downloads

Year	Publisher	Paper	#Image	Download Link1	Download Link2
2005	ICCV	MSRC	233	Baidu Pan: 8r27	Google (4.17M)
2010	CVPR	iCoSeg	643	Baidu Pan: e1mz	Google (67M)
2011	TIP	Image Pair	105	Baidu Pan: fmqj	Google (0.98M)
2016	IJCV/CVPR	CoSal2015	2015	Baidu Pan: kpvv	Google (96.1M)
2018	AAAI	WICOS	364	Baidu Pan: b5qg	Google (10.7M)
2020	ECCV	CoCA	1295	Baidu Pan: ckzt	Google (96M)
2020	CVPR	CoSOD3k	3316	Baidu Pan: 65as	Google (418M) + Google (411M)
Overall				Baidu Pan: 6mvn	Google (1.4G)
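
Once an archive is extracted, a quick sanity check is to count groups and images and compare against the numbers in the table above. The following is a minimal sketch, assuming a hypothetical layout in which each group is a sub-folder of an image root (for example `CoSOD3k/image/<group>/<name>.jpg`); the actual folder names inside the archives may differ, and `collect_groups` and `summarize` are illustrative helpers, not part of any released toolkit.

```python
from collections import defaultdict
from pathlib import Path

# File extensions treated as images (an assumption; adjust to the archive contents).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def collect_groups(image_root):
    """Map each group (sub-folder) name to its sorted list of image paths."""
    groups = defaultdict(list)
    for path in sorted(Path(image_root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            # The immediate parent folder is assumed to be the group name.
            groups[path.parent.name].append(path)
    return dict(groups)


def summarize(groups):
    """Return (number of groups, total number of images)."""
    return len(groups), sum(len(paths) for paths in groups.values())
```

If the layout assumption holds, `summarize(collect_groups("CoSOD3k/image"))` should report 160 groups and 3,316 images for CoSOD3k, matching the figures stated in the abstract.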

SOTA Models

News

Model	Pub.	Year	#Training	Training set	Main Component	SL.	Sp.	Po.	Ed.	Post.
WPL^T	UIST	2010			Morphological, Translational Alignment	U
PCSD^T	ICIP	2010	120,000	8*8 image patch	Sparse feature, Filter Bank	W
IPCS^T	TIP	2011			Ncut, co-multilayer Graph	U	√
CBCS^T	TIP	2013			Contrast/Spatial/Corresponding Cue	U
MI^T	TMM	2013			Feature/Images Pyramid, Multi-scale Voting	U	√			GCut
CSHS^T	SPL	2013			Hierarchical Segmentation, Contour Map	U			√
ESMG^T	SPL	2014			Efficient Manifold Ranking [84], OTSU	U
BR^T	MM	2014			Common/Center Cue, Global Correspondence	U	√
SACS^T	TIP	2014			Self-adaptive Weight, Low Rank Matrix	U	√
DIM	TNNLS	2015	1,000+9,963	ASD+PV	SDAE model, Contrast/Object Prior	S	√
CODW	IJCV	2016		ImageNet pre-train	SermaNet, RBM, IMC, IGS, IGC	W	√	√
SP-MIL	TPAMI	2017	(240+643)*10%	MSRC-V1+iCoseg	SPL [97], SVM, GIST [69], CNNs	W	√
GD	IJCAI	2017	9,213	MSCOCO	VGGNet16 [68], Group-wise Feature	S
MVSRCC	TIP	2017			LBP, SIFT [61], CH, Bipartite Graph		√	√
UMLF	TCSVT	2017	(240+2015)*50%	MSRC-V1+CoSal2015	SVM, OMR [86], metric learning	S	√
DML	BMVC	2018	10,000+6,232+5,168	M10K+THUR-15K [11]+DO	CAE, HSR, Multistage	S
DWSI	AAAI	2018			EdgeBox [106], Low-rank Matrix, CH	S		√
GONet	ECCV	2018		ImageNet pre-train	ResNet-50 [28], Graphical Optimization	W	√			CRF
COC	IJCAI	2018		ImageNet pre-train	ResNet-50 [28], Co-attention Loss	W		√		CRF
FASS	MM	2018		ImageNet pre-train	DHS [56]/VGGNet, Graph Optimization	W	√
PJO^T	TIP	2018			Energy Minimization, BoWs	U	√
SPIG	TIP	2018	10,000+210+2,015+240	M10K+IPCS+CoSal2015+MSRC-V	DeepLab, Graph Representation	S	√
QGF	TMM	2018		ImageNet pre-train	Dense Correspondence, Quality Measure	S	√			THR
EHL	NC	2019	643	iCoseg	GoogLeNet, FSM	S	√
IML	NC	2019	3,624	CoSal2015+PV+CR	VGGNet16	S	√
DGFC	TIP	2019	>200,000	MSCOCO [55]	VGGNet16, Group-wise Feature	S	√
RCANet	IJCAI	2019	>200,000	MSCOCO+COS+iCoseg+CoSal2015+MSRC	VGGNet16, Recurrent Units	S				THR
GS	AAAI	2019	200,000	COCO-SEG	VGGNet19, Co-category Classification	S
MGCNet	ICME	2019			Graph Convolutional Networks	S	√
MGLCN	MM	2019	N/A	N/A	VGGNet16, PiCANet, Inter-/Intra-graph	S	√
HC	MM	2019	N/A	N/A	VAE-Net, Hierarchical Consistency	S	√	√		CRF
CSMG	CVPR	2019	25,00	MB	VGGNet16, Shared Superpixel Feature	S	√
DeepCO3	CVPR	2019	10,000	M10K	SVFSal/VGGNet, Co-peak Search	W		√
GWD	ICCV	2019	>200,000	MSCOCO	VGGNet19, RNN, Group-wise Loss	S				THR
GCAGC	CVPR	2020	>200,000	COCO-SEG	Graph Model	S
GICD	ECCV	2020	8,250	DUTS_class	Gradient	S
ICNet	NeurIPS	2020	9,213	COCO-9k	External SOD Model	S
CoADNet	NeurIPS	2020	>200,000	DUTS_class+COCO-SEG	Group Mining	S
CoSformer	arXiv	2021	>200,000	DUTS_class+COCO-SEG	Transformer	S
CoEG-Net	TPAMI	2021	8,250	DUTS_class	PCA	S
DeepACG	CVPR	2021	>200,000	COCO-SEG	Gromov-Wasserstein Distance	S
GCoNet	CVPR	2021	8,250	DUTS_class	Group Collaborative Learning	S
CADC	ICCV	2021	8,250+9,213	DUTS_class+COCO-9k	Dynamic Convolution	S
DCFM	CVPR	2022	9,213	COCO-9k	Prototype, self-contrastive learning	S
UFO	arXiv	2022	>200,000	COCO-SEG	Transformer	S
GCoNet+	arXiv	2022	>200,000	DUTS_class, COCO-9k, COCO-SEG	Inter-group Learning, Metric Learning	S

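The ^T suffix (e.g., WPL^T) marks a traditional method rather than a deep-learning method. SL. = supervision level (W = weakly supervised, S = supervised, U = unsupervised). Sp. = superpixel techniques are used. Po. = proposal algorithms are used. Ed. = edge features are explicitly used. Post. = post-processing method, such as CRF, GraphCut (GCut), or thresholding (THR).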

Results

SOTA:

Refer to the CoSOD task on Papers with Code.

Predicted Maps

Model	Baidu Pan	Google Drive
CBCS	Baidu-Disk (gtse)	Google-Drive
CODR	Baidu-Disk (qfks)	Google-Drive
CPD	Baidu-Disk (jxkk)	Google-Drive
CSHS	Baidu-Disk (wda4)	Google-Drive
CSMG	Baidu-Disk (gwm6)	Google-Drive
DIM	Baidu-Disk (2hgk)	Google-Drive
EGNet	Baidu-Disk (tkna)	Google-Drive
ESMG	Baidu-Disk (hxqb)	Google-Drive
IML	Baidu-Disk (7m1c)	Google-Drive
UMLF	Baidu-Disk (eqpw)	Google-Drive
GCAGC	Baidu-Disk (ij29)	Google-Drive
GICD	Baidu-Disk (puji)	Google-Drive
ICNet	Baidu-Disk (xwcv)	Google-Drive
CoADNet	Baidu-Disk (MVPL)
CoEG-Net	Baidu-Disk (f4p3)	Google-Drive
GCoNet		Google-Drive
CADC	Baidu-Disk (i59u)	Google-Drive
DCFM		Google-Drive
UFO		Google-Drive
GCoNet+		Google-Drive
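
The downloaded prediction maps can be scored against the ground-truth masks group by group, in the spirit of the group-level performance analysis mentioned in the abstract. The sketch below uses mean absolute error (MAE), a metric commonly used in salient object detection; this README does not state which metrics the benchmark uses, so the choice of MAE is an assumption. Maps are represented as nested lists with values already scaled to [0, 1] to keep the example dependency-free, and `mae` and `group_level_mae` are illustrative helper names.

```python
from statistics import mean


def mae(pred, gt):
    """Mean absolute error between two equally sized 2-D maps with values in [0, 1]."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    diffs = [abs(p - g) for prow, grow in zip(pred, gt) for p, g in zip(prow, grow)]
    return mean(diffs)


def group_level_mae(samples):
    """Average the per-image MAE within each group.

    samples: iterable of (group_name, pred_map, gt_map) tuples.
    Returns a dict mapping each group name to its mean MAE.
    """
    per_group = {}
    for group, pred, gt in samples:
        per_group.setdefault(group, []).append(mae(pred, gt))
    return {group: mean(scores) for group, scores in per_group.items()}
```

In practice the maps would be read from the image files and normalized to [0, 1] before being passed in; sorting the resulting dictionary by value shows which groups a model finds hardest.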

Qualitative Results

Figure 3: Qualitative examples of existing top-10 models on CoSOD3k.

Citation

If you find this useful, please cite the following work:

@inproceedings{fan2020taking,   
  title={Taking a Deeper Look at Co-Salient Object Detection},
  author={Fan, Deng-Ping and Lin, Zheng and Ji, Ge-Peng and Zhang, Dingwen and Fu, Huazhu and Cheng, Ming-Ming},   
  booktitle={IEEE CVPR}, 
  year={2020} 
} 

@article{fan2022re,
  title={Re-thinking co-salient object detection},
  author={Fan, Deng-Ping and Li, Tengpeng and Lin, Zheng and Ji, Ge-Peng and Zhang, Dingwen and Cheng, Ming-Ming and Fu, Huazhu and Shen, Jianbing},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={8},
  pages={4339-4354},
  year={2022},
  publisher={IEEE}
}
