CoSOD3k (CVPR 2020)

'Taking a Deeper Look at Co-Salient Object Detection'

Abstract

Co-salient object detection (CoSOD) is a newly emerging and rapidly growing branch of salient object detection (SOD) that aims to detect the co-occurring salient objects in multiple images. However, existing CoSOD datasets often have a serious data bias: they assume that each group of images contains salient objects of similar visual appearance. This bias leads to idealized settings, and the effectiveness of models trained on existing datasets may be impaired in real-life situations, where the similarity is usually semantic or conceptual. To tackle this issue, we first collect a new high-quality dataset, named CoSOD3k, which contains 3,316 images divided into 160 groups with multi-level annotations, i.e., category, bounding box, object, and instance levels. CoSOD3k makes a significant leap in terms of diversity, difficulty, and scalability, benefiting related vision tasks. In addition, we comprehensively summarize 34 cutting-edge algorithms, benchmark 19 of them over four existing CoSOD datasets (MSRC, iCoSeg, Image Pair, and CoSal2015) and our CoSOD3k with a total of ~61K images (the largest scale), and report a group-level performance analysis. Finally, we discuss the challenges and future work of CoSOD. We hope our study will give a strong boost to the growth of the CoSOD community.

CoSOD Dataset Comparison


Figure 1: Different salient object detection (SOD) tasks. (a) Traditional SOD [78]. (b) Within-image co-salient object detection (CoSOD) [93], where common salient objects are detected from a single image. (c) Existing CoSOD, where salient objects are detected according to a pair [52] or a group [85] of images with similar appearances. (d) The proposed CoSOD in the wild, which requires a large amount of semantic context, making it more challenging than existing CoSOD.


Figure 2: The 160 Objects from our CoSOD3k.

Statistics


Table 1: Statistics for size and number of instances/objects in existing datasets. '-' indicates that the dataset contains only object-level annotations, so the number of instances is one.

Downloads

| Year | Venue | Dataset | #Images | Download Link 1 | Download Link 2 |
|------|-------|---------|---------|-----------------|-----------------|
| 2005 | ICCV | MSRC | 233 | Baidu Pan: 8r27 | Google (4.17M) |
| 2010 | CVPR | iCoSeg | 643 | Baidu Pan: e1mz | Google (67M) |
| 2011 | TIP | Image Pair | 105 | Baidu Pan: fmqj | Google (0.98M) |
| 2016 | IJCV/CVPR | CoSal2015 | 2015 | Baidu Pan: kpvv | Google (96.1M) |
| 2018 | AAAI | WICOS | 364 | Baidu Pan: b5qg | Google (10.7M) |
| 2020 | ECCV | CoCA | 1295 | Baidu Pan: ckzt | Google (96M) |
| 2020 | CVPR | CoSOD3k | 3316 | Baidu Pan: 65as | Google (418M) + Google (411M) |
| | | Overall | | Baidu Pan: 6mvn | Google (1.4G) |

SOTA Models

| Model | Pub. | Year | #Training | Training set | Main Component | SL. | Sp. | Po. | Ed. | Post. |
|-------|------|------|-----------|--------------|----------------|-----|-----|-----|-----|-------|
| WPL^T | UIST | 2010 | | | Morphological, Translational Alignment | U | | | | |
| PCSD^T | ICIP | 2010 | 120,000 | 8*8 image patch | Sparse feature, Filter Bank | W | | | | |
| IPCS^T | TIP | 2011 | | | Ncut, co-multilayer Graph | U | | | | |
| CBCS^T | TIP | 2013 | | | Contrast/Spatial/Corresponding Cue | U | | | | |
| MI^T | TMM | 2013 | | | Feature/Images Pyramid, Multi-scale Voting | U | | | | GCut |
| CSHS^T | SPL | 2013 | | | Hierarchical Segmentation, Contour Map | U | | | | |
| ESMG^T | SPL | 2014 | | | Efficient Manifold Ranking [84], OTSU | U | | | | |
| BR^T | MM | 2014 | | | Common/Center Cue, Global Correspondence | U | | | | |
| SACS^T | TIP | 2014 | | | Self-adaptive Weight, Low Rank Matrix | U | | | | |
| DIM | TNNLS | 2015 | 1,000+9,963 | ASD+PV | SDAE model, Contrast/Object Prior | S | | | | |
| CODW | IJCV | 2016 | | ImageNet pre-train | SermaNet, RBM, IMC, IGS, IGC | W | | | | |
| SP-MIL | TPAMI | 2017 | (240+643)*10% | MSRC-V1+iCoseg | SPL [97], SVM, GIST [69], CNNs | W | | | | |
| GD | IJCAI | 2017 | 9,213 | MSCOCO | VGGNet16 [68], Group-wise Feature | S | | | | |
| MVSRCC | TIP | 2017 | | | LBP, SIFT [61], CH, Bipartite Graph | | | | | |
| UMLF | TCSVT | 2017 | (240+2015)*50% | MSRC-V1+CoSal2015 | SVM, OMR [86], metric learning | S | | | | |
| DML | BMVC | 2018 | 10,000+6,232+5,168 | M10K+THUR-15K [11]+DO | CAE, HSR, Multistage | S | | | | |
| DWSI | AAAI | 2018 | | | EdgeBox [106], Low-rank Matrix, CH | S | | | | |
| GONet | ECCV | 2018 | | ImageNet pre-train | ResNet-50 [28], Graphical Optimization | W | | | | CRF |
| COC | IJCAI | 2018 | | ImageNet pre-train | ResNet-50 [28], Co-attention Loss | W | | | | CRF |
| FASS | MM | 2018 | | ImageNet pre-train | DHS [56]/VGGNet, Graph optimization | W | | | | |
| PJO^T | TIP | 2018 | | | Energy Minimization, BoWs | U | | | | |
| SPIG | TIP | 2018 | 10,000+210+2,015+240 | M10K+IPCS+CoSal2015+MSRC-V | DeepLab, Graph Representation | S | | | | |
| QGF | TMM | 2018 | | ImageNet pre-train | Dense Correspondence, Quality Measure | S | | | | THR |
| EHL | NC | 2019 | 643 | iCoseg | GoogLeNet, FSM | S | | | | |
| IML | NC | 2019 | 3624 | CoSal2015+PV+CR | VGGNet16 | S | | | | |
| DGFC | TIP | 2019 | >200,000 | MSCOCO [55] | VGGNet16, Group-wise Feature | S | | | | |
| RCANet | IJCAI | 2019 | >200,000 | MSCOCO+COS+iCoseg+CoSal2015+MSRC | VGGNet16, Recurrent Units | S | | | | THR |
| GS | AAAI | 2019 | 200,000 | COCO-SEG | VGGNet19, Co-category Classification | S | | | | |
| MGCNet | ICME | 2019 | | | Graph Convolutional Networks | S | | | | |
| MGLCN | MM | 2019 | N/A | N/A | VGGNet16, PiCANet, Inter-/Intra-graph | S | | | | |
| HC | MM | 2019 | N/A | N/A | VAE-Net, Hierarchical Consistency | S | | | | CRF |
| CSMG | CVPR | 2019 | 25,00 | MB | VGGNet16, Shared Superpixel Feature | S | | | | |
| DeepCO3 | CVPR | 2019 | 10,000 | M10K | SVFSal / VGGNet, Co-peak Search | W | | | | |
| GWD | ICCV | 2019 | >200,000 | MSCOCO | VGGNet19, RNN, Group-wise Loss | S | | | | THR |
| GCAGC | CVPR | 2020 | >200,000 | COCO-SEG | Graph Model | S | | | | |
| GICD | ECCV | 2020 | 8,250 | DUTS_class | Gradient | S | | | | |
| ICNet | NeurIPS | 2020 | 9,213 | COCO-9k | External SOD Model | S | | | | |
| CoADNet | NeurIPS | 2020 | >200,000 | DUTS_class+COCO-SEG | Group Mining | S | | | | |
| CoSformer | arXiv | 2021 | >200,000 | DUTS_class+COCO-SEG | Transformer | S | | | | |
| CoEG-Net | TPAMI | 2021 | 8,250 | DUTS_class | PCA | S | | | | |
| DeepACG | CVPR | 2021 | >200,000 | COCO-SEG | Gromov-Wasserstein Distance | S | | | | |
| GCoNet | CVPR | 2021 | 8,250 | DUTS_class | Group Collaborative Learning | S | | | | |
| CADC | ICCV | 2021 | 8,250+9,213 | DUTS_class+COCO-9k | Dynamic Convolution | S | | | | |
| DCFM | CVPR | 2022 | 9,213 | COCO-9k | Prototype, self-contrastive learning | S | | | | |
| UFO | arXiv | 2022 | >200,000 | COCO-SEG | Transformer | S | | | | |
| GCoNet+ | arXiv | 2022 | >200,000 | DUTS_class, COCO-9k, COCO-SEG | Inter-group Learning, Metric Learning | S | | | | |

A superscript T (e.g., WPL^T) marks a traditional method rather than a deep method.

Results

SOTA:

Refer to the CoSOD task in papers-with-code.

Predicted Maps

| Model | Baidu Pan | Google Drive |
|-------|-----------|--------------|
| CBCS | Baidu-Disk (gtse) | Google-Drive |
| CODR | Baidu-Disk (qfks) | Google-Drive |
| CPD | Baidu-Disk (jxkk) | Google-Drive |
| CSHS | Baidu-Disk (wda4) | Google-Drive |
| CSMG | Baidu-Disk (gwm6) | Google-Drive |
| DIM | Baidu-Disk (2hgk) | Google-Drive |
| EGNet | Baidu-Disk (tkna) | Google-Drive |
| ESMG | Baidu-Disk (hxqb) | Google-Drive |
| IML | Baidu-Disk (7m1c) | Google-Drive |
| UMLF | Baidu-Disk (eqpw) | Google-Drive |
| GCAGC | Baidu-Disk (ij29) | Google-Drive |
| GICD | Baidu-Disk (puji) | Google-Drive |
| ICNet | Baidu-Disk (xwcv) | Google-Drive |
| CoADNet | Baidu-Disk (MVPL) | |
| CoEG-Net | Baidu-Disk (f4p3) | Google-Drive |
| GCoNet | | Google-Drive |
| CADC | Baidu-Disk (i59u) | Google-Drive |
| DCFM | | Google-Drive |
| UFO | | Google-Drive |
| GCoNet+ | | Google-Drive |
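
Once a model's predicted maps are downloaded, they can be scored against the ground-truth masks one group at a time. The following is a minimal sketch, not the benchmark's official evaluation code. It assumes each map is already loaded as a 2-D grid of values in [0, 1], and it uses mean absolute error (MAE) purely for illustration; this README does not say which metrics the benchmark reports. The function names and the `(group, prediction, ground truth)` sample layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mae(pred, gt):
    """Mean absolute error between two equally sized 2-D maps in [0, 1]."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    diffs = [abs(p - g) for prow, grow in zip(pred, gt) for p, g in zip(prow, grow)]
    return sum(diffs) / len(diffs)


def group_level_mae(samples):
    """Average the per-image MAE within each image group.

    `samples` is an iterable of (group_name, pred_map, gt_map) triples.
    This layout is an assumption made for the sketch, not the repository's format.
    """
    per_group = defaultdict(list)
    for group, pred, gt in samples:
        per_group[group].append(mae(pred, gt))
    return {group: mean(scores) for group, scores in per_group.items()}


# Toy 2x2 maps standing in for real predictions and masks.
samples = [
    ("banana", [[1.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]]),  # exact match
    ("banana", [[0.5, 0.5], [0.5, 0.5]], [[1.0, 1.0], [0.0, 0.0]]),  # every pixel off by 0.5
    ("guitar", [[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]),  # every pixel off by 1.0
]
scores = group_level_mae(samples)
```

Keeping the scores keyed by group, rather than averaging over all images at once, makes it easy to see which groups a model finds hardest.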

Qualitative Results


Figure 3: Qualitative examples of existing top-10 models on CoSOD3k.

Citation

If you find this useful, please cite the following work:

@inproceedings{fan2020taking,
  title={Taking a Deeper Look at Co-Salient Object Detection},
  author={Fan, Deng-Ping and Lin, Zheng and Ji, Ge-Peng and Zhang, Dingwen and Fu, Huazhu and Cheng, Ming-Ming},
  booktitle={IEEE CVPR},
  year={2020}
}

@article{fan2022re,
  title={Re-thinking co-salient object detection},
  author={Fan, Deng-Ping and Li, Tengpeng and Lin, Zheng and Ji, Ge-Peng and Zhang, Dingwen and Cheng, Ming-Ming and Fu, Huazhu and Shen, Jianbing},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={8},
  pages={4339-4354},
  year={2022},
  publisher={IEEE}
}
