This project forked from chenghan111/dnc

Official PyTorch implementation of 'Visual Recognition with Deep Nearest Centroids' (ICLR 2023 Spotlight).

License: MIT License

Shell 0.16% Python 99.84%

dnc's Introduction

Python 3.7

Visual Recognition with Deep Nearest Centroids (ICLR2023-Spotlight)

Figure 1: With a distance-/case-based classification scheme, DNC combines unsupervised sub-pattern discovery and supervised representation learning in a synergy.

Visual Recognition with Deep Nearest Centroids,
Wenguan Wang, Cheng Han, Tianfei Zhou, Dongfang Liu
ICLR 2023 (Spotlight) (arXiv 2209.07383)

This repository is the official PyTorch implementation of the training and evaluation code for DNC, together with the corresponding pretrained models.

We use MMClassification v0.18.0 and MMSegmentation v0.20.2 as the codebase.

Abstract

We devise deep nearest centroids (DNC), a conceptually elegant yet surprisingly effective network for large-scale visual recognition, by revisiting Nearest Centroids, one of the most classic and simple classifiers. Current deep models learn the classifier in a fully parametric manner, ignoring the latent data structure and lacking simplicity and explainability. DNC instead conducts nonparametric, case-based reasoning; it utilizes sub-centroids of training samples to describe class distributions and clearly explains the classification as the proximity of test data and the class sub-centroids in the feature space. Due to the distance-based nature, the network output dimensionality is flexible, and all the learnable parameters are only for data embedding. That means all the knowledge learnt for ImageNet classification can be completely transferred for pixel recognition learning, under the ‘pre-training and fine-tuning’ paradigm. Apart from its nested simplicity and intuitive decision-making mechanism, DNC can even possess ad-hoc explainability when the sub-centroids are selected as actual training images that humans can view and inspect. Compared with parametric counterparts, DNC performs better on image classification (CIFAR-10, ImageNet) and greatly boosts pixel recognition (ADE20K, Cityscapes), with improved transparency and fewer learnable parameters, using various network architectures (ResNet, Swin) and segmentation models (FCN, DeepLabV3, Swin). We feel this work brings fundamental insights into related fields.
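
To make the distance-based decision rule in the abstract concrete, here is a minimal sketch of nearest sub-centroid classification in plain Python. It is not the code from this repository: the function names, the 2-D embeddings, and the choice of Euclidean distance are all assumptions made for illustration, and the way DNC actually estimates its sub-centroids is not shown.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(embedding: Vector,
             sub_centroids: Dict[str, List[Vector]]) -> Tuple[str, float]:
    """Return the class whose closest sub-centroid is nearest to `embedding`."""
    if not sub_centroids:
        raise ValueError("at least one class is required")
    best_label, best_dist = "", math.inf
    for label, centroids in sub_centroids.items():
        # Each class is described by several sub-centroids; the class is
        # scored by the distance to its closest one.
        dist = min(euclidean(embedding, c) for c in centroids)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical 2-D embeddings with two sub-centroids per class.
sub_centroids = {
    "cat": [(0.0, 0.0), (1.0, 0.0)],
    "dog": [(5.0, 5.0), (6.0, 5.0)],
}

label, dist = classify((0.9, 0.2), sub_centroids)
print(label, round(dist, 3))
```

Because the prediction is just a lookup over stored centroids, adding or removing a class only changes the `sub_centroids` table, which mirrors the abstract's point that the output dimensionality is flexible.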

Installation

For installation and data preparation, please refer to the guidelines in MMClassification v0.18.0 and MMSegmentation v0.20.2.

The installation was tested with CUDA 11.4 and PyTorch 1.8.1:

pip install torchvision==0.9.1
pip install timm==0.3.2
pip install mmcv-full==1.4.1
pip install opencv-python==4.5.1.48
cd DNC_classification && pip install -e . --user
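
After installing, it can help to confirm that the environment matches the pinned versions. The following is a small sketch using only the standard library; the `PINNED` table simply repeats the versions from the commands above, and the helper name `check_versions` is hypothetical. Local build suffixes such as `+cu111` are ignored when comparing.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # Python 3.7 needs the `importlib_metadata` backport
    import importlib_metadata as metadata

# Versions pinned in the install commands above.
PINNED = {
    "torchvision": "0.9.1",
    "timm": "0.3.2",
    "mmcv-full": "1.4.1",
    "opencv-python": "4.5.1.48",
}


def check_versions(pinned):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop a local build tag such as "+cu111" before comparing.
        base = installed.split("+")[0] if installed else None
        report[name] = (installed, base == wanted)
    return report


for name, (installed, ok) in check_versions(PINNED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name}: installed={installed} pinned={PINNED[name]} [{status}]")
```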

Training

To train your own model, run the following command, shown here for ResNet50 on ImageNet:

sh ./tools/dist_train.sh configs/resnet/resnet50_8xb32_in1k_centroids.py 8 \
  --work-dir SCRATCH_DIR 

More general case:

sh ./tools/dist_train.sh configs/(resnet/swin_transformer)/xxxxxx.py 8 \
  --work-dir SCRATCH_DIR
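
The `8xb32` token in the example config name appears to follow the MMClassification naming convention of `<GPUs>xb<samples per GPU>`, which matches the `8` passed to `dist_train.sh`. Assuming that convention holds for the configs here, the sketch below derives the effective batch size from a config name; `effective_batch_size` is a hypothetical helper, not part of this repository.

```python
import re


def effective_batch_size(config_name: str) -> int:
    """Parse '<gpus>xb<samples_per_gpu>' from a config name and multiply."""
    match = re.search(r"(\d+)xb(\d+)", config_name)
    if match is None:
        raise ValueError(f"no '<gpus>xb<batch>' token in {config_name!r}")
    gpus, per_gpu = int(match.group(1)), int(match.group(2))
    return gpus * per_gpu


print(effective_batch_size("resnet50_8xb32_in1k_centroids.py"))  # 256
```

If you launch with a different number of GPUs than the config name implies, the effective batch size changes accordingly, and the learning rate may need to be adjusted.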

Testing

Download trained weights

# Check the installed mmcv, mmcls, and torch versions
pip list | grep "mmcv\|mmcls\|^torch"
# Single-GPU testing
python tools/test.py local_config_file.py model.pth --out result.pkl --metrics accuracy
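
The test command writes its output to `result.pkl`. The exact layout of that file is not documented here, so the sketch below makes no assumption about its keys: it only loads the pickle and prints a short summary of whatever structure it finds. The helper names `summarize` and `inspect_results` are hypothetical.

```python
import pickle
from pathlib import Path


def summarize(obj, depth=0, max_depth=2):
    """Return a short, nested description of a loaded Python object."""
    indent = "  " * depth
    if isinstance(obj, dict) and depth < max_depth:
        lines = []
        for key, value in obj.items():
            lines.append(f"{indent}{key!r}: {type(value).__name__}")
            lines.extend(summarize(value, depth + 1, max_depth))
        return lines
    if isinstance(obj, (list, tuple)):
        return [f"{indent}{type(obj).__name__} of length {len(obj)}"]
    return []


def inspect_results(path):
    """Load a pickle file and describe its top-level structure."""
    # Only unpickle files you produced yourself: pickle can execute code.
    with Path(path).open("rb") as f:
        return summarize(pickle.load(f))
```

Calling `print("\n".join(inspect_results("result.pkl")))` after a test run lists the stored entries and their types.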

Citation

If you find our work helpful in your research, please cite it as:

@inproceedings{wang2023visual,
  title={Visual recognition with deep nearest centroids},
  author={Wang, Wenguan and Han, Cheng and Zhou, Tianfei and Liu, Dongfang},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2023}
}
