cv-ip / rotated-ld

This project forked from zzh-tju/rotated-ld

Rotated Localization Distillation

License: Apache License 2.0

Shell 0.17% Python 99.75% Dockerfile 0.08%

Localization Distillation for Object Detection

English | Simplified Chinese

LD for horizontal bbox object detectors is available at https://github.com/HikariTJU/LD.

This repo is based on MMRotate.

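Analysis of LD on ZhiHu (in Chinese): "Object Detection: Localization Distillation (LD, CVPR 2022)" and "Object Detection: Localization Distillation, the Sequel: The Debate Between Logit Distillation and Feature Distillation"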

This is the code for our paper:

@Article{zheng2022rotatedLD,
  title={Localization Distillation for Object Detection},
  author= {Zheng, Zhaohui and Ye, Rongguang and Hou, Qibin and Ren, Dongwei and Wang, Ping and Zuo, Wangmeng and Cheng, Ming-Ming},
  journal={arXiv preprint arXiv:2204.05957},
  year={2022}
}

[2021.3.30] LD is officially included in MMDetection V2. Many thanks to @jshilong, @Johnson-Wang and @ZwwWayne for helping to migrate the code.

LD extends knowledge distillation to the localization task: it uses the learned bbox distributions to transfer localization dark knowledge from the teacher to the student.

LD consistently improves rotated detectors without adding any computational cost at inference.

Introduction

Previous knowledge distillation (KD) methods for object detection mostly focus on feature imitation rather than mimicking the classification logits, because logit mimicking is inefficient at distilling localization information. In this paper, we investigate whether logit mimicking always lags behind feature imitation. Towards this goal, we first present a novel localization distillation (LD) method that efficiently transfers localization knowledge from the teacher to the student. Second, we introduce the concept of the valuable localization region, which helps to selectively distill the classification and localization knowledge for a certain region. Combining these two new components, we show for the first time that logit mimicking can outperform feature imitation, and that the absence of localization distillation is a critical reason why logit mimicking has underperformed for years. Thorough studies show the great potential of logit mimicking, which can significantly alleviate localization ambiguity, learn robust feature representations, and ease the training difficulty in the early stage. We also provide the theoretical connection between the proposed LD and classification KD: they share an equivalent optimization effect. Our distillation scheme is simple as well as effective and can be easily applied to both dense horizontal object detectors and rotated object detectors. Extensive experiments on the MS COCO, PASCAL VOC, and DOTA benchmarks demonstrate that our method achieves considerable AP improvement without any sacrifice in inference speed.
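To make the idea of distilling bbox distributions concrete, here is a minimal pure-Python sketch, not the code used in this repo. It assumes each regressed box parameter is predicted as logits over a set of discrete bins, that teacher and student distributions are softened with a temperature, and that the loss is the KL divergence from teacher to student scaled by the squared temperature. The function names, the default temperature of 10, and the scaling are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over one list of bin logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def ld_loss(student_logits, teacher_logits, temperature=10.0):
    """Hypothetical LD loss: mean KL(teacher || student) over box parameters.

    Both arguments are lists with one entry per box parameter; each entry is
    a list of logits over the discretized bins (an assumed representation).
    """
    total = 0.0
    for s_logits, t_logits in zip(student_logits, teacher_logits):
        p_teacher = softmax(t_logits, temperature)
        p_student = softmax(s_logits, temperature)
        total += kl_divergence(p_teacher, p_student)
    # Scaling by T^2 is a common KD convention, assumed here.
    return total * temperature ** 2 / len(student_logits)


# Toy example: four box parameters, five bins each.
teacher = [[0.0, 1.0, 6.0, 1.0, 0.0]] * 4   # teacher is confident about bin 2
student = [[0.0, 0.0, 0.0, 0.0, 0.0]] * 4   # student is uniform
print(ld_loss(student, teacher))            # positive: distributions differ
print(ld_loss(teacher, teacher))            # zero: distributions match
```

Because the loss only touches training-time outputs, the student's architecture and inference path stay unchanged, which is consistent with the claim of no loss in inference speed.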

Installation

Please refer to INSTALL.md for installation and dataset preparation. PyTorch 1.5.1 and cudatoolkit 10.1 are recommended.

Get Started

Please see GETTING_STARTED.md for the basic usage of MMDetection.

Data Preparation

Please refer to data_preparation.md to prepare the data.

Evaluation Tool

Move the file tests/val_set.txt to /yourpath/dataset/DOTAv1/.

Download https://github.com/CAPTAIN-WHU/DOTA_devkit, the official evaluation tool for DOTA.

Replace dota_evaluation_task1.py with our dota_evaluation_task1.py.

Open dota_evaluation_task1.py and set detpath, annopath and imagesetfile to your own paths.

After running the test, run

python yourpath/DOTA_devkit-master/dota_evaluation_task1.py

AP, AP50, AP55, ... , AP95 will be printed in the terminal.
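As a rough guide to the path-editing step above, the sketch below builds the three values in one place. It is an assumption-laden illustration: the `{:s}` placeholders follow the convention of the upstream DOTA_devkit script (class name in detpath, image name in annopath), and the `labelTxt` subfolder, the `Task1_{:s}.txt` naming, and the results directory are hypothetical, so check them against the replaced dota_evaluation_task1.py and your own layout.

```python
from pathlib import PurePosixPath


def build_eval_paths(dataset_root, results_dir):
    """Return hypothetical values for detpath, annopath and imagesetfile."""
    root = PurePosixPath(dataset_root)
    results = PurePosixPath(results_dir)
    return {
        # One detection file per class; {:s} is filled with the class name.
        "detpath": str(results / "Task1_{:s}.txt"),
        # One annotation file per image; {:s} is filled with the image name.
        "annopath": str(root / "labelTxt" / "{:s}.txt"),
        # The val_set.txt file moved in the first step.
        "imagesetfile": str(root / "val_set.txt"),
    }


paths = build_eval_paths("/yourpath/dataset/DOTAv1", "/yourpath/work_dirs/results")
for name, value in paths.items():
    print(f"{name} = r'{value}'")
```

The printed lines can be pasted over the corresponding assignments in the script.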

Convert model

If you find the trained model file too large, please refer to publish_model.py:

python tools/model_converters/publish_model.py your_model.pth your_new_model.pth

Evaluation Results

DOTA-1.0 val

Rotated-RetinaNet, LD + KD

| Teacher | Student | Training schedule | AP | AP50 | AP70 | AP90 | Download |
|---|---|---|---|---|---|---|---|
| -- | R-18 | 1x | 33.7 | 58.0 | 42.3 | 4.7 | |
| R-34 | R-18 | 1x | 39.1 | 63.8 | 48.8 | 8.8 | model |

GWD, LD + KD

| Teacher | Student | Training schedule | AP | AP50 | AP70 | AP90 | Download |
|---|---|---|---|---|---|---|---|
| -- | R-18 | 1x | 37.1 | 63.1 | 46.7 | 6.2 | |
| R-34 | R-18 | 1x | 40.2 | 66.4 | 50.3 | 8.5 | model |

Note:

  • The teacher detector adopts a 2x training schedule (24 epochs); the student detector adopts 1x (12 epochs). We use the DOTA-v1.0 train set for training and the val set for evaluation.

  • We train on 2 GPUs with a mini-batch size of 1 per GPU. We found that, even with the batch size fixed, single-GPU training produced higher AP than two-GPU training.

  • On DOTA, we found that LD and classification KD are equally important; they can improve the baseline (such as R-RetinaNet) by more than 3.5 AP. Combining LD and KD gives the highest AP.

Acknowledgments

Thanks to yangxue0827 for his help with data preparation and his excellent work on rotated object detection.
