dreadlord1984 / diou-ssd-pytorch

This project forked from 2017tjm/diou-ssd-pytorch

Distance-IoU Loss into SSD

License: GNU General Public License v3.0

DIoU-SSD

Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression (AAAI 2020)

[arxiv] [pdf]

SSD_FPN_DIoU,CIoU in PyTorch

The code builds on SSD: Single Shot MultiBox Object Detector, in PyTorch, mmdet, and JavierHuang. So far, experiments have been carried out on the VOC dataset only. If you want to train on your own dataset, refer to the projects referenced above for details.

If you use this work, please consider citing:

@inproceedings{zheng2020distance,
  author    = {Zhaohui Zheng and Ping Wang and Wei Liu and Jinze Li and Rongguang Ye and Dongwei Ren},
  title     = {Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression},
  booktitle = {The AAAI Conference on Artificial Intelligence (AAAI)},
  year      = {2020},
}

Losses

The loss is chosen with the losstype option in the config/config.py file. The valid options are currently: [Iou|Giou|Diou|Ciou|SmoothL1].

VOC:
  'losstype': 'Ciou'
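
As a rough illustration of what the IoU-based options compute, here is a minimal pure-Python sketch for a single pair of boxes in (x1, y1, x2, y2) format. The function name iou_family_loss and its signature are hypothetical and are not taken from this repository, whose implementation works on batched tensors. The formulas follow the standard definitions of the IoU, GIoU, DIoU and CIoU losses; SmoothL1 is not covered because it regresses box offsets rather than overlap.

```python
import math


def iou_family_loss(pred, target, losstype="Ciou", eps=1e-9):
    """Loss for one pair of boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; losstype mirrors the config values.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)
    if losstype == "Iou":
        return 1.0 - iou

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    if losstype == "Giou":
        c_area = cw * ch
        return 1.0 - (iou - (c_area - union) / (c_area + eps))

    # R_DIoU: squared center distance over squared enclosing-box diagonal.
    d2 = ((px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2) / 4.0
    c2 = cw ** 2 + ch ** 2 + eps
    r_diou = d2 / c2
    if losstype == "Diou":
        return 1.0 - iou + r_diou

    if losstype == "Ciou":
        # Aspect-ratio consistency term v and its trade-off weight alpha.
        v = (4.0 / math.pi ** 2) * (
            math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
        ) ** 2
        alpha = v / (1.0 - iou + v + eps)
        return 1.0 - iou + r_diou + alpha * v

    raise ValueError(f"unsupported losstype: {losstype}")


if __name__ == "__main__":
    pred, target = (0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0)
    for kind in ("Iou", "Giou", "Diou", "Ciou"):
        print(kind, round(iou_family_loss(pred, target, kind), 4))
```

For the two disjoint boxes in the usage example, the plain IoU loss saturates at 1 regardless of how far apart the boxes are, while the GIoU, DIoU and CIoU losses still grow with the separation, which is the motivation for the distance-based penalty.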

DIoU-NMS

The NMS variant is chosen with the nms_kind option in the config/config.py file. Setting it to greedynms selects greedy NMS. In addition, as with DIoU-NMS in Faster R-CNN, we introduce a parameter beta1 for DIoU-NMS in SSD, so that the suppression criterion becomes DIoU = IoU - R_DIoU ^ {beta1}. With a tuned beta1, DIoU-NMS may perform better than with the default beta1=1.0, but for SSD beta1=1.0 seems to be good enough.

  'nms_kind': "diounms"

Folder Structure

The folder structure is as follows:

  • config/
    • config.py
    • __init__.py
  • data/
    • __init__.py
    • VOC.py
    • VOCdevkit/
  • model/
    • build_ssd.py
    • __init__.py
    • backbone/
    • neck/
    • head/
    • utils/
  • utils/
    • box/
    • detection/
    • loss/
    • __init__.py
  • tools/
    • train.py
    • eval.py
    • test.py
  • work_dir/

Environment

  • pytorch 0.4.1
  • python3+
  • visdom
    • for real-time loss visualization during training!
     pip install visdom
    • Start the server (probably in a screen or tmux)
     python -m visdom.server
    • Then (during training) navigate to http://localhost:8097/ (see the Train section below for training details).

Datasets

  • PASCAL VOC: Download the VOC2007 and VOC2012 datasets, then put VOCdevkit in the data directory.

Training

Training VOC

python tools/train.py
  • Note:
    • Training uses an NVIDIA GPU by default.
    • You can set the training parameters in train.py (see tools/train.py for options).
    • In the config, you can set work_dir to choose where the training weights are saved (see config/config.py).

Evaluation

  • To evaluate a trained network:
python tools/ap.py --trained_model {your_weight_address}

For example (the last line of the output is the AP50, AP75 and AP of our CIoU loss):

Results:
0.033
0.015
0.009
0.011
0.008
0.083
0.044
0.042
0.004
0.014
0.026
0.034
0.010
0.006
0.009
0.006
0.009
0.013
0.106
0.011
0.025
~~~~~~~~

--------------------------------------------------------------
Results computed with the **unofficial** Python eval code.
Results should be very close to the official MATLAB eval code.
--------------------------------------------------------------
0.7884902583981603 0.5615516772893671 0.5143832356646468

Test

  • To test a trained network:
python tools/test.py --trained_model {your_weight_address}

If you want to visualize the boxes, add the option --visbox True (default: False).

Performance

VOC2007 Test mAP

  • Backbone is ResNet50-FPN:

Loss   AP      AP75
IoU    51.01   54.74
GIoU   51.06   55.48
DIoU   51.31   55.71
CIoU   51.44   56.16

Pretrained weights

Here are the trained models using the configurations in this repository.
