unanan / as-mlp-object-detection

This project forked from svip-lab/as-mlp-object-detection

[ICLR'22] This is an official implementation for "AS-MLP: An Axial Shifted MLP Architecture for Vision" on Object Detection and Instance Segmentation.

Home Page: https://arxiv.org/pdf/2107.08391.pdf

License: Apache License 2.0

Shell 0.08% Python 99.83% Dockerfile 0.09%

as-mlp-object-detection's Introduction

AS-MLP for Object Detection

This repo contains the supporting code and configuration files for reproducing the object detection results of AS-MLP. It is based on Swin Transformer.

Results and Models

Mask R-CNN

Backbone  Pretrain     LR schedule  box mAP  mask mAP  Params  FLOPs  config  model
AS-MLP-T  ImageNet-1K  1x           44.0     40.0      48M     260G   config  onedrive
AS-MLP-T  ImageNet-1K  3x           46.0     41.5      48M     260G   config  -
AS-MLP-S  ImageNet-1K  1x           46.7     42.0      69M     346G   config  -
AS-MLP-S  ImageNet-1K  3x           47.8     42.9      69M     346G   config  -

Cascade Mask R-CNN

Backbone  Pretrain     LR schedule  box mAP  mask mAP  Params  FLOPs  config  model
AS-MLP-T  ImageNet-1K  1x           48.4     42.0      86M     739G   config  onedrive
AS-MLP-T  ImageNet-1K  3x           50.1     43.5      86M     739G   config  -
AS-MLP-S  ImageNet-1K  1x           50.5     43.7      107M    824G   config  -
AS-MLP-S  ImageNet-1K  3x           51.1     44.2      107M    824G   config  -
AS-MLP-B  ImageNet-1K  1x           51.1     44.2      145M    961G   config  -
AS-MLP-B  ImageNet-1K  3x           51.5     44.7      145M    961G   config  -

Notes:

Usage

Installation

Please refer to get_started.md for installation and dataset preparation.

Inference

# single-gpu testing
python tools/test.py <CONFIG_FILE> <DET_CHECKPOINT_FILE> --eval bbox segm

# multi-gpu testing
tools/dist_test.sh <CONFIG_FILE> <DET_CHECKPOINT_FILE> <GPU_NUM> --eval bbox segm
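
The two testing commands above differ only in the launcher and the GPU count. As a minimal sketch, the helper below assembles either command as an argument list without running it. `build_test_command` is a hypothetical name, not part of this repo, and the config and checkpoint paths are placeholders.

```python
import shlex


def build_test_command(config, checkpoint, gpus=1, metrics=("bbox", "segm")):
    """Return the argv list for the single- or multi-GPU test command.

    Hypothetical helper for illustration; it only mirrors the two
    command lines shown above and does not execute anything.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    if gpus == 1:
        # single-gpu testing calls tools/test.py directly
        cmd = ["python", "tools/test.py", config, checkpoint]
    else:
        # multi-gpu testing goes through the dist_test.sh launcher
        cmd = ["tools/dist_test.sh", config, checkpoint, str(gpus)]
    return cmd + ["--eval", *metrics]


# Placeholder paths, assumed for this example only.
single = build_test_command("my_config.py", "my_checkpoint.pth")
multi = build_test_command("my_config.py", "my_checkpoint.pth", gpus=8)
print(shlex.join(single))
print(shlex.join(multi))
```

The resulting list can be passed to `subprocess.run` from the repository root if you prefer to launch evaluation from a script.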

Training

To train a detector with pre-trained models, run:

# single-gpu training
python tools/train.py <CONFIG_FILE> --cfg-options model.pretrained=<PRETRAIN_MODEL> [model.backbone.use_checkpoint=True] [other optional arguments]

# multi-gpu training
tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> --cfg-options model.pretrained=<PRETRAIN_MODEL> [model.backbone.use_checkpoint=True] [other optional arguments] 

For example, to train a Mask R-CNN model with an AS-MLP-T backbone on 8 GPUs, run:

tools/dist_train.sh configs/asmlp/mask_rcnn_asmlp_tiny_patch4_shift5_mstrain_480-800_adamw_3x_coco.py 8 --cfg-options model.pretrained=<PRETRAIN_MODEL> 

Note: use_checkpoint is used to save GPU memory. Please refer to this page for more details.
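
Each `--cfg-options` entry in the training commands above is a dotted `key=value` pair that overrides the matching nested field of the config. The sketch below imitates that merge on a plain dictionary to show where `model.pretrained` and `model.backbone.use_checkpoint` end up. It is an illustration under stated assumptions, not the parser the training tools use: `apply_cfg_options` is a hypothetical helper, the starting config is made up, and only the literals `True` and `False` are converted from strings.

```python
def apply_cfg_options(cfg, options):
    """Merge dotted 'key=value' overrides into a nested dict, in place.

    Illustration only: values stay strings except for the literals
    True/False, whereas the real option parser handles more types.
    """
    for item in options:
        key, _, raw = item.partition("=")
        value = {"True": True, "False": False}.get(raw, raw)
        *parents, leaf = key.split(".")
        node = cfg
        for name in parents:
            # create intermediate levels that the config does not have yet
            node = node.setdefault(name, {})
        node[leaf] = value
    return cfg


# Made-up starting config and placeholder checkpoint name.
cfg = {"model": {"pretrained": None, "backbone": {"use_checkpoint": False}}}
apply_cfg_options(
    cfg,
    ["model.pretrained=pretrain.pth", "model.backbone.use_checkpoint=True"],
)
print(cfg)
```

Only the fields named on the command line change; every other setting comes from the config file.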

Apex (optional):

We use apex for mixed precision training by default. To install apex, run:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Citation

@article{Lian_2021_ASMLP,
  author = {Lian, Dongze and Yu, Zehao and Sun, Xing and Gao, Shenghua},
  title = {AS-MLP: An Axial Shifted MLP Architecture for Vision},
  journal = {ICLR},
  year = {2022}
}

Other Links

Image Classification: See AS-MLP for Image Classification.
