ResT

By Qing-Long Zhang and Yu-Bin Yang

[State Key Laboratory for Novel Software Technology at Nanjing University]

This repo is the official implementation of "ResT: An Efficient Transformer for Visual Recognition". It currently includes code and models for the following tasks:

Image Classification: Included in this repo. See get_started.md for a quick start.

Object Detection and Instance Segmentation: Based on detectron2.

ResT was initially described on arXiv and is designed to serve as a general-purpose backbone for computer vision. It can handle input images of arbitrary size. In addition, ResT compresses the memory of standard multi-head self-attention (MSA) and models the interaction between attention heads while preserving their diversity.
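To illustrate why compressing the keys and values of MSA saves memory, the sketch below counts the entries of the attention map for standard MSA and for a variant whose key/value tokens are spatially downsampled before attention. The function name, the per-side reduction factor `s`, and the 56x56 token grid are assumptions made for illustration; they are not taken from this repo's code.

```python
def attention_map_entries(height, width, heads=1, s=1):
    """Count attention-map entries for one image.

    height, width: spatial size of the token grid (queries use every token).
    heads: number of attention heads (hypothetical default).
    s: assumed per-side downsampling factor applied to keys/values;
       s=1 corresponds to standard MSA.
    """
    n_queries = height * width
    # Ceiling division, so a partially covered border still yields one token.
    n_keys = -(-height // s) * -(-width // s)
    return heads * n_queries * n_keys


# A 56x56 token grid is an assumed setting, not a value from the repo.
standard = attention_map_entries(56, 56, heads=1, s=1)
compressed = attention_map_entries(56, 56, heads=1, s=8)
print(standard, compressed, standard // compressed)
```

Under these assumptions the attention map shrinks by a factor of s squared, because the number of queries is unchanged while the number of keys drops from height x width to roughly (height / s) x (width / s).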

Main Results on ImageNet with Pretrained Models

ImageNet-1K Pretrained Models

name	resolution	acc@1	acc@5	#params	FLOPs	FPS	1K model
ResT-Lite	224x224	77.2	93.7	10.5M	1.4G	1246	baidu
ResT-Small	224x224	79.6	94.9	13.7M	1.9G	1043	baidu
ResT-Base	224x224	81.6	95.7	30.3M	4.3G	673	baidu
ResT-Large	224x224	83.6	96.3	51.6M	7.9G	429	baidu

Note: The access code for baidu is rest. Pretrained models are also available on Google Drive.

Main Results on Downstream Tasks

COCO Object Detection (2017 val)

Attention: Results on downstream tasks are very sensitive to the training settings; a fluctuation of 1.0 is normal.

Backbone	Method	pretrain	Lr Schd	box mAP	mask mAP	#params	model
ResT-Small	RetinaNet	ImageNet-1K	1x	40.3	-	23.4M	baidu
ResT-Base	RetinaNet	ImageNet-1K	1x	42.0	-	40.5M	baidu
ResT-Small	Mask R-CNN	ImageNet-1K	1x	39.6	37.2	33.3M	baidu
ResT-Base	Mask R-CNN	ImageNet-1K	1x	41.6	38.7	49.8M	baidu

Note: These are the results with LN (a comparison is shown in the Appendix). To train with ResT backbones, you need to convert the original pre-trained weights to d2 (detectron2) format using convert_to_d2.py. The access code for baidu is rest.
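The contents of convert_to_d2.py are not reproduced here. As a rough sketch of what such a conversion typically involves, the example below wraps a plain weight dictionary in the pickle layout that detectron2's own conversion tool for torchvision models produces (a dict with `model`, `__author__`, and `matching_heuristics` keys). The function names, the `rename` mapping, and the author string are hypothetical, and the real script may rename keys differently.

```python
import pickle


def to_d2_checkpoint(state_dict, rename=None):
    """Wrap a weight dict in a detectron2-style pickle layout.

    state_dict: mapping from parameter name to weights. In practice the
        values would be numpy arrays; plain lists are used here so the
        sketch has no third-party dependencies.
    rename: hypothetical mapping from old key names to new key names;
        keys not listed are kept unchanged.
    """
    rename = rename or {}
    model = {rename.get(name, name): value for name, value in state_dict.items()}
    return {
        "model": model,
        "__author__": "rest-example",  # hypothetical tag for third-party weights
        "matching_heuristics": True,   # ask the loader to match names loosely
    }


def save_d2_checkpoint(state_dict, path, rename=None):
    """Serialize the wrapped checkpoint to a .pkl file at `path`."""
    with open(path, "wb") as f:
        pickle.dump(to_d2_checkpoint(state_dict, rename), f)


# Toy weights and a hypothetical key rename, for illustration only.
ckpt = to_d2_checkpoint({"conv1.weight": [0.1, 0.2]}, {"conv1.weight": "stem.conv1.weight"})
print(sorted(ckpt["model"]))
```

For actual training, use the repo's convert_to_d2.py rather than this sketch, since the key names must match the backbone definition used in the detection configs.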

Citing ResT

@article{zhql2021ResT,
  title={ResT: An Efficient Transformer for Visual Recognition},
  author={Zhang, Qinglong and Yang, Yubin},
  journal={arXiv preprint arXiv:2105.13677v3},
  year={2021}
}
