This project forked from alibaba-miil/tresnet

TResNet: High Performance GPU-Dedicated Architecture

License: Apache License 2.0


paper | pretrained models

Official PyTorch Implementation

Tal Ridnik, Hussam Lawen, Asaf Noy, Itamar Friedman
DAMO Academy, Alibaba Group

Abstract

Many deep learning models, developed in recent years, reach higher ImageNet accuracy than ResNet50, with fewer or comparable FLOPs count. While FLOPs are often seen as a proxy for network efficiency, when measuring actual GPU training and inference throughput, vanilla ResNet50 is usually significantly faster than its recent competitors, offering better throughput-accuracy trade-off. In this work, we introduce a series of architecture modifications that aim to boost neural networks' accuracy, while retaining their GPU training and inference efficiency. We first demonstrate and discuss the bottlenecks induced by FLOPs-optimizations. We then suggest alternative designs that better utilize GPU structure and assets. Finally, we introduce a new family of GPU-dedicated models, called TResNet, which achieve better accuracy and efficiency than previous ConvNets. Using a TResNet model, with similar GPU throughput to ResNet50, we reach 80.7% top-1 accuracy on ImageNet. Our TResNet models also transfer well and achieve state-of-the-art accuracy on competitive datasets such as Stanford Cars (96.0%), CIFAR-10 (99.0%), CIFAR-100 (91.5%) and Oxford-Flowers (99.1%).

Main Results

TResNet Models

Accuracy and GPU throughput of TResNet models on ImageNet, compared to ResNet50. All measurements were done on an Nvidia V100 GPU with mixed precision. All models were trained at an input resolution of 224.

| Model | Top Training Speed (img/sec) | Top Inference Speed (img/sec) | Max Train Batch Size | Top-1 Acc. (%) |
|---|---|---|---|---|
| ResNet50 | 805 | 2830 | 288 | 79.0 |
| EfficientNetB1 | 440 | 2740 | 196 | 79.2 |
| TResNet-M | 730 | 2930 | 512 | 80.7 |
| TResNet-L | 345 | 1390 | 316 | 81.4 |
| TResNet-XL | 250 | 1060 | 240 | 82.0 |
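
The img/sec columns are throughput measurements. The following is a minimal, framework-agnostic sketch of how such a number can be obtained; it is not the authors' benchmarking code. The names `measure_throughput`, `run_batch`, and `dummy_step`, as well as the warm-up and iteration counts, are hypothetical, and `run_batch` stands in for one training or inference step. On a GPU, the step would also need to wait for the device to finish (for example with `torch.cuda.synchronize()`) before the timer is read.

```python
import time
from typing import Callable


def measure_throughput(run_batch: Callable[[int], None], batch_size: int,
                       warmup: int = 3, iters: int = 10) -> float:
    """Return images per second for a callable that processes one batch."""
    if batch_size <= 0 or iters <= 0:
        raise ValueError("batch_size and iters must be positive")
    # Warm-up iterations are excluded from timing (caches, lazy initialization).
    for _ in range(warmup):
        run_batch(batch_size)
    start = time.perf_counter()
    for _ in range(iters):
        run_batch(batch_size)
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed


def dummy_step(batch_size: int) -> None:
    # Stand-in workload (assumption): replace with a real model step.
    sum(i * i for i in range(batch_size * 100))


if __name__ == "__main__":
    print(f"{measure_throughput(dummy_step, batch_size=64):.0f} img/sec")
```

Dividing the total number of processed images by the wall-clock time, rather than averaging per-batch timings, keeps the result less sensitive to a single slow iteration.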

Comparison To Other Networks

Comparison of ResNet50 to top modern networks with similar top-1 ImageNet accuracy. All measurements were done on an Nvidia V100 GPU with mixed precision. To obtain optimal speeds, training and inference were measured at 90% of the maximal possible batch size. Except for TResNet-M, all models' ImageNet scores were taken from the public repository, which specializes in providing top implementations of modern networks. Except for EfficientNet-B1, which has an input resolution of 240, all models have an input resolution of 224.

| Model | Top Training Speed (img/sec) | Top Inference Speed (img/sec) | Top-1 Acc. (%) | FLOPs (G) |
|---|---|---|---|---|
| ResNet50 | 805 | 2830 | 79.0 | 4.1 |
| ResNet50-D | 600 | 2670 | 79.3 | 4.4 |
| ResNeXt50 | 490 | 1940 | 78.5 | 4.3 |
| EfficientNetB1 | 440 | 2740 | 79.2 | 0.6 |
| SEResNeXt50 | 400 | 1770 | 79.0 | 4.3 |
| MixNet-L | 400 | 1400 | 79.0 | 0.5 |
| TResNet-M | 730 | 2930 | 80.7 | 5.5 |
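
One way to read the throughput-accuracy trade-off in this comparison is to ask which models are not beaten on both speed and accuracy by any other model (a Pareto front). The sketch below does this with the numbers copied from the comparison; `ROWS` and `pareto_front` are illustrative names and not part of the repository.

```python
# (model, training img/sec, inference img/sec, top-1 %) copied from the comparison.
ROWS = [
    ("ResNet50", 805, 2830, 79.0),
    ("ResNet50-D", 600, 2670, 79.3),
    ("ResNeXt50", 490, 1940, 78.5),
    ("EfficientNetB1", 440, 2740, 79.2),
    ("SEResNeXt50", 400, 1770, 79.0),
    ("MixNet-L", 400, 1400, 79.0),
    ("TResNet-M", 730, 2930, 80.7),
]


def pareto_front(points):
    """Return names of (name, speed, acc) entries that no other entry dominates."""
    front = []
    for name, speed, acc in points:
        # An entry is dominated if another is at least as good on both axes
        # and strictly better on at least one.
        dominated = any(
            s >= speed and a >= acc and (s > speed or a > acc)
            for _, s, a in points
        )
        if not dominated:
            front.append(name)
    return front


train_front = pareto_front([(n, t, a) for n, t, _, a in ROWS])
infer_front = pareto_front([(n, i, a) for n, _, i, a in ROWS])
print(train_front)  # ['ResNet50', 'TResNet-M']
print(infer_front)  # ['TResNet-M']
```

For inference, TResNet-M is both the fastest and the most accurate entry. For training, ResNet50 remains on the front because it is faster, while TResNet-M is more accurate.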


Transfer Learning SotA Results

Comparison of TResNet to state-of-the-art models on transfer learning datasets (only ImageNet-based transfer learning results). Model inference speed was measured on a V100 GPU with mixed precision. Since no official implementation of GPipe was provided, its inference speed is unknown.

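| Dataset | Model | Top-1 Acc. (%) | Speed (img/sec) | Input Resolution |
|---|---|---|---|---|
| CIFAR-10 | GPipe | 99.0 | - | 480 |
| CIFAR-10 | TResNet-XL | 99.0 | 1060 | 224 |
| CIFAR-100 | EfficientNet-B7 | 91.7 | 70 | 600 |
| CIFAR-100 | TResNet-XL | 91.5 | 1060 | 224 |
| Stanford Cars | EfficientNet-B7 | 94.7 | 70 | 600 |
| Stanford Cars | TResNet-L | 96.0 | 500 | 368 |
| Oxford-Flowers | EfficientNet-B7 | 98.8 | 70 | 600 |
| Oxford-Flowers | TResNet-L | 99.1 | 500 | 368 |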

Reproduce Results

We provide code for reproducing the validation top-1 score of TResNet models on ImageNet (input resolution 224). First, download pretrained models from here.

Then, run the infer.py script. For example, for tresnet_m run:

python infer.py \
    --val_dir=/path/to/imagenet_val_folder \
    --model_path=/model/path/to/tresnet_m.pth \
    --model_name=tresnet_m
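
For reference, the validation top-1 score is the percentage of images whose highest-scoring class matches the ground-truth label. The sketch below illustrates that computation on plain Python lists; it is not the code in infer.py, and `top1_accuracy` and the toy scores are hypothetical.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Percentage of samples whose highest-scoring class equals the label."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and the same length")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example (made-up scores for 4 samples and 3 classes).
scores = [
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.6, 0.3, 0.1],  # predicts class 0
]
labels = [1, 0, 1, 0]
print(top1_accuracy(scores, labels))  # 75.0
```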

Citation

@misc{ridnik2020tresnet,
    title={TResNet: High Performance GPU-Dedicated Architecture},
    author={Tal Ridnik and Hussam Lawen and Asaf Noy and Itamar Friedman},
    year={2020},
    eprint={2003.13630},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Contact

Feel free to contact me if there are any questions or issues (Tal Ridnik, [email protected]).
