
Vision Transformer Attention Mechanism Benchmark

This repository, maintained by woodminus, provides a comprehensive benchmark of the attention mechanisms used in Vision Transformers. It offers re-implementations of each mechanism and reports their parameter counts, FLOPs, and CPU/GPU throughput.

Requirements

  • PyTorch 1.8+
  • timm
  • ninja
  • einops
  • fvcore
  • matplotlib
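
All of these are available from PyPI; a typical install looks like the following (package names assumed from the list above, with torch as the PyPI name for PyTorch):

pip install torch timm ninja einops fvcore matplotlib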

Testing Environment

  • NVIDIA RTX 3090
  • Intel® Core™ i9-10900X CPU @ 3.70GHz
  • 32 GB memory
  • Ubuntu 22.04
  • PyTorch 1.8.1 + CUDA 11.1

Settings

  • input: 14 x 14 = 196 tokens (the 1/16 scale feature map of a standard 224 x 224 image in ImageNet-1K training)
  • batch size for speed testing (images/s): 64
  • embedding dimension: 768
  • number of heads: 12
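
For reference, a standard ViT attention layer under these settings looks as follows. This is a minimal sketch using timm's generic Attention module as a stand-in, not one of the benchmarked re-implementations:

import torch
from timm.models.vision_transformer import Attention

# Benchmark settings from the list above
B, N, C, heads = 64, 196, 768, 12  # batch, tokens (14 x 14), embed dim, heads

attn = Attention(dim=C, num_heads=heads)
x = torch.randn(B, N, C)  # flattened 1/16 scale feature map
out = attn(x)             # shape (64, 196, 768)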

Testing

For example, to test HiLo attention:

cd attentions/
python hilo.py

By default, the script benchmarks the model on both CPU and GPU, and FLOPs are measured with fvcore. Edit the source file if you need a different configuration.

Outputs:

Number of Params: 2.2 M
FLOPs = 298.3 M
throughput averaged with 30 times
batch_size 64 throughput on CPU 1029
throughput averaged with 30 times
batch_size 64 throughput on GPU 5104
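
The measurement behind numbers like these can be sketched roughly as follows; the actual scripts in attentions/ may differ, and timm's generic Attention again stands in for a specific mechanism:

import time
import torch
from fvcore.nn import FlopCountAnalysis
from timm.models.vision_transformer import Attention  # stand-in module

model = Attention(dim=768, num_heads=12).eval()

# FLOPs for a single image (batch size 1), reported in millions
flops = FlopCountAnalysis(model, torch.randn(1, 196, 768)).total()
print(f"FLOPs = {flops / 1e6:.1f} M")

def throughput(model, x, runs=30):
    # images/s, averaged over `runs` forward passes after a short warm-up
    with torch.no_grad():
        for _ in range(10):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
        start = time.time()
        for _ in range(runs):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
    return runs * x.shape[0] / (time.time() - start)

x = torch.randn(64, 196, 768)  # batch size 64, as in the settings above
print(f"batch_size 64 throughput on CPU {throughput(model, x):.0f}")
if torch.cuda.is_available():
    model, x = model.cuda(), x.cuda()
    print(f"batch_size 64 throughput on GPU {throughput(model, x):.0f}")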

Supported Attentions

  • Numerous attention mechanisms are supported, each listed with links to its paper and original code.

Single Attention Layer Benchmark

| Name | Params (M) | FLOPs (M) | CPU Speed (images/s) | GPU Speed (images/s) | Demo |

Each supported attention mechanism is benchmarked on parameter count, FLOPs, and CPU/GPU throughput, alongside a demo link.

Note: Each method has its own hyperparameters. For a fair comparison on 1/16 scale feature maps, all methods in the above table adopt their default 1/16 scale settings, as released in their official code repositories. For example, on 1/16 scale feature maps, HiLo in LITv2 adopts a window size of 2 and an alpha of 0.9. Future work will cover more scales and memory benchmarking.
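
For instance, HiLo with these 1/16 scale defaults can be instantiated as below, assuming the HiLo class in attentions/hilo.py keeps the LITv2 interface, whose forward pass also takes the feature map height and width (check the source for the exact signature):

import torch
from hilo import HiLo  # run from inside attentions/

# LITv2 defaults for 1/16 scale feature maps: window size 2, alpha 0.9
model = HiLo(dim=768, num_heads=12, window_size=2, alpha=0.9)
x = torch.randn(64, 196, 768)  # batch, tokens, dim
out = model(x, 14, 14)         # 14 x 14 feature map height and width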

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.
