ToMato: Token Merging at Once

About our code

The Vision Transformer (ViT) performs strongly on a wide range of vision tasks by splitting images into patches and passing them through transformer blocks. However, its large model size and computational cost lead to high inference latency and make it hard to accelerate. To accelerate ViT efficiently, we introduce ToMato (Token Merging at Once), a simple framework that recursively merges tokens at the first transformer block by comparing each token's similarity to its adjacent tokens. Applying ToMato to the DeiT-base model reduces latency by 22.19% while maintaining a high Top-1 accuracy of 80.14%.
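To make the idea of merging adjacent tokens by similarity concrete, here is a minimal, dependency-free sketch. It is not the code in this repository. It assumes tokens are plain lists of floats, that "adjacent" means neighbours in a 1D sequence, that similarity is cosine similarity, and that similar tokens are combined by a size-weighted average. The names merge_pass and merge_at_once and the threshold value are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_pass(tokens, sizes, threshold):
    """One left-to-right pass: fold each token into the previous group if similar."""
    out_tokens, out_sizes = [list(tokens[0])], [sizes[0]]
    for tok, size in zip(tokens[1:], sizes[1:]):
        if cosine(out_tokens[-1], tok) >= threshold:
            # Size-weighted average, so larger groups keep more influence.
            total = out_sizes[-1] + size
            out_tokens[-1] = [
                (p * out_sizes[-1] + q * size) / total
                for p, q in zip(out_tokens[-1], tok)
            ]
            out_sizes[-1] = total
        else:
            out_tokens.append(list(tok))
            out_sizes.append(size)
    return out_tokens, out_sizes


def merge_at_once(tokens, threshold=0.9):
    """Repeat merge passes until no adjacent pair is similar enough to merge.

    threshold is an assumed hyperparameter, not a value from the repository.
    Returns (merged_tokens, sizes); sizes[i] counts the original tokens
    that were folded into merged token i.
    """
    if not tokens:
        return [], []
    sizes = [1] * len(tokens)
    while True:
        merged, merged_sizes = merge_pass(tokens, sizes, threshold)
        if len(merged) == len(tokens):  # nothing merged in this pass
            return merged, merged_sizes
        tokens, sizes = merged, merged_sizes


if __name__ == "__main__":
    toy = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.02, 1.0]]
    merged, sizes = merge_at_once(toy)
    print(len(merged), sizes)
```

In a real ViT, patches have 2D neighbours and the merge would run on batched tensors inside the first transformer block, so treat this only as an illustration of the control flow.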

How to install

Clone our repository to your computer:

git clone https://github.com/Transformer04/ToMato.git

How to test

To evaluate the accuracy of our model, open test_batch.py and change the dataset directory path on line 40 to point to your dataset. Then run test_batch.py:

python test_batch.py
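For reference, Top-1 and Top-5 accuracy can be computed from per-class scores roughly as follows. This is a sketch and not the code in test_batch.py. The function name topk_accuracy and the list-of-lists input format are assumptions made for illustration.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores.

    scores: one list of per-class scores per sample (hypothetical format).
    labels: the true class index for each sample.
    """
    if not labels:
        return 0.0
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in topk
    return 100.0 * hits / len(labels)


if __name__ == "__main__":
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    labels = [1, 1, 0]
    print(topk_accuracy(scores, labels, k=1))
    print(topk_accuracy(scores, labels, k=2))
```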

Datasets

Testing and validation were conducted on the Imagenet-mini-1000 dataset, which is available at the following link: https://www.kaggle.com/datasets/ifigotin/imagenetmini-1000

Experiment Results

Here are some expected results when using the timm implementation off-the-shelf on ImageNet-1k val using a V100:

Model	Top-1 acc (%)	Top-5 acc (%)	Latency (s)
DeiT-B	81.41	953	13.2132
ToMe-B	84.57	309	13
OURS-B	85.82	95	7
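Latency figures like those above, and a relative latency reduction, could be obtained along the lines of the sketch below. It is not the benchmarking code used for this table. The helper names measure_latency and latency_reduction are hypothetical, and the numbers in the usage example are made up for illustration.

```python
import time


def measure_latency(fn, warmup=1, runs=3):
    """Mean wall-clock seconds per call of fn, after a few warm-up calls.

    When timing a GPU model, the device should be synchronized
    (e.g. torch.cuda.synchronize()) before reading the clock.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


def latency_reduction(baseline_s, ours_s):
    """Relative latency reduction in percent: (baseline - ours) / baseline * 100."""
    if baseline_s <= 0:
        raise ValueError("baseline latency must be positive")
    return 100.0 * (baseline_s - ours_s) / baseline_s


if __name__ == "__main__":
    # Illustrative values only, not measurements from this repository.
    print(latency_reduction(10.0, 8.0))
    print(measure_latency(lambda: sum(range(100_000))))
```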

License and Contributing

This code was implemented with reference to ToMe's code, the official PyTorch implementation of ToMe from the paper Token Merging: Your ViT but Faster.
Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, Judy Hoffman.

Please refer to the CC-BY-NC 4.0 license. For contributing, see contributing and the code of conduct.

@inproceedings{bolya2022tome,
  title={Token Merging: Your {ViT} but Faster},
  author={Bolya, Daniel and Fu, Cheng-Yang and Dai, Xiaoliang and Zhang, Peizhao and Feichtenhofer, Christoph and Hoffman, Judy},
  booktitle={International Conference on Learning Representations},
  year={2023}
}
