chengtao-lv / ptq4sam

[CVPR 2024] PTQ4SAM: Post-Training Quantization for Segment Anything

PTQ4SAM: Post-Training Quantization for Segment Anything (CVPR 2024)

Chengtao Lv*, Hong Chen*, Jinyang Guo📧, Yifu Ding, Xianglong Liu

(* denotes equal contribution, 📧 denotes corresponding author.)

Overview

The Segment Anything Model (SAM) has achieved impressive performance in many computer vision tasks. However, as a large-scale model, its immense memory and computation costs hinder practical deployment. In this paper, we propose a post-training quantization (PTQ) framework for the Segment Anything Model, namely PTQ4SAM. First, we investigate the inherent bottleneck of SAM quantization, which we attribute to the bimodal distribution of post-Key-Linear activations. We analyze its characteristics from both per-tensor and per-channel perspectives, and propose a Bimodal Integration strategy, which uses a mathematically equivalent sign operation to transform the bimodal distribution offline into a normal distribution that is relatively easy to quantize. Second, SAM encompasses diverse attention mechanisms (i.e., self-attention and two-way cross-attention), resulting in substantial variations in the post-Softmax distributions. Therefore, we introduce an Adaptive Granularity Quantization for Softmax that searches for the optimal power-of-two base, which is hardware-friendly.
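The sign operation mentioned above can be illustrated with a small sketch. It is not the repository's implementation: the helper names (`linear`, `flip_channels`), the toy dimensions, and the use of the per-channel mean sign as the flip criterion are all assumptions made for illustration. It only shows why flipping the same output channels of the query and key projections leaves the attention score unchanged while moving every key channel to the same side of zero.

```python
import random

def linear(x, weight, bias):
    """y = W x + b for one token; weight holds one row per output channel."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def flip_channels(weight, bias, signs):
    """Multiply each output channel (row of W, entry of b) by +1 or -1."""
    new_w = [[s * w for w in row] for row, s in zip(weight, signs)]
    new_b = [s * b for b, s in zip(bias, signs)]
    return new_w, new_b

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

random.seed(0)
d_in, d_out = 4, 6  # toy sizes, not SAM's real dimensions
wq = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
bq = [random.gauss(0, 1) for _ in range(d_out)]
wk = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Key biases with mixed signs mimic channels sitting on two separate peaks.
bk = [5.0, -5.0, 5.0, -5.0, -5.0, 5.0]
tokens = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(8)]

# Assumed criterion: flip every key channel whose calibration mean is negative.
keys = [linear(t, wk, bk) for t in tokens]
means = [sum(k[c] for k in keys) / len(keys) for c in range(d_out)]
signs = [-1.0 if m < 0 else 1.0 for m in means]

# Apply the same signs to the query and key projections, offline.
wq2, bq2 = flip_channels(wq, bq, signs)
wk2, bk2 = flip_channels(wk, bk, signs)

x_q, x_k = tokens[0], tokens[1]
score_before = dot(linear(x_q, wq, bq), linear(x_k, wk, bk))
score_after = dot(linear(x_q, wq2, bq2), linear(x_k, wk2, bk2))

keys2 = [linear(t, wk2, bk2) for t in tokens]
means_after = [sum(k[c] for k in keys2) / len(keys2) for c in range(d_out)]
print("score unchanged:", abs(score_before - score_after) < 1e-9)
print("all key channel means non-negative:", all(m >= 0 for m in means_after))
```

Because each sign is squared in the query-key product, the flip cancels exactly, so it can be folded into the weights before quantization without any runtime cost.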

Create Environment

🍺🍺🍺 You can refer to environment.sh in the root directory or install step by step.

  1. Install PyTorch
conda create -n ptq4sam python=3.7 -y
conda activate ptq4sam
pip install torch torchvision
  2. Install MMCV
pip install -U openmim
mim install "mmcv-full<2.0.0"
  3. Install other requirements
pip install -r requirements.txt
  4. Compile CUDA operators
cd projects/instance_segment_anything/ops
python setup.py build install
cd ../../..
  5. Install mmdet
cd mmdetection/
python3 setup.py build develop
cd ..
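After the steps above, a quick check that the main packages are visible in the active environment can save a failed run later. The sketch below is not part of the repository; the import names (`torch`, `torchvision`, `mmcv`, `mmdet`) are assumed from the packages installed above, and `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Import names assumed from the install steps (mmcv-full is imported as mmcv).
REQUIRED = ["torch", "torchvision", "mmcv", "mmdet"]

def missing_packages(names):
    """Return the names that cannot be located in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing packages:", missing if missing else "none")
```

The check only locates the packages without importing them, so it does not verify that the compiled CUDA operators load correctly.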

Prepare Dataset and Models

Download the official COCO dataset, put the files into the corresponding folders of datasets/, and organize them as follows:

├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
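The layout above can be verified before launching an experiment. This is a minimal sketch rather than a script shipped with the repository: `missing_coco_dirs` is a hypothetical helper, and the dataset root is passed in as a parameter because it depends on where you placed the files.

```python
from pathlib import Path

# Sub-folders expected under <data_root>/coco, taken from the tree above.
EXPECTED = ["annotations", "train2017", "val2017", "test2017"]

def missing_coco_dirs(data_root):
    """Return the expected COCO sub-folders that are absent under data_root/coco."""
    coco = Path(data_root) / "coco"
    return [name for name in EXPECTED if not (coco / name).is_dir()]

if __name__ == "__main__":
    missing = missing_coco_dirs("data")
    print("missing folders:", missing if missing else "none")
```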

Download the pretrained weights (SAM and detectors) and put them into the corresponding folders of ckpt/.

Usage

To quantize a model, specify the model configuration and the quantization configuration. For example, to perform W6A6 quantization for SAM-L with a YOLOX detector, use the following command:

python ptq4sam/solver/test_quant.py \
--config ./projects/configs/yolox/yolo_l-sam-vit-l.py \
--q_config exp/config66.yaml --quant-encoder
  • yolo_l-sam-vit-l.py: configuration file for the SAM-L (ViT-L) model with the YOLOX detector.
  • config66.yaml: configuration file for W6A6 quantization (6-bit weights, 6-bit activations).
  • --quant-encoder: also quantize the encoder of SAM.
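To make the W6A6 setting concrete, the sketch below shows what mapping values to 6 bits means with a plain min-max uniform quantizer. It is an illustration under stated assumptions, not the quantizer used by PTQ4SAM: `fake_quantize` is a hypothetical helper, and the repository's actual calibration is configured through the YAML file.

```python
def fake_quantize(values, n_bits=6):
    """Asymmetric uniform min-max quantization followed by dequantization."""
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1              # 63 steps, i.e. 64 codes for 6 bits
    scale = (hi - lo) / levels if hi > lo else 1.0
    out = []
    for v in values:
        q = round((v - lo) / scale)       # integer code in [0, levels]
        q = max(0, min(levels, q))        # clamp against rounding overshoot
        out.append(q * scale + lo)        # map the code back to a real value
    return out

weights = [-1.0, -0.3, 0.0, 0.42, 1.0]
dequantized = fake_quantize(weights, n_bits=6)
max_error = max(abs(a - b) for a, b in zip(weights, dequantized))
print("max rounding error:", max_error)
```

With this scheme the rounding error is bounded by half a quantization step, which is why distributions with two distant peaks or long tails are harder to represent at low bit widths.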

We recommend using a GPU with more than 40 GB of memory for experiments. To visualize the prediction results, specify --show-dir. Bimodal distributions mainly occur in the mask decoder of SAM-B and SAM-L.

Reference

If you find this repo useful for your research, please consider citing the paper.

@inproceedings{lv2024ptq4sam,
  title={PTQ4SAM: Post-Training Quantization for Segment Anything},
  author={Lv, Chengtao and Chen, Hong and Guo, Jinyang and Ding, Yifu and Liu, Xianglong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={15941--15951},
  year={2024}
}

Acknowledgments

The code of PTQ4SAM is based on Prompt-Segment-Anything and QDrop. We thank the authors for open-sourcing their code.
