
Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning

[ICCV 2023 oral] This is the official repository for our paper Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning, by Haoyu He, Jianfei Cai, Jing Zhang, Dacheng Tao, and Bohan Zhuang.


🚀 News

[2023-09-12]: Code released.

[2023-08-12]: Accepted by ICCV 2023 for oral presentation!


Introduction:

Instead of presenting yet another architecture with learnable parameters for parameter-efficient fine-tuning (PEFT), our work emphasizes the importance of placing PEFT modules at optimal positions tailored to diverse tasks!

(Figure: overview of the two-stage SPT framework)

Our SPT consists of two stages.

Stage 1 is a quick, one-shot parameter sensitivity estimation (taking only a few seconds) that finds where to introduce trainable parameters. The figures below show sensitivity patterns in various pre-trained ViTs for the top 0.4% most sensitive parameters. We find that the proportions exhibit task-specific, varying patterns across network depth but task-agnostic, similar patterns across operations.

(Figures: proportions of sensitive parameters across network depth and across operations)
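
To make the idea concrete, here is a minimal, hedged sketch of a one-shot sensitivity pass: it accumulates squared gradients over a few batches as a first-order (Taylor-style) proxy for how much the loss would drop if a parameter were tuned. The function names (estimate_sensitivity, top_fraction_mask) and the exact score are illustrative assumptions; see the paper and train_spt.py for the repository's actual criterion.

import torch

def estimate_sensitivity(model, loader, loss_fn, num_batches=8, device="cuda"):
    # Accumulate squared gradients per parameter over a few batches as a
    # first-order proxy for parameter sensitivity (illustrative, not SPT's exact score).
    model.to(device).train()
    scores = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for i, (x, y) in enumerate(loader):
        if i == num_batches:
            break
        model.zero_grad()
        loss_fn(model(x.to(device)), y.to(device)).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                scores[n] += p.grad.detach() ** 2
    return scores

def top_fraction_mask(scores, fraction=0.004):
    # Keep the globally top-`fraction` most sensitive weights,
    # e.g. fraction=0.004 for the top 0.4% discussed above.
    flat = torch.cat([s.flatten() for s in scores.values()])
    k = max(1, int(fraction * flat.numel()))
    threshold = torch.topk(flat, k).values.min()
    return {n: s >= threshold for n, s in scores.items()}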

Stage 2 is standard PEFT: the majority of parameters are kept frozen and only the trainable ones are fine-tuned. SPT introduces trainable parameters at the sensitive positions at two granularities, unstructured neurons and structured PEFT modules (e.g., LoRA or Adapter), to achieve strong performance!

(Figure: performance comparison)
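
For the structured granularity, a minimal sketch of a LoRA-style wrapper around a frozen linear layer is shown below; the class and hyper-parameter names (rank, alpha) are illustrative assumptions, not the repository's exact API.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Frozen pre-trained linear layer plus a trainable low-rank update,
    # matching the Stage 2 recipe: freeze the backbone, tune only the PEFT module.
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        # Zero-initialised lora_b makes the wrapper a no-op at the start of training.
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling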

If you find this repository or our paper useful, please consider citing us and starring the repository!

@inproceedings{he2023sensitivity,
  title={Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning},
  author={He, Haoyu and Cai, Jianfei and Zhang, Jing and Tao, Dacheng and Zhuang, Bohan},
  booktitle={ICCV},
  year={2023}
}

Getting started with SPT:

Install dependencies:

We have tested our code with PyTorch 1.8.0 and 1.10.0. Please install the remaining dependencies by running the following in the repository root:

pip install -r requirements.txt

Data preparation:

We provide training and inference code for our main benchmark VTAB-1k.

cd data/vtab-source
python get_vtab1k.py

Note: you may have to download SUN397 manually; please refer to VTAB-1k.

Download pre-trained models:

Please download the backbones with the following commands:

cd checkpoints

# Supervised pre-trained ViT-B/16
wget https://storage.googleapis.com/vit_models/imagenet21k/ViT-B_16.npz

# MAE pre-trained ViT-B/16
wget https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base.pth

# MoCo V3 pre-trained ViT-B/16
wget https://dl.fbaipublicfiles.com/moco-v3/vit-b-300ep/linear-vit-b-300ep.pth.tar

Get parameter sensitivity:

We provide the following scripts (the sensitivity for the supervised pre-trained ViT-B/16 is already included in sensitivity_spt_supervised_lora_a10 and sensitivity_spt_supervised_adapter_a10):

# SPT-ADAPTER and SPT-LORA with supervised pre-trained ViT-B/16
bash configs/vtab_supervised_spt_lora_sensitivity.sh
bash configs/vtab_supervised_spt_adapter_sensitivity.sh

# SPT-ADAPTER and SPT-LORA with MAE pre-trained ViT-B/16
bash configs/vtab_mae_spt_lora_sensitivity.sh
bash configs/vtab_mae_spt_adapter_sensitivity.sh

# SPT-ADAPTER and SPT-LORA with MoCo V3 pre-trained ViT-B/16
bash configs/vtab_moco_spt_lora_sensitivity.sh
bash configs/vtab_moco_spt_adapter_sensitivity.sh

PEFT with SPT:

We have provided the following training code:

# SPT-ADAPTER and SPT-LORA with supervised pre-trained ViT-B/16
bash configs/vtab_supervised_spt_lora.sh
bash configs/vtab_supervised_spt_adapter.sh

# SPT-ADAPTER and SPT-LORA with MAE pre-trained ViT-B/16
bash configs/vtab_mae_spt_lora.sh
bash configs/vtab_mae_spt_adapter.sh

# SPT-ADAPTER and SPT-LORA with MoCo V3 pre-trained ViT-B/16 
bash configs/vtab_moco_spt_lora.sh
bash configs/vtab_moco_spt_adapter.sh

Note: we sweep different trainable-parameter budgets (from 0.2M to 1.0M) to seek potentially better results.
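
To verify that a run stays within a given budget, a one-liner like the following counts trainable parameters in millions, the unit used above (a hedged sketch; `model` stands for whatever fine-tuning model the scripts build):

def count_trainable_params_m(model) -> float:
    # Trainable parameters in millions, e.g. expect a value in [0.2, 1.0].
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6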

TODO:

- [x] Release code for SPT on ViTs.
- [ ] Release code for FGVC benchmark training (ETA October).
- [ ] Release code for SPT on Swin (ETA October).
- [ ] Release code for SPT on ConvNext (ETA October).
- [ ] Integrate to [PEFT](https://github.com/huggingface/peft) package.

Acknowledgements:

Our code is modified from NOAH, CoOp, AutoFormer, timm, and mmcv. We thank the authors for their open-source code.


Issues:

ModuleNotFoundError: No module named 'model.swin_transformer'

Traceback (most recent call last):
  File "train_spt.py", line 18, in <module>
    from model.vision_transformer_timm import VisionTransformerSepQKV
  File "/home/lyc/TNTprojectz/KE/SPT/model/__init__.py", line 2, in <module>
    from .swin_transformer import *
ModuleNotFoundError: No module named 'model.swin_transformer'
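
A plausible workaround until the Swin code is released (it is still on the TODO list above) is to disable the unreleased import in model/__init__.py; this is an assumption based on the traceback, not an official fix:

# model/__init__.py, line 2 per the traceback above:
# from .swin_transformer import *   # swin_transformer.py is not yet released; comment this out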


Question about the reproducing result on vtab-1k using ViT-B/16 pre-trained on ImageNet-21k.

Hi, great work! We are unable to reproduce the paper's results for this setting, although we can reproduce the results for models pre-trained on other datasets. Can you give us some suggestions for the experiments? Below are our results under a 0.4M trainable-parameter budget. Looking forward to hearing from you.

Natural: 70.82, 92.42, 71.54, 99.28, 87.22, 55.20, 91.22
Specialized: 85.57, 95.96, 85.60, 74.31
Structured: 81.83, 66.81, 49.36, 78.76, 79.02, 49.38, 27.73, 38.11

Questions about the sensitivity function

Hello, thanks for providing the code. I have some questions about calculating sensitivity, and I would appreciate it if you could clarify them for me.

  1. What values of alpha and beta should generally be used?
  2. In your experience, how many batches should be processed for a reliable estimate of sensitivity?
  3. What do the values at L181 denote? Are they the numbers of total tunable parameters to select?
  4. Could you explain how the sweep is performed, and why the value of 80 is chosen at L189?
  5. Can you explain the condition at L282 in your code? When I run it, it only returns results for 1.0, 0.8, and 0.6; for smaller values the condition is apparently never satisfied.
  6. At L279, can you explain why the parameter count is calculated this way? Why is the division by 1e6 performed?
  7. At L191 and L196, why is param_num multiplied by 0.02 and 1e6, respectively?
  8. When using LoRA, I assume the additional parameters are merged into the original weights after training is done. Is the code for that available? (A generic merging sketch follows below.)

Thank you in advance.
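
On question 8: merging is generic LoRA behaviour rather than something SPT-specific. Here is a hedged sketch, assuming a wrapper like the LoRALinear example in the introduction; the repository may or may not ship an equivalent utility.

import torch

@torch.no_grad()
def merge_lora(layer):
    # Fold the low-rank update into the frozen weight: W <- W + scaling * (B @ A).
    # `layer` is assumed to follow the LoRALinear sketch above.
    layer.base.weight += layer.scaling * (layer.lora_b @ layer.lora_a)
    return layer.base  # a plain nn.Linear with the update absorbed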
