zysxmu / intraq

PyTorch implementation of our paper accepted by CVPR 2022 -- IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization

Topics: acceleration, compression, zero-shot quantization

intraq's Introduction

CVPR 2022 paper: IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization

Requirements

Python >= 3.7.10

PyTorch == 1.7.1

Reproduce results

Stage 1: Generate data

cd data_generate

Please install all required packages listed in requirements.txt (pip install -r requirements.txt).

"--save_path_head" in run_generate_cifar10.sh/run_generate_cifar100.sh is the path where you want to save your generated data pickle.

For CIFAR-10/100:

bash run_generate_cifar10.sh
bash run_generate_cifar100.sh

For ImageNet

"--save_path_head" in run_generate.sh is the path where you want to save your generated data pickle.

"--model" in run_generate.sh is the pre-trained model you want (also is the quantized model). You can use resnet18/mobilenet_w1/mobilenetv2_w1.

bash run_generate.sh

Stage 2: Train the quantized network

cd ..
  1. Modify "qw" and "qa" in cifar10_resnet20.hocon/cifar100_resnet20.hocon/imagenet.hocon to select desired bit-width.

  2. Modify "dataPath" in cifar10_resnet20.hocon/cifar100_resnet20.hocon/imagenet.hocon to the real dataset path (for construct the test dataloader).

  3. Modify the "Path_to_data_pickle" in main_direct.py (line 122 and line 135) to the data_path and label_path you just generate from Stage1.

  4. Use the below commands to train the quantized network. Please note that the model that generates the data and the quantized model should be the same.
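To sanity-check that step 3's paths point at valid Stage 1 output, the pickles can be loaded with a short helper. A minimal sketch, assuming each pickle holds a single array-like object (the file names are placeholders and the exact layout expected by main_direct.py may differ):

```python
import pickle

def load_synthetic_set(data_path, label_path):
    """Load the synthetic images and labels produced in Stage 1.

    Assumes each pickle holds one array-like object; the actual
    layout expected by main_direct.py may differ.
    """
    with open(data_path, "rb") as f:
        images = pickle.load(f)
    with open(label_path, "rb") as f:
        labels = pickle.load(f)
    assert len(images) == len(labels), "images and labels must align"
    return images, labels

# Hypothetical usage -- substitute the --save_path_head values from Stage 1:
# images, labels = load_synthetic_set("data.pickle", "labels.pickle")
```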

For CIFAR-10/100:

python main_direct.py --model_name resnet20_cifar10 --conf_path cifar10_resnet20.hocon --id=0

python main_direct.py --model_name resnet20_cifar100 --conf_path cifar100_resnet20.hocon --id=0

For ImageNet, choose the model via "--model_name" (resnet18/mobilenet_w1/mobilenetv2_w1):

python main_direct.py --model_name resnet18 --conf_path imagenet.hocon --id=0

Evaluate pre-trained models

The pre-trained models and corresponding logs can be downloaded here

Please make sure the "qw" and "qa" in *.hocon, *.hocon, "--model_name" and "--model_path" are correct.

For CIFAR-10/100:

python test.py --model_name resnet20_cifar10 --model_path path_to_pretrained_model --conf_path cifar10_resnet20.hocon

python test.py --model_name resnet20_cifar100 --model_path path_to_pretrained_model --conf_path cifar100_resnet20.hocon

For ImageNet

python test.py --model_name resnet18/mobilenet_w1/mobilenetv2_w1 --model_path path_to_pretrained_model --conf_path imagenet.hocon

Results of pre-trained models are shown below:

Model        Bit-width  Dataset    Top-1 Acc.
resnet18     W4A4       ImageNet   66.47%
resnet18     W5A5       ImageNet   69.94%
mobilenetv1  W4A4       ImageNet   51.36%
mobilenetv1  W5A5       ImageNet   68.17%
mobilenetv2  W4A4       ImageNet   65.10%
mobilenetv2  W5A5       ImageNet   71.28%
resnet-20    W3A3       CIFAR-10   77.07%
resnet-20    W4A4       CIFAR-10   91.49%
resnet-20    W3A3       CIFAR-100  48.25%
resnet-20    W4A4       CIFAR-100  64.98%
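In the table, WxAy denotes x-bit weights and y-bit activations. As an illustration of what a b-bit setting means (a minimal sketch of symmetric uniform quantization, not the repository's actual quantizer, which may use clipping or asymmetric ranges):

```python
def uniform_quantize(x, bits):
    """Symmetric uniform quantization of a list of floats to `bits` bits
    (illustrative only; IntraQ's quantizer may differ)."""
    qmax = 2 ** (bits - 1) - 1              # e.g. 7 integer levels per side for W4
    scale = max(abs(v) for v in x) / qmax   # map the largest magnitude onto qmax
    return [round(v / scale) * scale for v in x]
```

Lower bit-widths give fewer representable levels, which is why the W3A3 rows show a larger accuracy drop than W4A4.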

intraq's People

Contributors: zysxmu

intraq's Issues

Confusion about step_S settings

Thanks for your work!
You mentioned in the paper that "Both learning rates are decayed by 0.1 every 100 fine-tuning epochs." But I noticed that in the code, you set the LR decay milestones to [20, 40, 60] for CIFAR-10. Looking forward to your explanation of this.
Thank you again.
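For reference, milestones like [20, 40, 60] correspond to a step schedule in the style of torch.optim.lr_scheduler.MultiStepLR. A minimal pure-Python sketch of the resulting learning rate at each epoch (an illustration, not the repository's code):

```python
def lr_at_epoch(base_lr, epoch, milestones=(20, 40, 60), gamma=0.1):
    """Learning rate under a milestone step schedule: the rate is
    multiplied by `gamma` once for every milestone already passed
    (mirrors torch.optim.lr_scheduler.MultiStepLR)."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed
```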

Results in paper with DSG method

Hi, thanks for your work!
I was wondering why the result of DSG in Table 3 of your paper is lower than ZeroQ's, since their paper claims that DSG performs better than ZeroQ. As far as I know, there is no official open-source code for DSG, but you mention in Section 4.1 that you used open-source code. So did you reproduce DSG yourself, or have I misunderstood something? Maybe there is something wrong with the DSG reproduction?

Thanks for your reply

Why is the accuracy of ZeroQ much higher than the result reported in GDFQ?

Hello!
I was wondering why the result of ZeroQ you report in Table 1 of your paper (W4A4, 60.68% on ImageNet) is so high? It is even higher than GDFQ. I cloned ZeroQ's code and got a result similar to the one in GDFQ's paper (~26%). Besides, since ZeroQ's synthetic data has no labels (without IL), how do you perform fine-tuning on 4-bit ResNet-18 (caption of Table 1)?
Looking forward to your reply! Thx!

Questions on marginal distance constraints in the code

Thanks for sharing your code. It was very helpful in understanding your interesting work.
I have some questions about your code.

For generating the data for CIFAR-100, it seems that the marginal distance constraints (loss_cosineDistance and loss_cosineDistance_upper in distill_data.py) are both 0 during the data generation process.
Is this a code error, or am I missing something?

Also, could you explain why you used 1 - CosineSimilarity instead of CosineSimilarity in your code?

Thanks in advance!
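For context on the last question: 1 - CosineSimilarity is the standard cosine distance, which is 0 for aligned vectors and grows as they diverge, so it can be minimized directly as a loss term. A minimal sketch (illustrative, not the repository's implementation):

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity: 0 for parallel vectors, 1 for orthogonal
    vectors, and 2 for exactly opposite ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)
```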

Question about GAN in training process.

Thanks for your work! I noticed that in the network fine-tuning stage, you used a GAN to generate data and used that data for training. But this is not mentioned in the paper. If you have time to explain the reason, I'd be very grateful.

About Generator

Hi, thanks for your work!
It seems that a generator is being trained in trainer_direct.py, while the paper says the data for fine-tuning is obtained by optimizing Gaussian noise rather than by a generator. I am not familiar with zero-shot quantization, and this confuses me. Could you please help me figure it out? Thanks!
