
[CVPR 2020] APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

Home Page: https://hanlab.mit.edu

License: Other

compression joint-optimization nas quantization


APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

@inproceedings{Wang2020APQ,
  title={APQ: Joint Search for Network Architecture, Pruning and Quantization Policy},
  author={Tianzhe Wang and Kuan Wang and Han Cai and Ji Lin and Zhijian Liu and Song Han},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Overview

We release the PyTorch code for APQ. [Paper | Video | Competition]:

- Jointly search for the optimal model
- Save orders of magnitude in search cost
- Better performance than sequential design

How to Use

Prerequisites

- PyTorch >= 1.0
- Python >= 3.6
- progress >= 1.5
- An NVIDIA GPU, if you want to obtain new models
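A quick environment sanity check can be done as follows (a minimal sketch; the `meets_min` helper is our own illustration, not part of the APQ codebase):

```python
import sys

def meets_min(version: str, minimum: str) -> bool:
    """Compare dotted version strings numerically, e.g. '1.4.0' >= '1.0'."""
    parse = lambda v: tuple(int(p) for p in v.split(".")[:2])
    return parse(version) >= parse(minimum)

# Python itself must be >= 3.6
assert sys.version_info >= (3, 6), "APQ requires Python >= 3.6"

# PyTorch is checked only if it is installed
try:
    import torch
    assert meets_min(torch.__version__, "1.0"), "APQ requires PyTorch >= 1.0"
except ImportError:
    print("PyTorch not found -- install torch >= 1.0 before running APQ")
```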

Dataset and Model Preparation

Codebase Structure

apq
- dataset (ImageNet data path)
- elastic_nn (super-network builder, w/ or w/o quantization)
    - modules (define the layers, w/ or w/o quantization)
    - networks (define the networks, w/ or w/o quantization)
    - utils.py (utility functions for the elastic_nn folder)
- models (quantization-aware predictor and once-for-all network checkpoint path)
- imagenet_codebase (training codebase for ImageNet)
- lut (latency lookup table path)
- methods (methods to find the mixed-precision network)
    - evolution (evolution search code)
- utils (utility functions, including the converter)
    - accuracy_predictor.py (construction of the accuracy predictor)
    - latency_predictor.py (construction of the latency predictor)
    - converter.py (encodes a subnetwork into a one-hot vector)
    - quant_aware.py (code for quantization-aware training)
- main.py
- Readme.md
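To illustrate what converter.py's job looks like, here is a hypothetical one-hot encoder for a per-layer architecture configuration. The field names and choice lists below are made up for the example; they are not APQ's actual encoding:

```python
def encode_subnet(config, choices):
    """Concatenate a one-hot vector per (layer, field) pair.

    config:  list of per-layer dicts, e.g. {"kernel": 3, "expand": 4, "bits": 8}
    choices: dict of field name -> ordered list of allowed values
    """
    vec = []
    for layer in config:
        for field, options in choices.items():
            one_hot = [0] * len(options)
            one_hot[options.index(layer[field])] = 1
            vec.extend(one_hot)
    return vec

# Hypothetical search-space choices for the example
choices = {"kernel": [3, 5, 7], "expand": [3, 4, 6], "bits": [4, 8]}
subnet = [{"kernel": 5, "expand": 6, "bits": 4},
          {"kernel": 3, "expand": 3, "bits": 8}]
print(encode_subnet(subnet, choices))
# Each layer contributes 3 + 3 + 2 = 8 entries, with exactly one 1 per field
```

A fixed-length vector like this is what lets an accuracy predictor score candidate subnetworks without training them.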

Testing

For instance, suppose you want to test the model under the exps/test folder.

Run the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py \
    --exp_dir=exps/test

This reports the exact latency/energy on the BitFusion platform, along with the ImageNet Top-1 accuracy.

Example

Evolution search

For instance, suppose you want to search for a model under a 12.80 ms latency constraint.

Run the following command:

CUDA_VISIBLE_DEVICES=0 python search.py \
    --mode=evolution \
    --acc_predictor_dir=models \
    --exp_name=test \
    --constraint=12.80 \
    --type=latency

This produces a candidate that satisfies the resource constraint (latency or energy), stored in the exps/test folder.
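The evolution loop behind search.py can be sketched roughly as follows. This is a toy version: `sample`, `mutate`, `acc_of`, and `lat_of` are stand-ins we define for the example, whereas the real code uses the quantization-aware accuracy predictor and the latency lookup table shipped in models/ and lut/:

```python
import random

def evolution_search(sample, mutate, acc_of, lat_of, constraint,
                     population=64, iters=500, parent_ratio=0.25, seed=0):
    """Constraint-aware evolutionary search: mutate top parents, keep feasible children."""
    rng = random.Random(seed)
    pop = []
    while len(pop) < population:          # seed with feasible random candidates
        cand = sample(rng)
        if lat_of(cand) <= constraint:
            pop.append(cand)
    for _ in range(iters):
        parents = sorted(pop, key=acc_of, reverse=True)[:int(population * parent_ratio)]
        child = mutate(rng.choice(parents), rng)
        if lat_of(child) <= constraint:   # reject candidates over the budget
            pop.append(child)
            pop.pop(0)                    # age out the oldest member
    return max(pop, key=acc_of)

# Toy search space: 10 layers, each with a width choice in 0..3
sample = lambda rng: [rng.randrange(4) for _ in range(10)]
def mutate(cand, rng):
    child = list(cand)
    child[rng.randrange(len(child))] = rng.randrange(4)
    return child
acc_of = lambda c: sum(c)          # stand-in accuracy predictor
lat_of = lambda c: 2.0 * sum(c)    # stand-in latency model (ms)

best = evolution_search(sample, mutate, acc_of, lat_of, constraint=40.0)
print(best, lat_of(best))
```

Because fitness is evaluated by predictors rather than by training, each candidate costs microseconds to score, which is where the orders-of-magnitude search-cost saving comes from.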

Quantization-aware finetuning on ImageNet

For instance, suppose you want to run quantization-aware finetuning for the model under the exps/test folder.

Run the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python quant_aware.py \
    --exp_name=test

You will get a mixed-precision model that satisfies the resource constraint (latency or energy) with competitive performance.
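The core trick in quantization-aware training is fake quantization: in the forward pass, weights are rounded onto a low-bit grid, while gradients flow through as if no rounding happened (the straight-through estimator). A minimal, framework-free sketch of the quantize-dequantize step (our own illustration, not APQ's exact scheme):

```python
def fake_quantize(values, bits):
    """Uniform symmetric quantize-dequantize with 2^(bits-1) - 1 positive levels."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / levels or 1.0  # guard all-zero input
    return [round(v / scale) * scale for v in values]

w = [0.52, -1.0, 0.25, 0.0]
print(fake_quantize(w, 8))  # 8-bit: nearly identical to w
print(fake_quantize(w, 2))  # 2-bit: collapses to {-1, 0, 1}
```

Mixed precision means `bits` differs per layer; the search picks each layer's bit-width so that the whole network fits the latency or energy budget.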

Models

We provide the checkpoints for our APQ reported in the paper:

| Latency  | Energy   | BitOps | Top-1 Accuracy | Model    |
| -------- | -------- | ------ | -------------- | -------- |
| 6.11 ms  | 9.14 mJ  | 12.7 G | 72.8%          | download |
| 8.45 ms  | 11.81 mJ | 14.6 G | 73.8%          | download |
| 8.40 ms  | 12.18 mJ | 16.5 G | 74.1%          | download |
| 12.17 ms | 14.14 mJ | 23.6 G | 75.1%          | download |

You can download the models and put them into the exps folder to test their performance. Note that each model was searched under one specific constraint, shown in bold in the paper's table.

Related work on automated model compression and acceleration:

Once for All: Train One Network and Specialize it for Efficient Deployment (ICLR'20, code)

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware (ICLR’19)

AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV’18)

HAQ: Hardware-Aware Automated Quantization (CVPR’19, oral)

Defensive Quantization: When Efficiency Meets Robustness (ICLR'19)


apq's Issues

Dataset Construction For Training The Accuracy Predictor

Hello,

Can you please provide more details about how the data was collected to train the accuracy predictor? To be more specific, how was the sampling done from the once-for-all network? Which optimizer did you use, and for how many epochs did you train the predictor? What accuracy did you achieve?

Thanks in advance,

Search not improving after ~1000 iterations

I did multiple runs of the search, and there is usually no improvement after 800-1000 iterations. Is this expected? Given the large search space, I would expect the search to require far more iterations to converge on an optimal configuration.

Example partial output of search.py:

Iter: 0 Acc: 0.8829606175422668
Iter: 100 Acc: 0.8930274844169617
Iter: 200 Acc: 0.8938609957695007
Iter: 300 Acc: 0.8939877152442932
Iter: 400 Acc: 0.8941074013710022
Iter: 500 Acc: 0.8942490816116333
Iter: 600 Acc: 0.8942490816116333
Iter: 700 Acc: 0.8942490816116333
Iter: 800 Acc: 0.8942490816116333
Iter: 900 Acc: 0.8943868279457092
Iter: 1000 Acc: 0.8944254517555237
Iter: 1100 Acc: 0.8945743441581726
[... accuracy unchanged through every subsequent iteration ...]
Iter: 5000 Acc: 0.8945743441581726
