
[CVPR 2020] APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

Home Page: https://hanlab.mit.edu

License: Other

compression joint-optimization nas quantization


APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

@inproceedings{Wang2020APQ,
  title={APQ: Joint Search for Network Architecture, Pruning and Quantization Policy},
  author={Tianzhe Wang and Kuan Wang and Han Cai and Ji Lin and Zhijian Liu and Song Han},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Overview

We release the PyTorch code for APQ. [Paper | Video | Competition]:

- Jointly search for the optimal model
- Save orders of magnitude in search cost
- Better performance than sequential design

How to Use

Prerequisites

- PyTorch >= 1.0
- Python >= 3.6
- progress >= 1.5
- An NVIDIA GPU, if you want to obtain new models
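A quick environment sanity check can be done as follows (a minimal sketch; the `meets_min` helper is our own illustration, not part of the APQ codebase):

```python
import sys

def meets_min(version: str, minimum: str) -> bool:
    """Compare dotted version strings numerically, e.g. '1.4.0' >= '1.0'."""
    parse = lambda v: tuple(int(p) for p in v.split(".")[:2])
    return parse(version) >= parse(minimum)

# Python itself must be >= 3.6
assert sys.version_info >= (3, 6), "APQ requires Python >= 3.6"

# PyTorch is checked only if it is installed
try:
    import torch
    assert meets_min(torch.__version__, "1.0"), "APQ requires PyTorch >= 1.0"
except ImportError:
    print("PyTorch not found -- install torch >= 1.0 before running APQ")
```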

Dataset and Model Preparation

Codebase Structure

apq
- dataset (ImageNet data path)
- elastic_nn (super-network builder, w/ or w/o quantization)
    - modules (define the layers, w/ or w/o quantization)
    - networks (define the networks, w/ or w/o quantization)
    - utils.py (utility functions for the elastic_nn folder)
- models (quantization-aware predictor and once-for-all network checkpoint path)
- imagenet_codebase (training codebase for ImageNet)
- lut (latency lookup table path)
- methods (methods to find the mixed-precision network)
    - evolution (evolution search code)
- utils (utility functions, including the converter)
    - accuracy_predictor.py (construction of the accuracy predictor)
    - latency_predictor.py (construction of the latency predictor)
    - converter.py (encodes a subnetwork into a one-hot vector)
    - quant_aware.py (code for quantization-aware training)
- main.py
- Readme.md
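To illustrate what converter.py's job looks like, here is a hypothetical one-hot encoder for a per-layer architecture configuration. The field names and choice lists below are made up for the example; they are not APQ's actual encoding:

```python
def encode_subnet(config, choices):
    """Concatenate a one-hot vector per (layer, field) pair.

    config:  list of per-layer dicts, e.g. {"kernel": 3, "expand": 4, "bits": 8}
    choices: dict of field name -> ordered list of allowed values
    """
    vec = []
    for layer in config:
        for field, options in choices.items():
            one_hot = [0] * len(options)
            one_hot[options.index(layer[field])] = 1
            vec.extend(one_hot)
    return vec

# Hypothetical search-space choices for the example
choices = {"kernel": [3, 5, 7], "expand": [3, 4, 6], "bits": [4, 8]}
subnet = [{"kernel": 5, "expand": 6, "bits": 4},
          {"kernel": 3, "expand": 3, "bits": 8}]
print(encode_subnet(subnet, choices))
# Each layer contributes 3 + 3 + 2 = 8 entries, with exactly one 1 per field
```

A fixed-length vector like this is what lets an accuracy predictor score candidate subnetworks without training them.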

Testing

For instance, suppose you want to test the model under the exps/test folder.

Run the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py \
    --exp_dir=exps/test

This reports the exact latency/energy on the BitFusion platform, along with the ImageNet Top-1 accuracy.

Example

Evolution search

For instance, suppose you want to search for a model under a 12.80 ms latency constraint.

Run the following command:

CUDA_VISIBLE_DEVICES=0 python search.py \
    --mode=evolution \
    --acc_predictor_dir=models \
    --exp_name=test \
    --constraint=12.80 \
    --type=latency

This produces a candidate that satisfies the resource constraint (latency or energy), stored in the exps/test folder.
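The evolution loop behind search.py can be sketched roughly as follows. This is a toy version: `sample`, `mutate`, `acc_of`, and `lat_of` are stand-ins we define for the example, whereas the real code uses the quantization-aware accuracy predictor and the latency lookup table shipped in models/ and lut/:

```python
import random

def evolution_search(sample, mutate, acc_of, lat_of, constraint,
                     population=64, iters=500, parent_ratio=0.25, seed=0):
    """Constraint-aware evolutionary search: mutate top parents, keep feasible children."""
    rng = random.Random(seed)
    pop = []
    while len(pop) < population:          # seed with feasible random candidates
        cand = sample(rng)
        if lat_of(cand) <= constraint:
            pop.append(cand)
    for _ in range(iters):
        parents = sorted(pop, key=acc_of, reverse=True)[:int(population * parent_ratio)]
        child = mutate(rng.choice(parents), rng)
        if lat_of(child) <= constraint:   # reject candidates over the budget
            pop.append(child)
            pop.pop(0)                    # age out the oldest member
    return max(pop, key=acc_of)

# Toy search space: 10 layers, each with a width choice in 0..3
sample = lambda rng: [rng.randrange(4) for _ in range(10)]
def mutate(cand, rng):
    child = list(cand)
    child[rng.randrange(len(child))] = rng.randrange(4)
    return child
acc_of = lambda c: sum(c)          # stand-in accuracy predictor
lat_of = lambda c: 2.0 * sum(c)    # stand-in latency model (ms)

best = evolution_search(sample, mutate, acc_of, lat_of, constraint=40.0)
print(best, lat_of(best))
```

Because fitness is evaluated by predictors rather than by training, each candidate costs microseconds to score, which is where the orders-of-magnitude search-cost saving comes from.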

Quantization-aware finetuning on ImageNet

For instance, suppose you want to run quantization-aware finetuning for the model under the exps/test folder.

Run the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python quant_aware.py \
    --exp_name=test

You will get a mixed-precision model that satisfies the resource constraint (latency or energy) with competitive performance.
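The core trick in quantization-aware training is fake quantization: in the forward pass, weights are rounded onto a low-bit grid, while gradients flow through as if no rounding happened (the straight-through estimator). A minimal, framework-free sketch of the quantize-dequantize step (our own illustration, not APQ's exact scheme):

```python
def fake_quantize(values, bits):
    """Uniform symmetric quantize-dequantize with 2^(bits-1) - 1 positive levels."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / levels or 1.0  # guard all-zero input
    return [round(v / scale) * scale for v in values]

w = [0.52, -1.0, 0.25, 0.0]
print(fake_quantize(w, 8))  # 8-bit: nearly identical to w
print(fake_quantize(w, 2))  # 2-bit: collapses to {-1, 0, 1}
```

Mixed precision means `bits` differs per layer; the search picks each layer's bit-width so that the whole network fits the latency or energy budget.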

Models

We provide the checkpoints for our APQ reported in the paper:

| Latency  | Energy   | BitOps | Top-1 Accuracy | Model    |
| -------- | -------- | ------ | -------------- | -------- |
| 6.11 ms  | 9.14 mJ  | 12.7 G | 72.8%          | download |
| 8.45 ms  | 11.81 mJ | 14.6 G | 73.8%          | download |
| 8.40 ms  | 12.18 mJ | 16.5 G | 74.1%          | download |
| 12.17 ms | 14.14 mJ | 23.6 G | 75.1%          | download |

You can download the models and put them into the exps folder to test their performance. Note that each model was searched under one specific constraint, shown in bold in the paper's table.

Related work on automated model compression and acceleration:

Once for All: Train One Network and Specialize it for Efficient Deployment (ICLR'20, code)

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware (ICLR’19)

AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV’18)

HAQ: Hardware-Aware Automated Quantization (CVPR’19, oral)

Defensive Quantization: When Efficiency Meets Robustness (ICLR'19)


apq's Issues

Dataset Construction For Training The Accuracy Predictor

Hello,

Can you please provide more details about how the data was collected to train the accuracy predictor? To be more specific, how was the sampling done from the once-for-all network? Which optimizer did you use, and for how many epochs did you train the predictor? What accuracy did you achieve?

Thanks in advance,

Search not improving after ~1000 iterations

I did multiple runs of the search, and there is usually no improvement after 800-1000 iterations. Is this expected? Given the large search space, I would expect the search to require far more iterations to converge on an optimal configuration.

Example partial output of search.py:

Iter: 0 Acc: 0.8829606175422668
Iter: 100 Acc: 0.8930274844169617
Iter: 200 Acc: 0.8938609957695007
Iter: 300 Acc: 0.8939877152442932
Iter: 400 Acc: 0.8941074013710022
Iter: 500 Acc: 0.8942490816116333
Iter: 600 Acc: 0.8942490816116333
Iter: 700 Acc: 0.8942490816116333
Iter: 800 Acc: 0.8942490816116333
Iter: 900 Acc: 0.8943868279457092
Iter: 1000 Acc: 0.8944254517555237
Iter: 1100 Acc: 0.8945743441581726
[... accuracy unchanged through every subsequent iteration ...]
Iter: 5000 Acc: 0.8945743441581726
