Comments (8)
I think this would complement the FPGA implementation very well, allowing us to remove some hardware blocks.
from nncf.
It probably makes sense to consider a similar insight from the HAWQ-v3 paper:
"We follow a simpler approach where we first keep the Conv and BN layer unfolded, and then we perform standard QAT by backpropagating the gradients as usual. After several epochs, we freeze the running statistics in the BN layer and only then perform CONV+BN folding and quantize to the target bit-width"
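To make the "freeze the running statistics" step concrete, here is a minimal sketch of how it could be done in PyTorch. The helper name `freeze_bn_stats` and the approach (switching BN layers to eval mode) are my assumptions for illustration, not code from HAWQ-v3 or NNCF:

```python
import torch.nn as nn

def freeze_bn_stats(model: nn.Module) -> nn.Module:
    """Hypothetical helper illustrating the quoted HAWQ-v3 step.

    Putting BN layers into eval mode stops running_mean/running_var
    from updating, while gamma/beta still receive gradients during QAT.
    """
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()
    return model
```

Note that a later `model.train()` call flips the BN layers back to training mode, so in a real training loop the freeze would have to be re-applied after each `train()` call.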
@asenina please post current results.
Model | Pytorch measured | Pytorch reference FP32 | Openvino AC measured |
---|---|---|---|
unet_camvid_int8 (with Conv+BN folding) | 71.57 % | 71.95 % | 71.67 % |
@asenina Thank you for your work. I have several questions.

- What do "pytorch measured" and "Openvino AC measured" mean?
- It seems they got similar results on the segmentation task. Is there an experiment on a detection task? If possible, I would be glad to test more tasks using NNCF. Can the code (QAT with Conv+BN folding) be made accessible to me?
- "pytorch measured" - accuracy was measured on the final compressed model in NNCF (use the options `--mode test` and `--resume path-to-result-checkpoint`).
- "Openvino AC measured" - the final compressed model was converted to ONNX and then to IR format, and accuracy was measured on the IR model using the OpenVINO runtime and Accuracy Checker.
- I have not done experiments on a detection task.
Please track this in #446. I will be glad if you test it on other tasks using NNCF and share the results (if possible). Thanks.
@asenina Thanks for your reply! I will look into it soon.
Folding was implemented according to the following formulas:
Initial implementation
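The standard per-channel Conv+BN folding (textbook form; the exact code in the referenced implementation may differ) replaces the Conv weight and bias with `W_fold = W * gamma / sqrt(var + eps)` and `b_fold = (b - mean) * gamma / sqrt(var + eps) + beta`. A minimal scalar sketch:

```python
import math

def fold_conv_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters into the preceding conv's weight/bias.

    Shown per output channel with scalars for clarity; the tensor case
    applies the same per-channel scale to each output filter.
    """
    scale = gamma / math.sqrt(var + eps)
    w_fold = w * scale
    b_fold = (b - mean) * scale + beta
    return w_fold, b_fold

# Sanity check: conv followed by BN equals the folded conv.
w, b = 2.0, 0.5
gamma, beta, mean, var, eps = 1.5, -0.3, 0.25, 4.0, 1e-5
x = 3.0
y_ref = gamma * ((w * x + b) - mean) / math.sqrt(var + eps) + beta
w_f, b_f = fold_conv_bn(w, b, gamma, beta, mean, var, eps)
assert abs((w_f * x + b_f) - y_ref) < 1e-9
```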
Results:
Model name | Acc (without folding) | Acc (with folding) |
---|---|---|
squeezenet1_1_imagenet_int8 | 58.04 % | 58.14 % |
unet_camvid_int8 | 71.66 % | 72.03 % |
icnet_camvid_int8 | 67.78 % | 67.45 % |
mobilenet_v2_imagenet_int8 | 71.31 % | 71.67 % |
mobilenet_v2_imagenet_int4_int8 | 71.03 % | 70.98 % |
BatchNorm folding gives only a minor accuracy improvement, and only for int8 models. It will not be merged into NNCF.