Comments (8)
I think this would complement the FPGA implementation very well, allowing us to remove some hardware blocks.
from nncf.
It probably makes sense to consider a similar insight from the HAWQ-v3 paper:
"We follow a simpler approach where we first keep the Conv and BN layer unfolded, and then we perform standard QAT by backpropagating the gradients as usual. After several epochs, we freeze the running statistics in the BN layer and only then perform CONV+BN folding and quantize to the target bit-width"
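To make the "freeze the running statistics" step concrete, here is a minimal sketch of how it could be done in PyTorch. The helper name `freeze_bn_stats` and the approach (switching BN layers to eval mode) are my assumptions for illustration, not code from HAWQ-v3 or NNCF:

```python
import torch.nn as nn

def freeze_bn_stats(model: nn.Module) -> nn.Module:
    """Hypothetical helper illustrating the quoted HAWQ-v3 step.

    Putting BN layers into eval mode stops running_mean/running_var
    from updating, while gamma/beta still receive gradients during QAT.
    """
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()
    return model
```

Note that a later `model.train()` call flips the BN layers back to training mode, so in a real training loop the freeze would have to be re-applied after each `train()` call.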
@asenina please post current results.
Model | Pytorch measured | Pytorch reference FP32 | Openvino AC measured |
---|---|---|---|
unet_camvid_int8 (with Conv+BN folding) | 71.57 % | 71.95 % | 71.67 % |
@asenina Thank you for your work. I have several questions.

- What do "pytorch measured" and "Openvino AC measured" mean?
- It seems they got similar results on the segmentation task. Is there an experiment on a detection task? If possible, I would be glad to test more tasks using NNCF. Can the code (QAT with Conv+BN folding) be made accessible to me?
- "pytorch measured" - accuracy was measured on the final compressed model in NNCF (use the options `--mode test` and `--resume path-to-result-checkpoint`).
- "Openvino AC measured" - the final compressed model was converted to ONNX and then to IR format, and accuracy was measured on the IR model using the OpenVINO runtime and Accuracy Checker.
- I have not done experiments on a detection task.
Please track this in #446. I will be glad if you test it on other tasks using NNCF and share the results (if possible). Thanks.
@asenina Thanks for your reply! I will look into it soon.
Folding was implemented according to the following formulas:
Initial implementation
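The standard per-channel Conv+BN folding (textbook form; the exact code in the referenced implementation may differ) replaces the Conv weight and bias with `W_fold = W * gamma / sqrt(var + eps)` and `b_fold = (b - mean) * gamma / sqrt(var + eps) + beta`. A minimal scalar sketch:

```python
import math

def fold_conv_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters into the preceding conv's weight/bias.

    Shown per output channel with scalars for clarity; the tensor case
    applies the same per-channel scale to each output filter.
    """
    scale = gamma / math.sqrt(var + eps)
    w_fold = w * scale
    b_fold = (b - mean) * scale + beta
    return w_fold, b_fold

# Sanity check: conv followed by BN equals the folded conv.
w, b = 2.0, 0.5
gamma, beta, mean, var, eps = 1.5, -0.3, 0.25, 4.0, 1e-5
x = 3.0
y_ref = gamma * ((w * x + b) - mean) / math.sqrt(var + eps) + beta
w_f, b_f = fold_conv_bn(w, b, gamma, beta, mean, var, eps)
assert abs((w_f * x + b_f) - y_ref) < 1e-9
```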
Results:
Model name | Acc (without folding) | Acc (with folding) |
---|---|---|
squeezenet1_1_imagenet_int8 | 58.04 % | 58.14 % |
unet_camvid_int8 | 71.66 % | 72.03 % |
icnet_camvid_int8 | 67.78 % | 67.45 % |
mobilenet_v2_imagenet_int8 | 71.31 % | 71.67 % |
mobilenet_v2_imagenet_int4_int8 | 71.03 % | 70.98 % |
BatchNorm folding gives only a minor accuracy improvement, and only for int8 models. It will not be merged into NNCF.