Comments (6)
Greetings, @nsk-lab!
The INT8 ONNX model differs from an FP32 ONNX model by the additional nodes specifying quantization in-model. No additional Model Optimizer parameters are required to handle such models - the INT8 IR will be produced automatically if you supply an INT8 ONNX as input.
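One way to confirm that an exported file really is an INT8 ONNX model is to count its quantization nodes. The following is a minimal sketch, not part of NNCF: the helper names are hypothetical, `FakeQuantize` is the node type reported later in this thread, and `inspect_onnx` assumes the `onnx` Python package is installed.

```python
from collections import Counter

# Operator types that mark in-model quantization. "FakeQuantize" is the node
# type mentioned in this thread; the other two are ONNX standard operators.
QUANT_OPS = {"FakeQuantize", "QuantizeLinear", "DequantizeLinear"}


def count_quantization_nodes(nodes):
    """Count quantization-related nodes among objects exposing `.op_type`."""
    return Counter(n.op_type for n in nodes if n.op_type in QUANT_OPS)


def inspect_onnx(path):
    """Load an ONNX file and report its quantization nodes (needs `onnx`)."""
    import onnx  # imported lazily so the helper above works without it

    model = onnx.load(path)
    return count_quantization_nodes(model.graph.node)
```

An empty result for a given file would suggest that it is a plain FP32 export rather than an INT8 one.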
from nncf.
@vshampor Thank you for your reply.
I was able to train my model with INT8 quantization and export it to an ONNX file (which includes FakeQuantize nodes).
I was also able to convert it to an IR model.
But when I load the IR model with openvino.inference_engine.ie_api.IECore.load_network, the following error occurs:
File "ie_api.pyx", line 178, in openvino.inference_engine.ie_api.IECore.load_network
File "ie_api.pyx", line 187, in openvino.inference_engine.ie_api.IECore.load_network
RuntimeError: Incorrect number of input edges for layer 560/variance/Fused_Add_
I have attached a visualization made with Netron.
It seems that the top-left Add node is OK, but the bottom-left Add node 560/variance/Fused_Add_ has the wrong number of inputs, even though the Netron visualizer shows 2 input nodes, as I expected.
Could you give me some advice?
Is 560/variance/Fused_Add_ interpreted as having only 1 input, or is there some other problem?
Any suggestions are welcome.
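To see whether the IR file itself is inconsistent for a layer such as 560/variance/Fused_Add_, one can compare the input ports each layer declares against the edges that actually reach it. This is a rough diagnostic sketch, not an OpenVINO tool. It assumes the usual IR XML layout (`<layer>` elements with an `<input>` block of `<port>` children, and `<edge>` elements with a `to-layer` attribute) and a flat graph without nested bodies such as TensorIterator. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def find_input_edge_mismatches(ir_xml_text):
    """Return {layer_name: (declared_input_ports, incoming_edges)} for mismatches."""
    root = ET.fromstring(ir_xml_text)

    # Count how many edges terminate at each layer id.
    incoming = Counter(edge.get("to-layer") for edge in root.iter("edge"))

    mismatches = {}
    for layer in root.iter("layer"):
        input_block = layer.find("input")
        declared = len(input_block.findall("port")) if input_block is not None else 0
        actual = incoming.get(layer.get("id"), 0)
        if declared != actual:
            mismatches[layer.get("name")] = (declared, actual)
    return mismatches
```

Calling it on the text of the generated `.xml` file lists every layer whose declared inputs and incoming edges disagree. A layer that declares 2 input ports but receives only 1 edge would match the symptom described above.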
@nsk-lab does the problem persist in the FP32 IR, or does it only appear for the INT8 IR?
When I trained without NNCF and exported to ONNX, I could convert the model to IR and load it (the inference results were also OK).
I have not tried a model trained with NNCF in FP32 (and converted to IR).
@vshampor (I forgot to mention you in my previous reply.)
@nsk-lab strange as it is, I'm inclined to believe that this is a bug in OpenVINO's Model Optimizer. Please raise an issue in the https://github.com/openvinotoolkit/openvino repository, which owns Model Optimizer. In the meantime, you can try the "export_to_onnx_standard_ops": true option to produce an INT8 ONNX model that uses ONNX's standard QuantizeLinear/DequantizeLinear operators, so that it can be loaded with onnxruntime. Successfully running INT8 inference through onnxruntime can give you additional confidence that NNCF's output ONNX files are OK.
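The suggestion can be sketched roughly as follows. Both helper names are hypothetical. The first function assumes that the flag belongs inside the "quantization" entry of the config's "compression" section (a single dict or a list of dicts), so check this against the NNCF config schema for your version. The second function assumes that `onnxruntime` and `numpy` are installed and that all model inputs are float32.

```python
import copy


def enable_standard_ops_export(nncf_config):
    """Return a copy of an NNCF config dict with standard-ops export enabled.

    Assumption: the flag lives inside the "quantization" entry of the
    "compression" section, which may be a single dict or a list of dicts.
    """
    config = copy.deepcopy(nncf_config)
    compression = config.get("compression", [])
    entries = compression if isinstance(compression, list) else [compression]
    for entry in entries:
        if entry.get("algorithm") == "quantization":
            entry["export_to_onnx_standard_ops"] = True
    return config


def smoke_test_with_onnxruntime(onnx_path):
    """Run one random-input inference through onnxruntime and return the outputs."""
    import numpy as np  # lazy imports: only needed for this check
    import onnxruntime as ort

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    feeds = {}
    for inp in session.get_inputs():
        # Replace dynamic (non-integer) dimensions with 1; assumes float32 inputs.
        shape = [d if isinstance(d, int) else 1 for d in inp.shape]
        feeds[inp.name] = np.random.rand(*shape).astype(np.float32)
    return session.run(None, feeds)
```

If the smoke test completes on the re-exported model, the quantized graph is at least structurally valid, which points further toward the IR conversion step as the source of the error.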