
Comments (6)

vshampor commented on May 28, 2024

Greetings, @nsk-lab!

An INT8 ONNX model differs from an FP32 ONNX model only by the additional nodes that specify quantization in the model. No additional Model Optimizer parameters are required to handle such models: the INT8 IR will be produced automatically if you supply an INT8 ONNX model as input.
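
For concreteness, the export-and-convert flow looks roughly like this (a minimal sketch for the PyTorch backend; the model object, config file, and paths are illustrative, and in recent NNCF versions create_compressed_model is imported from nncf.torch rather than nncf):

  from nncf import NNCFConfig, create_compressed_model

  # Quantization-aware training setup; `model` is the torch.nn.Module to compress.
  nncf_config = NNCFConfig.from_json("nncf_int8_config.json")  # illustrative path
  compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)

  # ... fine-tune compressed_model as usual ...

  # Export ONNX with the quantization (FakeQuantize) nodes embedded in the graph:
  compression_ctrl.export_model("model_int8.onnx")

  # Model Optimizer is then invoked exactly as for an FP32 model, e.g.:
  #   python mo.py --input_model model_int8.onnx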


nsk-lab commented on May 28, 2024

@vshampor Thank you for your reply.

I was able to train my model with INT8 optimization and export it to an ONNX file (which includes FakeQuantize nodes).
I was also able to convert it to an IR model.

But when I load the IR model via openvino.inference_engine.ie_api.IECore.load_network, the following error occurs:

  File "ie_api.pyx", line 178, in openvino.inference_engine.ie_api.IECore.load_network
  File "ie_api.pyx", line 187, in openvino.inference_engine.ie_api.IECore.load_network
RuntimeError: Incorrect number of input edges for layer 560/variance/Fused_Add_
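
For reference, I load the model roughly like this (a minimal sketch; file names are shortened):

  from openvino.inference_engine import IECore

  ie = IECore()
  net = ie.read_network(model="model_int8.xml", weights="model_int8.bin")
  # The RuntimeError above is raised here:
  exec_net = ie.load_network(network=net, device_name="CPU")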

I attach a visualization by netron.
It seems that the top-left Add node is OK, but the bottom-left Add node 560/variance/Fused_Add_ has the wrong number of inputs, even though the netron visualizer shows the two input nodes I expected.
[image: netron visualization of the graph around 560/variance/Fused_Add_]

Could you give me advice?
Is 560/variance/Fused_Add_ interpreted as having only one input, or is there some other problem?
Any suggestions are welcome.


vshampor commented on May 28, 2024

@nsk-lab does the problem persist in the FP32 IR, or does it only appear for the INT8 IR?


nsk-lab commented on May 28, 2024

When I trained without NNCF and exported to ONNX, I was able to convert to IR and load the IR model (the inference results were also OK).
I did not try a model trained by NNCF in FP32 mode (and converted to IR).


nsk-lab commented on May 28, 2024

@vshampor (I forgot to mention you in the comment above.)


vshampor commented on May 28, 2024

@nsk-lab strange as it is, I'm inclined to believe that this is a bug in OpenVINO's Model Optimizer. Please raise an issue in the https://github.com/openvinotoolkit/openvino repository, which owns Model Optimizer. In the meantime, you can try the "export_to_onnx_standard_ops": true option to produce an INT8 ONNX model that uses ONNX's standard QuantizeLinear/DequantizeLinear operators, so that it is loadable with onnxruntime. Successfully running INT8 inference through onnxruntime would give you additional confidence that NNCF's output ONNX files are OK.
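
Roughly, that amounts to flipping the option in the quantization section of the NNCF config, re-exporting, and then sanity-checking the result with onnxruntime (a sketch; file names and the input shape are illustrative):

  import numpy as np
  import onnxruntime as ort

  # NNCF config fragment (JSON) enabling standard ONNX quantization operators:
  #   "compression": {
  #       "algorithm": "quantization",
  #       "export_to_onnx_standard_ops": true
  #   }
  # Re-export with compression_ctrl.export_model(...) after changing the config.

  sess = ort.InferenceSession("model_int8_std.onnx")  # illustrative file name
  input_name = sess.get_inputs()[0].name
  dummy = np.random.randn(1, 3, 224, 224).astype(np.float32)  # adjust to your input shape
  outputs = sess.run(None, {input_name: dummy})
  print(outputs[0].shape)  # INT8 inference succeeding here suggests the exported ONNX is OK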

