
Comments (18)

PINTO0309 commented on June 14, 2024

The yolov8 thread is too long to read.

Please state clearly here which operations should be skipped, and starting from which operation. It's too much trouble to dig that out myself.


adamp87 commented on June 14, 2024

The idea would be to have a general option, maybe in PARAM_REPLACEMENT_FILE, so a user could adjust it per model. YOLOv8 is just one example where it would be beneficial. In the attached yolov8 model visualization, the Concat and every OP after it should be excluded from quantization.


PINTO0309 commented on June 14, 2024

There is more than one Concat. I didn't know which Concat you were referring to, so I cut the model at what seemed a reasonable point. And I still don't understand what you expect me to do.

onnx2tf \
-i yolov8_n.onnx \
-onimc /model.22/Sigmoid_output_0 /model.22/Sub_1_output_0 /model.22/Div_output_0

(image: visualization of the model truncated at the three specified output OPs)


adamp87 commented on June 14, 2024

Hey,

sorry for the confusion. Let me try to explain with a visual example. This is a toy example; the model makes no sense, it's just for illustration.

On the left is the fully quantized INT8 model, which is the output of the onnx2tf tool. On the right is the model I would like to have: a dequantization is inserted, and everything after it is kept in FP32.
(image: toy example comparing the fully quantized model, left, with the desired partially dequantized model, right)


PINTO0309 commented on June 14, 2024

Are you saying that the position of the dequantization should be controllable from onnx2tf? That is impossible. Let me know if you know of any TensorFlow converter parameter below that can handle your request.

I would have implemented such a feature a year ago if I had known how.

https://github.com/tensorflow/tensorflow/blob/6e6ca51e99c8d46c401ad11982cbf846c3a4071f/tensorflow/lite/python/lite.py#L603-L678

class TFLiteConverterBase:
  """Converter superclass to share functionality between V1 and V2 converters."""

  # Stores the original model type temporarily to transmit the information
  # from the factory class methods to TFLiteConverterBase init function.
  _original_model_type = conversion_metdata_fb.ModelType.NONE

  def __init__(self):
    self.optimizations = set()
    self.representative_dataset = None
    self.target_spec = TargetSpec()
    self.allow_custom_ops = False
    self.experimental_new_converter = True
    self.experimental_new_quantizer = True
    self.experimental_enable_resource_variables = True
    self._experimental_calibrate_only = False
    self._experimental_sparsify_model = False
    self._experimental_disable_per_channel = False
    self._debug_info = None  # contains the stack traces of all the original
    # nodes in the `GraphDef` to the converter.
    self.saved_model_dir = None
    self._saved_model_tags = None
    self._saved_model_version = 0
    self._saved_model_exported_names = []
    self._tflite_metrics = metrics.TFLiteConverterMetrics()
    self._collected_converter_params = {}
    self.unfold_batchmatmul = False
    self.legalize_custom_tensor_list_ops = False
    self._experimental_lower_tensor_list_ops = True
    self._experimental_default_to_single_batch_in_tensor_list_ops = False
    self._experimental_unfold_large_splat_constant = False
    self._experimental_tf_quantization_mode = None
    # If unset, bias:int32 is by default except 16x8 quant.
    # For 16x8 quant, bias:int64 is used to prevent any overflow by default.
    # The accumulator type will be the same as bias type set by
    # full_integer_quantization_bias_type.
    self._experimental_full_integer_quantization_bias_type = None
    # Provides specs for quantization, whether preset or custom.
    self._experimental_quantization_options = None  # Deprecated
    self.experimental_use_stablehlo_quantizer = False
    # Initializes conversion metadata.
    self.exclude_conversion_metadata = False
    self._metadata = conversion_metdata_fb.ConversionMetadataT()
    self._metadata.environment = conversion_metdata_fb.EnvironmentT()
    self._metadata.options = conversion_metdata_fb.ConversionOptionsT()
    self._metadata.environment.tensorflowVersion = versions.__version__
    self._metadata.environment.modelType = self._get_original_model_type()
    self._experimental_enable_dynamic_update_slice = False
    self._experimental_preserve_assert_op = False
    self._experimental_guarantee_all_funcs_one_use = False

    # When the value is true, the MLIR quantantizer triggers dynamic range
    # quantization in MLIR instead of the old quantizer. Used only if
    # experimental_new_quantizer is on.
    self.experimental_new_dynamic_range_quantizer = True
    # Experimental flag to enable low-bit QAT in 8 bit.
    self._experimental_low_bit_qat = False
    # Experimental flag to add all TF ops (including custom TF ops) to the
    # converted model as flex ops.
    self._experimental_allow_all_select_tf_ops = False

    self._experimental_variable_quantization = False
    self._experimental_disable_fuse_mul_and_fc = False
    self._experimental_use_buffer_offset = False
    self._experimental_reduce_type_precision = False
    self._experimental_qdq_conversion_mode = None

    # Debug parameters
    self.ir_dump_dir = None
    self.ir_dump_pass_regex = None
    self.ir_dump_func_regex = None
    self.enable_timing = None
    self.print_ir_before = None
    self.print_ir_after = None
    self.print_ir_module_scope = None
    self.elide_elementsattrs_if_larger = None


PINTO0309 commented on June 14, 2024

I understand that it is a toy model, but I don't see the significance of keeping the last Mul and Add inside the tflite model at all. They are Float32 multiplication and addition in the first place, so writing two lines of multiplication and addition on the program side would make no difference in performance.

This workaround only applies when the trailing operations are primitive ones, though.
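
Concretely, the "two lines on the program side" could look like the sketch below. This is illustrative only; the model path, the input, and the Mul/Add constants are placeholders, not taken from this thread.

import numpy as np
import tensorflow as tf

# Run the truncated INT8 model with the standard tflite interpreter.
interpreter = tf.lite.Interpreter(model_path="model_truncated_int8.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Quantize the float input using the input tensor's scale/zero-point.
scale, zero_point = inp["quantization"]
x = np.random.rand(*inp["shape"]).astype(np.float32)  # placeholder input
interpreter.set_tensor(inp["index"], (x / scale + zero_point).astype(inp["dtype"]))
interpreter.invoke()

# Dequantize the INT8 output back to float.
o_scale, o_zero = out["quantization"]
y = (interpreter.get_tensor(out["index"]).astype(np.float32) - o_zero) * o_scale

# The two ops that were cut off, executed in FP32 on the program side.
MUL_CONST, ADD_CONST = 2.0, 0.5  # placeholders for the toy model's constants
result = y * MUL_CONST + ADD_CONST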


adamp87 commented on June 14, 2024

I don't know if that's possible with TFLite, but it would be good. For example, OpenVINO's NNCF quantize() function accepts a parameter named ignored_scope.
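
For reference, ignored_scope is used roughly as follows. This is a sketch assuming the nncf.quantize() / nncf.IgnoredScope API; ov_model, calib, and the node names are placeholders.

import nncf

quantized_model = nncf.quantize(
    ov_model,                            # an openvino.Model
    calib,                               # an nncf.Dataset for calibration
    ignored_scope=nncf.IgnoredScope(
        names=["/model.22/Concat"],      # exact node names to keep in FP32
        patterns=[".*/model\\.22/.*"],   # or regex patterns over node names
    ),
)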

Yes, it is a toy example, and toy examples are meant to be simple. In YOLOv8, everything after the Concat (not the last one before the output) should be executed in FP32.


PINTO0309 commented on June 14, 2024

I finally understand what you are intending.

I will say it again. It is impossible. Submit a feature request issue to TensorFlow.


adamp87 commented on June 14, 2024

Oh, all right, sad to hear. Thanks for your help.


EpiX-1 commented on June 14, 2024

Hi @adamp87,

I think you can achieve what you want by converting your onnx model to keras using the -oh5 option of onnx2tf.
Then you can quantize your keras model with the TensorFlow API using tf.lite.experimental.QuantizationDebugger(), which lets you specify nodes/operators to skip during the quantization process via tf.lite.experimental.QuantizationDebugOptions (see the sketch below).
Hope this helps.
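
A minimal sketch of that flow, following the TFLite quantization-debugger guide; keras_model, calibration_samples, and the denylisted node name are placeholders.

import tensorflow as tf

def representative_dataset():
    for sample in calibration_samples:  # iterable of float32 numpy inputs
        yield [sample]

converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset

debug_options = tf.lite.experimental.QuantizationDebugOptions(
    denylisted_nodes=["/model.22/Concat"],  # nodes to keep in float
)
debugger = tf.lite.experimental.QuantizationDebugger(
    converter=converter,
    debug_dataset=representative_dataset,
    debug_options=debug_options,
)

# The "non-debug" model is the selectively quantized flatbuffer.
with open("yolov8n_selective_int8.tflite", "wb") as f:
    f.write(debugger.get_nondebug_quantized_model())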


adamp87 commented on June 14, 2024

Hi @EpiX-1,

thank you so much for the suggestion. I think this could be the right way to go. Sadly I'm having an issue and am not sure how to proceed. Are you familiar with onnx2tf, or could you give me a hint as to what the problem could be?

By running the following in Colab:

!git clone https://github.com/adamp87/ultralytics.git
%cd /content/ultralytics
!git checkout tflite_accurate
!pip install -e .
!yolo export model=yolov8n.pt data=coco128.yaml format=tflite imgsz=640

I get this error during the init of QuantizationDebugger:
'/model.10/Resize' is not a valid root scope name. A root scope name has to match the following pattern: ^[A-Za-z0-9.][A-Za-z0-9_.\\/>-]*$

You can find my code here: GitHub Compare

Thanks for your help.


EpiX-1 commented on June 14, 2024

I'm able to reproduce the issue. Your code seems good to me.
I've managed to resolve the issue by reverting to an older commit of YOLOv8. I have no idea why it occurs, though.


adamp87 commented on June 14, 2024


EpiX-1 commented on June 14, 2024

Sure,
What I meant is that I cloned the original YOLOv8 repository at the specific commit I provided you, to export the .pt model to onnx. After manually converting the onnx model to keras using onnx2tf, I reused my code, which is very similar to yours, to quantize the keras model with the QuantizationDebugger.
I hope it's clearer now.
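
Spelled out as code, that workflow might look as follows. This is a sketch: it assumes ultralytics' YOLO.export() and onnx2tf's Python convert() API, and the output_h5 keyword is assumed to mirror the -oh5 CLI flag; paths are placeholders, and the pinned commit is not shown in this thread.

from ultralytics import YOLO
import onnx2tf

# 1. Export the .pt checkpoint to ONNX.
YOLO("yolov8n.pt").export(format="onnx", imgsz=640)

# 2. Convert the ONNX model to Keras (.h5) with onnx2tf.
onnx2tf.convert(
    input_onnx_file_path="yolov8n.onnx",
    output_folder_path="saved_model",
    output_h5=True,  # assumed Python equivalent of the -oh5 flag
)

# 3. Quantize the Keras model with the QuantizationDebugger,
#    as in the sketch earlier in this thread.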


adamp87 commented on June 14, 2024

