[MobileFormer] Converted model outputs values mismatch with original ones. (PINTO0309/onnx2tf)

Comments (11)

kevinz8866 commented on May 17, 2024

If you want to use MobileFormer's PyTorch repo to debug, I have a fork that fixes all the import issues in the official repo. https://github.com/kevinz8866/MobileFormer


kevinz8866 commented on May 17, 2024

Hi PINTO,

Thank you so much for all these updates. Sorry I wasn't able to get back to you sooner; I was traveling and a bit busy with the Lunar New Year. I will try to replicate what you did for me, and yes, I will post some JSON files if I convert more models from ONNX to Keras. Thank you so much for making this package. I will keep using it and let you know if there are any issues!


kevinz8866 commented on May 17, 2024

By the way, these models also fail to save as a Keras model when I run tf.keras.models.save_model, in addition to failing to save as H5. It is the same error you found in #103.


PINTO0309 commented on May 17, 2024

Although experimental, I am adding a validation function to investigate which operations of the model transformation produce errors in the output. I would eventually like to modify the tool itself to automatically check which tensors have large errors.

https://github.com/PINTO0309/onnx2tf/releases/tag/1.5.0

[Experimental] Added the ability to validate the model final output tensor in ONNX and TensorFlow.
https://numpy.org/doc/stable/reference/generated/numpy.allclose.html#numpy-allclose

numpy.allclose(a, b, rtol=0.0, atol=1e-04, equal_nan=True)

  -coto, --check_onnx_tf_outputs_elementwise_close
    Returns true if the two arrays, the output of onnx and the output of TF,
    are elementwise close within an acceptable range.

  -cotor CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_RTOL,\
    --check_onnx_tf_outputs_elementwise_close_rtol CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_RTOL
    The relative tolerance parameter.
    Default: 0.0

  -cotoa CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_ATOL,\
    --check_onnx_tf_outputs_elementwise_close_atol CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_ATOL
    The absolute tolerance parameter.
    Default: 1e-4

Combined with the --output_names_to_interrupt_model_conversion (-onimc) option, this option can be used to investigate which operations, at which locations in the model, cause errors in the output.
Since ONNX assumes NCHW and TensorFlow assumes NHWC output, a naive comparison of the output tensors will not match in most cases. Therefore, the tool automatically tries to match the final output tensor of TensorFlow to the shape of the ONNX output tensor with a brute-force check. If the shapes still do not match, or no combination yields matching values, the result is reported as Unmatched.

e.g.

onnx2tf -i xxx.onnx -coto -onimc keypoints descriptors scores scores_map
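The comparison described above can be sketched in NumPy. This is only an illustration of the idea, not the tool's actual implementation: the function name find_matching_perm is hypothetical, and the tolerances simply mirror the defaults quoted above (rtol=0.0, atol=1e-4).

```python
import itertools

import numpy as np


def find_matching_perm(onnx_out, tf_out, rtol=0.0, atol=1e-4):
    """Return the first axis permutation of tf_out that matches onnx_out, or None.

    Hypothetical helper: tries every permutation of the TensorFlow output's
    axes and compares elementwise with numpy.allclose.
    """
    if onnx_out.ndim != tf_out.ndim:
        return None
    for perm in itertools.permutations(range(tf_out.ndim)):
        candidate = np.transpose(tf_out, perm)
        if candidate.shape != onnx_out.shape:
            continue  # this permutation cannot reproduce the ONNX shape
        if np.allclose(onnx_out, candidate, rtol=rtol, atol=atol, equal_nan=True):
            return perm
    return None  # corresponds to "Unmatched"


# Toy data: an NCHW tensor and the same values laid out as NHWC.
rng = np.random.default_rng(0)
nchw = rng.standard_normal((1, 3, 4, 5)).astype(np.float32)
nhwc = np.transpose(nchw, (0, 2, 3, 1))

perm = find_matching_perm(nchw, nhwc)
print(perm)  # (0, 3, 1, 2)
```

Because every permutation is tried, the cost grows factorially with the tensor rank, which is acceptable for the 3-D to 5-D tensors typical of these models.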


PINTO0309 commented on May 17, 2024

OK - Add_116 - onnx::MatMul_556

onnx2tf -i mobileformer.onnx -prf replace_kevinz8866.json -onimc onnx::MatMul_556 -cotoa 1e-1

OK - MatMul_117 - onnx::Add_560

onnx2tf -i mobileformer.onnx -prf replace_kevinz8866.json -onimc onnx::Add_560 -cotoa 1e-1

OK - Reshape_119 - onnx::Transpose_569

onnx2tf -i mobileformer.onnx -prf replace_kevinz8866.json -onimc onnx::Transpose_569 -cotoa 1e-1

NG - MatMul_123 - onnx::Mul_580

onnx2tf -i mobileformer.onnx -prf replace_kevinz8866.json -onimc onnx::Mul_580 -cotoa 1e-1

OK - Reshape_122 - onnx::MatMul_579

Thus, the problem lies in the processing of MatMul_123, the point where the two branches of the model merge.

It seems that Transpose_120 (marked "Here" in the log below) interferes with the tool's automatic conversion and confuses the dimension transposition.

INFO: onnx_op_type: Reshape onnx_op_name: Reshape_119
INFO:  input_name.1: onnx::Reshape_561 shape: [4, 1, 128] dtype: float32
INFO:  input_name.2: onnx::Reshape_2804 shape: [4] dtype: <class 'numpy.int64'>
INFO:  output_name.1: onnx::Transpose_569 shape: [4, 1, 4, 32] dtype: float32
INFO: tf_op_type: reshape
INFO:  input.1.tensor: name: tf.compat.v1.transpose_6/transpose:0 shape: (4, 1, 128) dtype: <dtype: 'float32'> 
INFO:  input.2.shape: val: [4, 1, 4, -1] 
INFO:  output.1.output: name: tf.reshape_3/Reshape:0 shape: (4, 1, 4, 32) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Transpose onnx_op_name: Transpose_127
INFO:  input_name.1: onnx::MatMul_579 shape: [1, 4, 32, 4] dtype: float32
INFO:  output_name.1: onnx::MatMul_584 shape: [1, 4, 4, 32] dtype: float32
INFO: tf_op_type: transpose_v2
INFO:  input.1.a: name: tf.reshape_2/Reshape:0 shape: (1, 4, 32, 4) dtype: <dtype: 'float32'> 
INFO:  input.2.perm: val: [0, 1, 3, 2]
INFO:  output.1.output: name: tf.compat.v1.transpose_7/transpose:0 shape: (1, 4, 4, 32) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: Transpose onnx_op_name: Transpose_120
INFO:  input_name.1: onnx::Transpose_569 shape: [4, 1, 4, 32] dtype: float32
INFO:  output_name.1: onnx::MatMul_570 shape: [1, 4, 4, 32] dtype: float32
INFO: tf_op_type: transpose_v2
INFO:  input.1.a: name: tf.reshape_3/Reshape:0 shape: (4, 1, 4, 32) dtype: <dtype: 'float32'> 
INFO:  input.2.perm: val: [1, 0, 2, 3]   <<================================================= Here
INFO:  output.1.output: name: tf.compat.v1.transpose_8/transpose:0 shape: (1, 4, 4, 32) dtype: <dtype: 'float32'> 

INFO: onnx_op_type: MatMul onnx_op_name: MatMul_123
INFO:  input_name.1: onnx::MatMul_570 shape: [1, 4, 4, 32] dtype: float32
INFO:  input_name.2: onnx::MatMul_579 shape: [1, 4, 32, 4] dtype: float32
INFO:  output_name.1: onnx::Mul_580 shape: [1, 4, 4, 4] dtype: float32
INFO: tf_op_type: matmul
INFO:  input.1.a: name: tf.compat.v1.transpose_8/transpose:0 shape: (1, 4, 4, 32) dtype: <dtype: 'float32'> 
INFO:  input.2.b: name: tf.reshape_2/Reshape:0 shape: (1, 4, 32, 4) dtype: <dtype: 'float32'> 
INFO:  input.3.output_type: name: float32 shape: () 
INFO:  output.1.output: name: tf.linalg.matmul_4/MatMul:0 shape: (1, 4, 4, 4) dtype: <dtype: 'float32'>

Therefore, add a parameter to the JSON that overrides this Transpose, which confuses the transposition. Internally, the tool automatically tries to convert the perm attribute of Transpose from NCHW to NHWC, so if the model contains unnecessary transposes from the beginning, it may generate a wrong transpose. The following JSON forces the perm attribute of Transpose to a fixed value, which disables the automatic NHWC perm conversion for that operation.
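Why the shapes in the log look fine while the values are wrong can be illustrated with NumPy. In this sketch the input has the shape logged for Transpose_120, (4, 1, 4, 32); the data is a made-up ramp, and the two permutations are the one from the log and the one used in the JSON. Which permutation is correct depends on the layout the tool chose upstream; the thread only states that [1,2,0,3] fixed the mismatch.

```python
import numpy as np

# Dummy tensor with the shape logged for the input of Transpose_120.
x = np.arange(4 * 1 * 4 * 32, dtype=np.float32).reshape(4, 1, 4, 32)

a = np.transpose(x, (1, 0, 2, 3))  # perm shown in the conversion log
b = np.transpose(x, (1, 2, 0, 3))  # perm forced by the replacement JSON

# Two axes have length 4, so both results have the same shape ...
print(a.shape, b.shape)      # (1, 4, 4, 32) (1, 4, 4, 32)
# ... but the elements are arranged differently.
print(np.array_equal(a, b))  # False
```

A shape check alone therefore cannot detect this kind of error; only a value comparison such as the -coto check can.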

replace_kevinz8866.json

{
  "format_version": 1,
  "operations": [
    {
      "op_name": "Reshape_247",
      "param_target": "outputs",
      "param_name": "onnx::Add_753",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_418",
      "param_target": "outputs",
      "param_name": "onnx::Add_1015",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_588",
      "param_target": "outputs",
      "param_name": "onnx::Add_1275",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_759",
      "param_target": "outputs",
      "param_name": "onnx::Add_1537",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_929",
      "param_target": "outputs",
      "param_name": "onnx::Add_1797",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_1098",
      "param_target": "outputs",
      "param_name": "onnx::Add_2056",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_1269",
      "param_target": "outputs",
      "param_name": "onnx::Add_2318",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_1439",
      "param_target": "outputs",
      "param_name": "onnx::Add_2578",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Transpose_120",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [1,2,0,3]
    }
  ]
}

OK - MatMul_123 - onnx::Mul_580

onnx2tf -i mobileformer.onnx -prf replace_kevinz8866.json -onimc onnx::Mul_580 -cotoa 1e-1

From a bird's eye view of the model, it appears to have multiple Reshape -> Transpose -> MatMul structures, so I would need to go through the same steps to modify the tool's behavior. It is a bit tedious.
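The repetitive part of this procedure, issuing one check per intermediate tensor, could be scripted. The sketch below only builds the command lines, using the flags that appear in the commands above; the helper name build_check_commands is hypothetical, and running the commands and reading their results is left out.

```python
import shlex


def build_check_commands(model_path, replace_json, tensor_names, atol="1e-1"):
    """Build one onnx2tf command line per intermediate tensor name (hypothetical helper)."""
    commands = []
    for name in tensor_names:
        args = [
            "onnx2tf",
            "-i", model_path,
            "-prf", replace_json,
            "-onimc", name,   # stop the conversion at this tensor
            "-cotoa", atol,   # absolute tolerance for the output comparison
        ]
        commands.append(shlex.join(args))
    return commands


# Tensor names taken from the checks earlier in this comment.
candidates = ["onnx::MatMul_556", "onnx::Add_560", "onnx::Transpose_569", "onnx::Mul_580"]
commands = build_check_commands("mobileformer.onnx", "replace_kevinz8866.json", candidates)
for command in commands:
    print(command)
```

Each printed line can be pasted into a shell, or passed to subprocess.run, to repeat the check for that tensor.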


PINTO0309 commented on May 17, 2024

> Also, I see that since you match every ONNX operation with a basic TF operation, is there no way to restore those weights as trainable parameters? Thank you so much for continuing to dig into this. If you need the original PyTorch model or anything else I could assist with, let me know!

The tool specializes in converting to inference-only models, so building a trainable model with it is very hard. In exchange for that restriction, the structure of the model is optimized as far as possible. Operations that are needed only during training and not during inference, such as BatchNormalization and Dropout, are intentionally decomposed and fused away, so they disappear from the converted model.

1. First, define the model structure as a Functional model in Keras. This means that the model structure is defined in Python code.
2. Extract only the weights from the converted model. This tool cannot extract weights yet, but I will try to add the feature in the future. For now, weights can be extracted using Netron: clicking the floppy disk icon saves a binary file in np.ndarray format to storage.

e.g.

import numpy as np

# 'tensor' is the file saved from Netron
input_weights = np.load('tensor')
print(input_weights)

[[[[ 3.85814160e-02 -5.91791682e-02  4.93610911e-02]
   [-1.54751437e-02 -2.65599549e-01  2.85035670e-01]
   [-2.80502457e-02 -2.88973451e-01  2.98822671e-01]]

  [[ 4.93560582e-02 -6.98780492e-02  6.08931743e-02]
   [-1.96552109e-02 -4.20744419e-01  4.15196180e-01]
   [-1.72943212e-02 -4.09793824e-01  4.53572363e-01]]

  [[-2.43881089e-03 -2.27344222e-02  3.65414075e-04]
   [ 1.85506195e-02 -1.68236122e-01  1.95923150e-01]
   [-5.82838431e-03 -1.79103076e-01  1.61797389e-01]]]


 [[[ 1.88791680e+00  4.31701469e+00  4.70997620e+00]

3. Load the weights extracted in step 2 into the Keras model as initializers. If you need to set the bias, use bias_initializer.

import tensorflow as tf
from tensorflow.keras.layers import Conv2D

# strides, dilations, group and input_tensor come from the layer being rebuilt
Conv2D(
    filters=input_weights.shape[-1],
    kernel_size=input_weights.shape[:2],
    strides=strides,
    padding='valid',
    dilation_rate=dilations,
    groups=group,
    use_bias=False,
    kernel_initializer=tf.keras.initializers.Constant(input_weights),
    name='dummy_conv2d',
)(input_tensor)
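The snippet above reads filters from shape[-1] and kernel_size from shape[:2], which matches the Keras Conv2D kernel layout (height, width, in_channels / groups, filters). A tensor saved from the converted TensorFlow model is expected to be in that layout already. If the tensor were taken from the ONNX file instead, where Conv weights are stored as (out_channels, in_channels / groups, height, width), it would need a transpose first. A minimal sketch, with a hypothetical helper name and dummy data:

```python
import numpy as np


def onnx_conv_weight_to_keras(onnx_weight):
    """Reorder (out, in/groups, kH, kW) to (kH, kW, in/groups, out). Hypothetical helper."""
    return np.transpose(onnx_weight, (2, 3, 1, 0))


# Dummy ONNX-style weight: 16 filters, 3 input channels, 5x7 kernel.
onnx_weight = np.arange(16 * 3 * 5 * 7, dtype=np.float32).reshape(16, 3, 5, 7)
keras_kernel = onnx_conv_weight_to_keras(onnx_weight)

filters = keras_kernel.shape[-1]     # 16
kernel_size = keras_kernel.shape[:2]  # (5, 7)
print(keras_kernel.shape, filters, kernel_size)
```

Checking shape[-1] and shape[:2] against the expected filter count and kernel size is a quick way to confirm which layout a saved tensor uses before passing it to the initializer.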


PINTO0309 commented on May 17, 2024

Added the ability for the tool to automatically identify operations with large accuracy errors. It is enabled with the --check_onnx_tf_outputs_elementwise_close_full option.



PINTO0309 commented on May 17, 2024

> By the way, these models also fail to save as a Keras model when I run tf.keras.models.save_model, in addition to failing to save as H5. It is the same error you found in #103.

It is not possible to save the entire structure of the model to an h5 file, but I have added the ability to extract only the weights and save them to a file in hdf5 format.

-ow, --output_weights
  Output weights in hdf5 format.


PINTO0309 commented on May 17, 2024

The final outputs are now nearly identical.

MobileFormer-e9.onnx
https://drive.google.com/file/d/1vGzO9MZGX-yGz6ATm4yHVJMASZACuy2t/view?usp=share_link
MobileFormer-e9.tflite
model_float32.tflite.tar.gz

replace_kevinz8866.json

{
  "format_version": 1,
  "operations": [
    {
      "op_name": "Reshape_247",
      "param_target": "outputs",
      "param_name": "onnx::Add_753",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_418",
      "param_target": "outputs",
      "param_name": "onnx::Add_1015",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_588",
      "param_target": "outputs",
      "param_name": "onnx::Add_1275",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_759",
      "param_target": "outputs",
      "param_name": "onnx::Add_1537",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_929",
      "param_target": "outputs",
      "param_name": "onnx::Add_1797",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_1098",
      "param_target": "outputs",
      "param_name": "onnx::Add_2056",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_1269",
      "param_target": "outputs",
      "param_name": "onnx::Add_2318",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Reshape_1439",
      "param_target": "outputs",
      "param_name": "onnx::Add_2578",
      "post_process_transpose_perm": [0,2,3,1]
    },
    {
      "op_name": "Transpose_120",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [1,2,0,3]
    },
    {
      "op_name": "Softmax_126",
      "param_target": "attributes",
      "param_name": "axis",
      "values": 3
    },
    {
      "op_name": "Transpose_129",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [2,0,1,3]
    },
    {
      "op_name": "Transpose_291",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [1,2,0,3]
    },
    {
      "op_name": "Softmax_297",
      "param_target": "attributes",
      "param_name": "axis",
      "values": 3
    },
    {
      "op_name": "Transpose_300",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [2,0,1,3]
    },
    {
      "op_name": "Transpose_462",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [1,2,0,3]
    },
    {
      "op_name": "Softmax_468",
      "param_target": "attributes",
      "param_name": "axis",
      "values": 3
    },
    {
      "op_name": "Transpose_471",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [2,0,1,3]
    },
    {
      "op_name": "Transpose_632",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [1,2,0,3]
    },
    {
      "op_name": "Softmax_638",
      "param_target": "attributes",
      "param_name": "axis",
      "values": 3
    },
    {
      "op_name": "Transpose_641",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [2,0,1,3]
    },
    {
      "op_name": "Transpose_803",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [1,2,0,3]
    },
    {
      "op_name": "Softmax_809",
      "param_target": "attributes",
      "param_name": "axis",
      "values": 3
    },
    {
      "op_name": "Transpose_812",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [2,0,1,3]
    },
    {
      "op_name": "Transpose_973",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [1,2,0,3]
    },
    {
      "op_name": "Softmax_979",
      "param_target": "attributes",
      "param_name": "axis",
      "values": 3
    },
    {
      "op_name": "Transpose_982",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [2,0,1,3]
    },
    {
      "op_name": "Transpose_1142",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [1,2,0,3]
    },
    {
      "op_name": "Softmax_1148",
      "param_target": "attributes",
      "param_name": "axis",
      "values": 3
    },
    {
      "op_name": "Transpose_1151",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [2,0,1,3]
    },
    {
      "op_name": "Transpose_1313",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [1,2,0,3]
    },
    {
      "op_name": "Softmax_1319",
      "param_target": "attributes",
      "param_name": "axis",
      "values": 3
    },
    {
      "op_name": "Transpose_1322",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [2,0,1,3]
    },
    {
      "op_name": "Transpose_1444",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [1,2,0,3]
    },
    {
      "op_name": "Softmax_1449",
      "param_target": "attributes",
      "param_name": "axis",
      "values": 3
    },
    {
      "op_name": "Transpose_1452",
      "param_target": "attributes",
      "param_name": "perm",
      "values": [2,0,1,3]
    }
  ]
}

$ onnx2tf -i mobileformer.onnx -prf replace_kevinz8866.json -coto

$ onnx2tf -i mobileformer.onnx -prf replace_kevinz8866.json -rerf -coto

A characteristic of Transformer models is that they contain several blocks that the tool cannot convert correctly on its own, as shown in the figure below. Therefore, I checked the structure in Netron and made the same behavioral changes for every block with the same structure; most of the JSON is copied and pasted.
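Since the entries repeat with only the operation names changing, the JSON above could also be generated rather than copied by hand. The following sketch rebuilds the same 35 entries from the operation names listed in the JSON; the constant and function names are hypothetical, and the grouping into Transpose / Softmax / Transpose triples is inferred from the order of the entries.

```python
import json

# (Reshape op name, output tensor name) pairs taken from the JSON above.
RESHAPE_OUTPUTS = [
    ("Reshape_247", "onnx::Add_753"), ("Reshape_418", "onnx::Add_1015"),
    ("Reshape_588", "onnx::Add_1275"), ("Reshape_759", "onnx::Add_1537"),
    ("Reshape_929", "onnx::Add_1797"), ("Reshape_1098", "onnx::Add_2056"),
    ("Reshape_1269", "onnx::Add_2318"), ("Reshape_1439", "onnx::Add_2578"),
]

# (first Transpose, Softmax, second Transpose) per repeated block.
TRANSPOSE_SOFTMAX_BLOCKS = [
    ("Transpose_120", "Softmax_126", "Transpose_129"),
    ("Transpose_291", "Softmax_297", "Transpose_300"),
    ("Transpose_462", "Softmax_468", "Transpose_471"),
    ("Transpose_632", "Softmax_638", "Transpose_641"),
    ("Transpose_803", "Softmax_809", "Transpose_812"),
    ("Transpose_973", "Softmax_979", "Transpose_982"),
    ("Transpose_1142", "Softmax_1148", "Transpose_1151"),
    ("Transpose_1313", "Softmax_1319", "Transpose_1322"),
    ("Transpose_1444", "Softmax_1449", "Transpose_1452"),
]


def attribute_entry(op_name, param_name, values):
    """One entry that overrides an attribute of an operation."""
    return {"op_name": op_name, "param_target": "attributes",
            "param_name": param_name, "values": values}


def build_replacement_json():
    operations = [
        {"op_name": op_name, "param_target": "outputs",
         "param_name": output_name, "post_process_transpose_perm": [0, 2, 3, 1]}
        for op_name, output_name in RESHAPE_OUTPUTS
    ]
    for first_transpose, softmax, second_transpose in TRANSPOSE_SOFTMAX_BLOCKS:
        operations.append(attribute_entry(first_transpose, "perm", [1, 2, 0, 3]))
        operations.append(attribute_entry(softmax, "axis", 3))
        operations.append(attribute_entry(second_transpose, "perm", [2, 0, 1, 3]))
    return {"format_version": 1, "operations": operations}


config = build_replacement_json()
json_text = json.dumps(config, indent=2)
print(len(config["operations"]))  # 35
```

Writing json_text to replace_kevinz8866.json would give a file equivalent to the one above. For another MobileFormer variant the operation names would differ and would have to be read from Netron first.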


PINTO0309 commented on May 17, 2024

Because of the unpredictable transpositions in the tool's automatic transformations, I have implemented a number of enhancements to identify where the errors occur.

I believe that MobileFormer variants other than e9 can be converted with almost the same accuracy by changing the behavior of the tool based on the same criteria.

I will close this issue for now. If you successfully convert a model other than e9, I think many researchers and engineers would be pleased if you could submit a pull request with a sample JSON file here.
https://github.com/PINTO0309/onnx2tf/tree/main/json_samples


PINTO0309 commented on May 17, 2024

@kevinz8866 Fixed a fatal bug, allowing converted models to be output to Keras (.h5). However, trainable=False.
https://github.com/PINTO0309/onnx2tf/releases/tag/1.5.30
