alexanderlutsenko / nobuco

187 stars · 8 watchers · 8 forks · 4.83 MB

Pytorch to Keras/Tensorflow conversion made intuitive

License: MIT License

Python 100.00%
converter keras pytorch tensorflow conversion deep-learning machine-learning model-conversion model-converter tensorflow-js

nobuco's People

Contributors

alexanderlutsenko, kokeshing, on-jungwoan

nobuco's Issues

Parameter count mismatch when converting `nn.TransformerEncoderLayer`

Hey @AlexanderLutsenko,

my apologies for bugging you so soon again after you resolved my other request.
I noticed that when converting an nn.TransformerEncoderLayer, the parameter counts are mismatched, even though the conversion proceeds without issue (all green according to nobuco).

The problem seems to come from linear1, which for some reason doesn't get constructed with the right dimensions (or as a regular Dense layer). The correct number of parameters would be 128 × 256 + 256 = 33,024. However, the resulting TensorFlow model constructs a layer of size 512 × 256 = 131,072, i.e. it appears to pick up the sequence length (512) instead of the feature dimension (128).

torchinfo's summary:

==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
TransformerEncoderLayer                  [1, 512, 128]             --
├─MultiheadAttention: 1-1                [1, 512, 128]             66,048
├─Dropout: 1-2                           [1, 512, 128]             --
├─LayerNorm: 1-3                         [1, 512, 128]             256
├─Linear: 1-4                            [1, 512, 256]             33,024
├─Dropout: 1-5                           [1, 512, 256]             --
├─Linear: 1-6                            [1, 512, 128]             32,896
├─Dropout: 1-7                           [1, 512, 128]             --
├─LayerNorm: 1-8                         [1, 512, 128]             256
==========================================================================================
Total params: 132,480
Trainable params: 132,480
Non-trainable params: 0
Total mult-adds (M): 0.07
==========================================================================================
Input size (MB): 0.26
Forward/backward pass size (MB): 2.62
Params size (MB): 0.27
Estimated Total Size (MB): 3.15
==========================================================================================

Keras's summary:

__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
==================================================================================================
 input_1 (InputLayer)        [(1, 128, 512)]              0         []                            
                                                                                                  
 tf.compat.v1.transpose_2 (  (1, 512, 128)                0         ['input_1[0][0]']             
 TFOpLambda)                                                                                      
                                                                                                  
 multi_head_attention_2 (Mu  (1, 512, 128)                66048     ['tf.compat.v1.transpose_2[0][
 ltiHeadAttention)                                                  0]',                          
                                                                     'tf.compat.v1.transpose_2[0][
                                                                    0]',                          
                                                                     'tf.compat.v1.transpose_2[0][
                                                                    0]']                          
                                                                                                  
 dropout_1 (Dropout)         (1, 512, 128)                0         ['multi_head_attention_2[1][0]
                                                                    ']                            
                                                                                                  
 tf.compat.v1.transpose_3 (  (1, 128, 512)                0         ['dropout_1[0][0]']           
 TFOpLambda)                                                                                      
                                                                                                  
 weight_layer_4 (WeightLaye  (1, 512, 256)                131072    ['input_1[0][0]']             
 r)                                                                                               
                                                                                                  
 tf.__operators__.add (TFOp  (1, 128, 512)                0         ['input_1[0][0]',             
 Lambda)                                                             'tf.compat.v1.transpose_3[0][
                                                                    0]']                          
                                                                                                  
 dropout_2 (Dropout)         (1, 512, 256)                0         ['weight_layer_4[0][0]']      
                                                                                                  
 tf.compat.v1.transpose_4 (  (1, 512, 128)                0         ['tf.__operators__.add[0][0]']
 TFOpLambda)                                                                                      
                                                                                                  
 dense_1 (Dense)             (1, 512, 128)                32896     ['dropout_2[0][0]']           
                                                                                                  
 layer_normalization (Layer  (1, 512, 128)                256       ['tf.compat.v1.transpose_4[0][
 Normalization)                                                     0]']                          
                                                                                                  
 dropout_3 (Dropout)         (1, 512, 128)                0         ['dense_1[0][0]']             
                                                                                                  
 tf.__operators__.add_1 (TF  (1, 512, 128)                0         ['layer_normalization[0][0]', 
 OpLambda)                                                           'dropout_3[0][0]']           
                                                                                                  
 layer_normalization_1 (Lay  (1, 512, 128)                256       ['tf.__operators__.add_1[0][0]
 erNormalization)                                                   ']                            
                                                                                                  
 tf.compat.v1.transpose_5 (  (1, 128, 512)                0         ['layer_normalization_1[0][0]'
 TFOpLambda)                                                        ]                             
                                                                                                  
 tf.identity (TFOpLambda)    (1, 128, 512)                0         ['tf.compat.v1.transpose_5[0][
                                                                    0]']                          
                                                                                                  
==================================================================================================
Total params: 230528 (900.50 KB)
Trainable params: 230528 (900.50 KB)
Non-trainable params: 0 (0.00 Byte)
__________________________________________________________________________________________________

To reproduce:

import torch
import torch.nn as nn
import nobuco
from nobuco import ChannelOrder
from torchinfo import summary

pytorch_module = nn.TransformerEncoderLayer(128, 4, dim_feedforward=256, batch_first=True).eval()
#pytorch_module = nn.TransformerEncoderLayer(128, 4, dim_feedforward=256, batch_first=True).linear1.eval()
dummy_image = torch.rand(size=(1, 512, 128))

print(pytorch_module(dummy_image).mean())
print(summary(pytorch_module, dummy_image.shape))
keras_model = nobuco.pytorch_to_keras(
    pytorch_module,
    args=[dummy_image], kwargs=None,
    inputs_channel_order=ChannelOrder.TENSORFLOW,
    outputs_channel_order=ChannelOrder.TENSORFLOW
)

print(keras_model.summary())

Any idea why this could happen?

Thank you for releasing this wonderful repository.

Thanks for your very clear insight and support on my repository issue. 😄

I just wanted to post this issue to say thanks. Sorry if this is a nuisance to you.
The purpose of the repository (onnx2tf) I was creating was to generate TensorFlow models from PyTorch, but your tool is much more complete. The source code is very clean and I am very impressed.

So, I have a question. I will of course consider opening a pull request to this repository someday, but may I use your clean source code as a reference and adapt parts of your implementation in my repository? Since the overall designs of the two tools are very different, I cannot reuse it exactly as-is, but your clean op conversion patterns are very helpful, for example the implementations of While-Loop and GridSample.

However, I would like to contribute to this repository someday, as I cannot rely on you all the time.
I am sorry for taking up so much of your valuable time.
Again, thanks.

Conversion of a convolution fails when the padding argument is 'valid' or default

First of all, thank you for developing this great project!

I found a case where the convolution conversion fails when the padding argument is 'valid' (nn.Conv*) or left as default (F.conv*).

UserWarning: Conversion exception on node 'Conv1d': The `padding` argument must be a tuple of 2 integers. Received: v
    raise Exception(f'Failed conversion: {self.original_node}')
Exception: Failed conversion: Conv1d(128, 128, kernel_size=(1,), stride=(1,), padding=valid)

UserWarning: Validation exception on node 'conv1d': Failed conversion: <built-in method conv1d of type object at 0x105101780>
Exception: Failed conversion: <built-in method conv1d of type object at 0x122109780>

ValueError: `padding` should have two elements. Received: valid.
    raise Exception(f'Failed conversion: {self.original_node}')
Exception: Failed conversion: Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), padding=valid)

UserWarning: Validation exception on node 'ErrorModel': Failed conversion: <built-in method conv2d of type object at 0x122101980>
Exception: Failed conversion: <built-in method conv2d of type object at 0x122101510>

The code to reproduce it is as follows:

import nobuco
import torch
import torch.nn as nn
import torch.nn.functional as F
from nobuco import ChannelOrder


class Model(nn.Module):
    def __init__(self):
        super().__init__()

        self.conv0_weight = nn.Parameter(torch.randn(128, 3, 3, 3))
        self.conv0_bias = nn.Parameter(torch.randn(128))

        self.conv1 = nn.Conv2d(128, 128, 1, 1, padding=0)

        self.conv2_weight = nn.Parameter(torch.randn(128, 128, 3))
        self.conv2_bias = nn.Parameter(torch.randn(128))

        self.conv3 = nn.Conv1d(128, 128, 1, 1, padding=0)

    def forward(self, x):
        x = F.conv2d(x, self.conv0_weight, self.conv0_bias, padding="same")
        x = F.relu(x)
        x = self.conv1(x)
        x = F.relu(x)

        x = x.reshape(x.shape[0], x.shape[1], -1)
        x = F.conv1d(x, self.conv2_weight, self.conv2_bias, padding="same")
        x = F.relu(x)
        x = self.conv3(x)
        x = F.relu(x)

        return x


class ErrorModel(nn.Module):
    def __init__(self):
        super().__init__()

        self.conv0_weight = nn.Parameter(torch.randn(128, 3, 3, 3))
        self.conv0_bias = nn.Parameter(torch.randn(128))

        self.conv1 = nn.Conv2d(128, 128, 1, 1, "valid")

        self.conv2_weight = nn.Parameter(torch.randn(128, 128, 3))
        self.conv2_bias = nn.Parameter(torch.randn(128))

        self.conv3 = nn.Conv1d(128, 128, 1, 1, "valid")

    def forward(self, x):
        x = F.conv2d(x, self.conv0_weight, self.conv0_bias)
        x = F.relu(x)
        x = self.conv1(x)
        x = F.relu(x)

        x = x.reshape(x.shape[0], x.shape[1], -1)
        x = F.conv1d(x, self.conv2_weight, self.conv2_bias)
        x = F.relu(x)
        x = self.conv3(x)
        x = F.relu(x)

        return x


def main():
    dummy_image = torch.rand(size=(1, 3, 64, 64))

    model = Model().eval()
    _ = nobuco.pytorch_to_keras(
        model,
        args=[dummy_image],
        kwargs=None,
        inputs_channel_order=ChannelOrder.TENSORFLOW,
        outputs_channel_order=ChannelOrder.TENSORFLOW,
    )

    error_model = ErrorModel().eval()
    _ = nobuco.pytorch_to_keras(
        error_model,
        args=[dummy_image],
        kwargs=None,
        inputs_channel_order=ChannelOrder.TENSORFLOW,
        outputs_channel_order=ChannelOrder.TENSORFLOW,
    )  # This will raise an error


if __name__ == "__main__":
    main()
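
As a stopgap until this is fixed in nobuco, one possible workaround (a sketch of mine, not from the thread) is to normalize the string padding on the modules before conversion; for nn.Conv* layers, 'valid' is equivalent to zero padding, which nobuco already handles. Note this does not cover the functional F.conv1d/F.conv2d default-padding case, which still needs a fix in the converter itself:

import torch.nn as nn

def replace_valid_padding(module: nn.Module):
    # 'valid' means no padding, so a numeric zero tuple is equivalent
    for m in module.modules():
        if isinstance(m, (nn.Conv1d, nn.Conv2d, nn.Conv3d)) and m.padding == 'valid':
            m.padding = (0,) * len(m.kernel_size)

error_model = ErrorModel().eval()
replace_valid_padding(error_model)  # the nn.Conv layers should now convert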

Keras symbolic inputs/outputs do not implement `__len__`

Hi! Your library is amazing, thank you so much!
I'm trying to convert the LLM https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b from PyTorch to TensorFlow. As usual, the input is dynamic. When I run:

keras_model = nobuco.pytorch_to_keras(
    model,
    args=[padded_input], kwargs=None,
    inputs_channel_order=ChannelOrder.TENSORFLOW,
    outputs_channel_order=ChannelOrder.TENSORFLOW
)

the conversion works flawlessly and the resulting keras model produces the same result as the original model, except that the input size of the keras model is fixed to whatever the size of the padded_input was.

If instead I run the conversion like so:

keras_model = nobuco.pytorch_to_keras(
    model,
    args=[padded_input], kwargs=None,
    input_shapes={padded_input: (1, None)},
    trace_shape=True,
    inputs_channel_order=ChannelOrder.TENSORFLOW,
    outputs_channel_order=ChannelOrder.TENSORFLOW
)

then it crashes towards the end of the conversion with error:
TypeError: Keras symbolic inputs/outputs do not implement __len__. You may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model. This error will also get raised if you try asserting a symbolic input/output directly.

Any pointers of what the problem might be?

Troubleshooting Evaluation and Compilation Issues with Keras_yolov5s.h5 Conversion

I am trying to create a Keras_yolov5s.h5 file using nobuco/example/yolo5.py and then convert it to spiking neural networks. For that, I need to evaluate keras_yolov5s.h5. When I run the example, the loaded model appears not to be compiled. How can I solve this issue?

Conversion complete. Elapsed time: 4.07 sec.
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. model.compile_metrics will be empty until you train or evaluate the model.
Model saved
WARNING:tensorflow:No training configuration found in the save file, so the model was not compiled. Compile it manually.
Model loaded
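
For what it's worth, both messages are standard TensorFlow notices: the saved file contains the architecture and weights but no training configuration, so the model must be compiled manually before evaluate() can run. A minimal sketch, where the loss, metrics, and evaluation data are placeholders to be replaced with ones matching your task:

import tensorflow as tf

keras_model = tf.keras.models.load_model('keras_yolov5s.h5')
keras_model.compile(optimizer='adam', loss='mse', metrics=['mae'])  # placeholder loss/metrics
keras_model.evaluate(x_val, y_val)  # x_val, y_val: your evaluation data (hypothetical names)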

TypeError: converter_mean() got an unexpected keyword argument 'keepdims'

This code raises an error:

x = x.mean(0,keepdims=True)

  File "/home/titanx/hengck/opt/anaconda3.9/lib/python3.9/site-packages/nobuco/converters/node_converter.py", line 48, in decorator
    converter_result_func = converter_func(*args, **kwargs)
TypeError: converter_mean() got an unexpected keyword argument 'keepdims'

But I can get rid of it using:
x = x.mean(0).unsqueeze(0)
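
The likely cause: PyTorch reductions accept both keepdim and the NumPy-style alias keepdims, while the bundled converter_mean only declares keepdim, as the error shows. A hedged sketch of a user-side override that tolerates the alias, following the custom-converter pattern used in other issues on this page (the exact target and signature here are my assumptions):

import nobuco
import tensorflow as tf
import torch

@nobuco.converter(torch.Tensor.mean, channel_ordering_strategy=nobuco.ChannelOrderingStrategy.MINIMUM_TRANSPOSITIONS)
def converter_mean(input, dim=None, keepdim=False, keepdims=None):
    if keepdims is not None:
        keepdim = keepdims  # tolerate the NumPy-style alias
    def func(input, *args, **kwargs):
        return tf.reduce_mean(input, axis=dim, keepdims=keepdim)
    return func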

When groups equal input channels then PyTorch Conv2D is mapped to Keras DepthwiseConv2D

Hi!

First of all, thank you so much for this work! Hats off :)

Then, I noticed that in nobuco/node_converters/convolution.py there is:

is_depthwise = groups == in_filters

which is then used to select the keras.layers.DepthwiseConvXD version of the ConvXD layers in Keras. However, this creates problems when the stride is not the same in height and width. For example, in PyTorch one can do:

import torch
torch.nn.Conv2d(
    in_channels=4,
    out_channels=4,
    kernel_size=(1, 3),
    stride=(1, 2),
    groups=4,
)

but the above, because of is_depthwise = groups == in_filters, will be translated to

import keras, numpy as np
x = np.random.rand(4, 10, 10, 12)
keras.layers.DepthwiseConv2D(
    (1, 3),
    strides=(1, 2),
)(x)

Which then will give:

InvalidArgumentError: Exception encountered when calling layer 'depthwise_conv2d_1' (type DepthwiseConv2D).

{{function_node __wrapped__DepthwiseConv2dNative_device_/job:localhost/replica:0/task:0/device:CPU:0}} Current implementation only supports equal length strides in the row and column dimensions. [Op:DepthwiseConv2dNative] name: 

Call arguments received by layer 'depthwise_conv2d_1' (type DepthwiseConv2D):
  • inputs=tf.Tensor(shape=(4, 10, 10, 12), dtype=float32)

But, if typical Conv2D is used instead of DepthwiseConv2D, like:

import keras, numpy as np
x = np.random.rand(4, 10, 10, 12)
keras.layers.Conv2D(
    filters=12,
    kernel_size=(1, 3),
    strides=(1, 2),
    groups=12
)(x)

then it will work.

I have quit using Keras since before Theano was deprecated (was that 2017?), and I'm not familiar with the actual difference between keras.layers.DepthwiseConv2D and keras.layers.Conv2D. But if it is just a matter of setting groups, then maybe either add one more check (e.g. is_stride_ok = all(stride[0] == s for s in stride)) or not use keras.layers.DepthwiseConv2D at all?

Any insights?

Thank you!
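
A minimal guard along the lines suggested above (a sketch of mine against nobuco/node_converters/convolution.py; it assumes stride is already a tuple at that point):

# before
is_depthwise = groups == in_filters
# after: only pick DepthwiseConv2D when TF supports it (equal row/column strides)
is_depthwise = groups == in_filters and len(set(stride)) == 1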

Add support for multidimensional aggregating function

First of all, thank you for your amazing job on this project; it comes in so handy for my projects.

Secondly, I found that the current default math conversion operations like sum, mean, etc. do not support multidimensional aggregation. For example, if the PyTorch code contains something like my_tensor.mean((2, 3)), nobuco throws an error during conversion.

TypeError: '<' not supported between instances of 'tuple' and 'int'

Code to reproduce:

import torch
import torch.nn as nn
import nobuco


class DummyModel(nn.Module):

    def __init__(self):
        super().__init__()

    def forward(self, x):
        return x.mean((2, 3), keepdim=True)


model = DummyModel()
dummy_image = torch.randn(1, 3, 100, 100)

keras_model = nobuco.pytorch_to_keras(
    model,
    args=[dummy_image]
)
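
For reference, the target TensorFlow op already supports this: tf.reduce_mean accepts a list of axes, so only the converter's dim handling needs fixing. A quick standalone check (mine, not from the thread):

import numpy as np
import tensorflow as tf
import torch

x = torch.randn(1, 3, 100, 100)
ref = x.mean((2, 3), keepdim=True).numpy()
out = tf.reduce_mean(tf.constant(x.numpy()), axis=[2, 3], keepdims=True).numpy()
print(np.abs(ref - out).max())  # agrees to float32 rounding error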

unnecessary python 3.9 limitation

Hi,
the str.removesuffix method arrived in Python 3.9 and adds an unnecessary version constraint.
Please consider switching:
__all__ = [basename(f).removesuffix('.py') for f in modules if isfile(f) and not f.endswith('__init__.py')]
to:
__all__ = [basename(f)[:-3] if f.endswith('.py') else basename(f) for f in modules if isfile(f) and not f.endswith('__init__.py')]

In addition, the library is not obviously compatible with all torch versions (torch.linalg only exists in newer releases, and _six is deprecated in newer torch versions). It would be great if the user got an explicit requirements.txt, plus an instruction for installing the library in developer mode (pip install -e .).
Best regards
Rayen

Imprecise conversion for custom log_softmax converter

Hello, I am working to convert LightGlue to Tensorflow (I ultimately want to get to TFlite) using nobuco + some help from ChatGPT to create some of the conversion functions ;)

I am still in the process of performing the conversion, but had a question. I'm seeing very imprecise conversion, and am not sure why this would be the case. I'm trying to rule out any issues in my implementation.

Here is the conversion function I am using for log_softmax:

import nobuco
import tensorflow as tf
import torch
from nobuco import ChannelOrderingStrategy

@nobuco.converter(torch.nn.functional.log_softmax, channel_ordering_strategy=ChannelOrderingStrategy.MINIMUM_TRANSPOSITIONS)
def converter_log_softmax(input, dim, dtype=None):
    def func(input, dim, dtype=None):
        # Adjust 'dim' if it's negative to handle PyTorch's negative indexing
        if dim < 0:
            dim += len(input.shape)

        # Apply TensorFlow's log_softmax
        # If dtype is specified, cast the input tensor to this dtype first
        if dtype is not None:
            input = tf.cast(input, dtype)
        return tf.nn.log_softmax(input, axis=dim)

    return func

Nobuco is indicating a significant discrepancy for log_softmax in the log:

/usr/local/lib/python3.10/dist-packages/nobuco/converters/validation.py:55: RuntimeWarning: [<class 'lightglue.lightglue.TransformerLayer'>|LightGlue] conversion procedure might be incorrect: max. discrepancy for output #1 is 0.00010 (0.004%)
  warnings.warn(warn_string, category=RuntimeWarning)
/usr/local/lib/python3.10/dist-packages/nobuco/converters/validation.py:55: RuntimeWarning: [<class 'lightglue.lightglue.TransformerLayer'>|LightGlue] conversion procedure might be incorrect: max. discrepancy for output #0 is 0.00012 (0.005%)
  warnings.warn(warn_string, category=RuntimeWarning)
/usr/local/lib/python3.10/dist-packages/nobuco/converters/validation.py:55: RuntimeWarning: [<function log_softmax at 0x7bc0c800de10>|LightGlue->MatchAssignment] conversion procedure might be incorrect: max. discrepancy for output #0 is 38.75780 (103.477%)
  warnings.warn(warn_string, category=RuntimeWarning)
/usr/local/lib/python3.10/dist-packages/nobuco/converters/validation.py:55: RuntimeWarning: [<class 'lightglue.lightglue.MatchAssignment'>|LightGlue] conversion procedure might be incorrect: max. discrepancy for output #0 is 38.75780 (43.686%)
  warnings.warn(warn_string, category=RuntimeWarning)

Here is the code snippet calling log_softmax:

# Original implementation
# def sigmoid_log_double_softmax(
#     sim: torch.Tensor, z0: torch.Tensor, z1: torch.Tensor
# ) -> torch.Tensor:
#     """create the log assignment matrix from logits and similarity"""
#     b, m, n = sim.shape
#     certainties = F.logsigmoid(z0) + F.logsigmoid(z1).transpose(1, 2)
#     scores0 = F.log_softmax(sim, 2)
#     scores1 = F.log_softmax(sim.transpose(-1, -2).contiguous(), 2).transpose(-1, -2)
#     scores = sim.new_full((b, m + 1, n + 1), 0)
#     scores[:, :m, :n] = scores0 + scores1 + certainties
#     scores[:, :-1, -1] = F.logsigmoid(-z0.squeeze(-1))
#     scores[:, -1, :-1] = F.logsigmoid(-z1.squeeze(-1))
#     return scores

# My implementation with some modifications to eliminate slicing
def sigmoid_log_double_softmax(sim: torch.Tensor, z0: torch.Tensor, z1: torch.Tensor) -> torch.Tensor:
    """create the log assignment matrix from logits and similarity"""
    b, m, n = sim.shape

    # Calculate certainties and scores0, scores1 as before
    certainties = F.logsigmoid(z0) + F.logsigmoid(z1).transpose(1, 2)
    scores0 = F.log_softmax(sim, 2)
    scores1 = F.log_softmax(sim.transpose(-1, -2).contiguous(), 2).transpose(-1, -2)

    # Create scores tensor
    scores = sim.new_full((b, m + 1, n + 1), 0)

    # Merge the scores0, scores1, and certainties into scores without slice assignment
    scores_main = scores0 + scores1 + certainties
    scores[:, :m, :n] = scores_main

    # Compute the scores for the last column and row
    last_col_scores = F.logsigmoid(-z0.squeeze(-1)).unsqueeze(2)
    last_row_scores = F.logsigmoid(-z1.squeeze(-1)).unsqueeze(1)

    # Update last column and row in scores
    scores[:, :-1, -1:] = last_col_scores
    scores[:, -1:, :-1] = last_row_scores

    return scores
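
To rule out the op itself, a quick standalone comparison of the two log_softmax implementations may help (my sketch, not from the notebook); if this agrees to float rounding error, the discrepancy comes from elsewhere in the converted graph rather than from the converter above:

import numpy as np
import tensorflow as tf
import torch
import torch.nn.functional as F

x = torch.randn(1, 64, 64)
ref = F.log_softmax(x, dim=2).numpy()
out = tf.nn.log_softmax(tf.constant(x.numpy()), axis=2).numpy()
print(np.abs(ref - out).max())  # expected: ~1e-6 or smaller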

I also have a colab notebook with my progress so far:
https://colab.research.google.com/gist/coxep/65ac46a1edc6d262c302efa1813625df/demo.ipynb

Thank you for any assistance :)

Tensorflow warns that variables were used in Lambda layers but are not present in tracked objects

First of all, I want to thank you for this absolutely wonderful converter 🙏. The process to get started, the documentation and the entire approach are just wonderful. Converting the unsupported layers with the clear error messages works really, really well, and the precision evaluation is simply fantastic.

There's one thing I stumble over that I can't quite figure out. When calling keras_model.predict(x) the first time, TensorFlow initializes the model and complains:

WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.linalg.matmul), but are not present in its tracked objects:   <tf.Variable 'weight:0' shape=(513, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.linalg.matmul), but are not present in its tracked objects:   <tf.Variable 'weight:0' shape=(513, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.reshape_3), but are not present in its tracked objects:   <tf.Variable 'weight:0' shape=(2, 1, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.reshape_3), but are not present in its tracked objects:   <tf.Variable 'weight:0' shape=(2, 1, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.reshape_2), but are not present in its tracked objects:   <tf.Variable 'weight:0' shape=(2, 1, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.reshape_2), but are not present in its tracked objects:   <tf.Variable 'weight:0' shape=(2, 1, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.identity_1), but are not present in its tracked objects:   <tf.Variable 'weight:0' shape=(1, 27, 1) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.identity_1), but are not present in its tracked objects:   <tf.Variable 'weight:0' shape=(1, 27, 1) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.math.multiply_51), but are not present in its tracked objects:   <tf.Variable 'weight:0' shape=(1,) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.math.multiply_51), but are not present in its tracked objects:   <tf.Variable 'weight:0' shape=(1,) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.

Since I'm only running my TF model for inference, this shouldn't make a difference from my understanding. But I'd still love to get rid of these warnings if possible. Do you have any advice on how to fix these?

[Question] Are dynamic axes supported?

Consider the following model: a simple MHSA layer. Its computation, however, depends on the length of the passed sequence.

The code below produces a Keras model that runs fine on fixed-shape input with fixed length L. However, when I try to run it with a new sequence length, the model forward pass fails with the error below.

import torch
import torch.nn as nn
import torch.nn.functional as F
from nobuco.convert.converter import pytorch_to_keras
import numpy as np

class MHSA(nn.Module):
    def __init__(self,
            embed_dim,
            out_dim,
            qk_dim,
            v_dim,
            num_head,
        ):
        super().__init__()
        self.embed_dim = embed_dim
        self.num_head  = num_head
        self.qk_dim = qk_dim
        self.v_dim  = v_dim

        self.q = nn.Linear(embed_dim, qk_dim*num_head)
        self.k = nn.Linear(embed_dim, qk_dim*num_head)
        self.v = nn.Linear(embed_dim, v_dim*num_head)
        
        self.out = nn.Linear(v_dim*num_head, out_dim)
        self.scale = 1/(qk_dim**0.5)

    def forward(self, x):
        L,dim = x.shape
        num_head = self.num_head
        qk_dim = self.qk_dim
        v_dim = self.v_dim
        
        q = self.q(x)
        k = self.k(x)
        v = self.v(x)
        q = q.reshape(L, num_head, qk_dim).permute(1,0,2).contiguous()
        k = k.reshape(L, num_head, qk_dim).permute(1,2,0).contiguous()
        v = v.reshape(L, num_head, v_dim ).permute(1,0,2).contiguous()

        dot = q * self.scale @ k  # H L L
        attn = F.softmax(dot, -1)    # L L

        v = torch.matmul(attn, v)  # L H dim
        v = v.permute(1,0,2).reshape(L, v_dim*num_head).contiguous()
        out = self.out(v)

        return out
if __name__ == "__main__":
    emb_dim = 128
    out_dim = 200
    num_heads = 4
    qk_dim = emb_dim // num_heads
    L = 33
    model = MHSA(embed_dim=emb_dim, out_dim=out_dim, qk_dim=qk_dim, v_dim=qk_dim, num_head=num_heads)
    keras_model = pytorch_to_keras(
        model, args=[torch.rand(L, emb_dim)],
    )
    inp = np.random.rand(33, 128)
    print(keras_model(inp)) ## runs fine

    inp = np.random.rand(15, 128)
    keras_model(inp) ## fails

Error:

InvalidArgumentError: Exception encountered when calling layer 'tf.reshape' (type TFOpLambda).

{{function_node __wrapped__Reshape_device_/job:localhost/replica:0/task:0/device:GPU:0}} Input to reshape is a tensor with 1920 values, but the requested shape has 4224 [Op:Reshape]

Call arguments received by layer 'tf.reshape' (type TFOpLambda):
  • tensor=tf.Tensor(shape=(15, 128), dtype=float32)
  • shape=('33', '4', '32')
  • name=None

It looks like the model recorded the static shape of the input and doesn't support variable-length input. I'm new to Keras, and I want to ask if there is any possible solution?
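
Other issues on this page handle variable-length inputs by passing input_shapes with None on the dynamic axis together with trace_shape=True, which makes nobuco trace shape computations instead of baking in Python ints (here, L, num_head, and qk_dim are captured from x.shape at trace time). A hedged sketch using the current nobuco API rather than the nobuco.convert.converter path above:

import nobuco

dummy = torch.rand(L, emb_dim)
keras_model = nobuco.pytorch_to_keras(
    model,
    args=[dummy],
    input_shapes={dummy: (None, emb_dim)},  # mark the sequence axis as dynamic
    trace_shape=True,  # let reshape targets follow the runtime shape
)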

Convolution layers: Add support for `same` padding.

First of all, thanks for the package! It's amazing that something like this exists.

While trying to convert a model, I noticed that using 'same' padding breaks the conversion of convolution layers. Given that 'same' is supported as an argument by both the torch and the tf versions of the convolution layers, I would assume this should be a relatively easy fix.

import torch
import torch.nn as nn
import nobuco
from nobuco import ChannelOrder

dummy_image = torch.rand(size=(1, 3, 2048))

for padding in [0, 'same']:
    pytorch_module = nn.Conv1d(3, 10, 15, padding=padding)

    keras_model = nobuco.pytorch_to_keras(
        pytorch_module,
        args=[dummy_image], kwargs=None,
        inputs_channel_order=ChannelOrder.TENSORFLOW,
        outputs_channel_order=ChannelOrder.TENSORFLOW
    )

Convert pretrained .torch model (including weights)

Hi, I am new to your library and would like to quickly know whether I can convert a pre-trained torch model to Keras. From what I see in the README, what is converted is the model code, but if we have the weights in the .torch model, how do we carry them over to the new Keras model?

Maybe this question is very obvious, but I don't see the direct way to do it.

Thanks in advance
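
For what it's worth, nobuco converts an instantiated module, weights included, so loading the pre-trained weights into the PyTorch model before conversion should be enough. A minimal sketch, assuming the .torch file holds a standard state_dict (MyModel, the file name, and the dummy input shape are placeholders):

import torch
import nobuco

model = MyModel()  # placeholder: your architecture
model.load_state_dict(torch.load('pretrained.torch', map_location='cpu'))
model.eval()

keras_model = nobuco.pytorch_to_keras(model, args=[torch.rand(1, 3, 224, 224)])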

I am getting 'Unimplemented nodes' exception

I am trying to convert my Detr model, but I am getting this error:

Traceback (most recent call last):
  File "===/DETR/nobuco_script.py", line 32, in <module>
    keras_model = nobuco.pytorch_to_keras(
  File "==/DETR/env/lib/python3.10/site-packages/nobuco/convert.py", line 331, in pytorch_to_keras
    raise Exception('Unimplemented nodes')
Exception: Unimplemented nodes

using these checkpoints to convert:
CHECKPOINT = "TahaDouaji/detr-doc-table-detection"

pytorch version?

Thanks for your work.
When I try to run dynamic_shape.py, I get errors like the following:
AttributeError: module 'torch.nn.functional' has no attribute '_canonical_mask'
AttributeError: module 'torch' has no attribute 'fill'
etc...

I think this may be caused by the PyTorch version. Which version are you using?

Debugging tensor shapes when using dynamic axes

Hello, first of all thank you for this fantastic tool.

I am using it to convert the YOLO-World model from PyTorch to Tensorflow, but I am encountering a strange issue.

The conversion process completes without any issue, but when I change the size of the input tensor, inference crashes because of a shape mismatch somewhere in the operation graph.
When I do the same thing with the original PyTorch model, it works; when I convert the model using static input shapes, it works. It only fails when using a dynamic axis and changing the dimension in question.

After some investigation, it appears that a tensor has been "doubled" in size along this axis somewhere in the TF graph, for some reason.
I was lucky that the error happened near the end of the graph, so I managed to truncate the conversion just before the operation that crashes because of the shape mismatch. I can then see that some columns are repeated. The issue is that I don't know where the columns get doubled, and thus I can't really find or fix the issue.

As I am not able to create a minimal reproducible example for now, in this issue I am asking for ways to debug the output computational graph, for instance a way to get the computed shapes even when using dynamic shapes, or perhaps a way to insert "fake nodes" in order to step into the graph more easily with the Python debugger.

If I can find the source of the issue, I will open another issue for it specifically.
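
One generic technique that may help here (my suggestion, not something nobuco provides): since nobuco emits a functional Keras model, you can build a throwaway probe model that exposes intermediate tensors, then feed the resized input and bisect for the layer where the shape first diverges. The layer name below is hypothetical; pick real ones from keras_model.summary():

import tensorflow as tf

probe = tf.keras.Model(
    inputs=keras_model.inputs,
    outputs=keras_model.get_layer('tf.reshape_5').output,  # hypothetical layer name
)
print(probe(resized_input).shape)  # resized_input: the input that triggers the crash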

Depthwise convolution conversion fails?

Hi! This looks like a nice framework for pytorch->tflite conversion; however, I face a conversion issue when I try to convert a depthwise convolution to Keras, code below.

Env:

slime@slime:~$ pip freeze | grep -E "torch=|nobuco"
nobuco==0.1.1
torch==2.1.0.dev20230317+cu118

Error:

ValueError: Layer depthwise_conv1d weight shape (3, 32, 1) is not compatible with provided weight shape (3, 1, 32).

Reproducible example:

import torch.nn as nn
import torch

from nobuco.convert.converter import pytorch_to_keras
from nobuco.commons import ChannelOrder

class SimpleDepthwiseConvExample(nn.Module):
    def __init__(self, in_channels=32, out_channels=32, kernel_size=3, stride=1, n_groups=32):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size, stride, kernel_size//2, groups=n_groups)
    
    def forward(self, x):
        return self.conv(x)

if __name__ == '__main__':
    bs = 1
    n_channels = 32
    n_groups = n_channels
    # n_groups = 1
    features = 144
    inputs = [
        torch.rand(bs, n_channels, features)
    ]
    model = SimpleDepthwiseConvExample(
        in_channels=n_channels,
        out_channels=n_channels,
        n_groups=n_groups
    )
    keras_model = pytorch_to_keras(
        model, inputs, inputs_channel_order=ChannelOrder.PYTORCH
    )

I think I was able to fix it in nobuco/converters/impl.py by changing

weights = weights.transpose((2, 1, 0))

to

weights = weights.transpose((2, 0, 1))

After the change, the checker seems to pass.

Can you confirm that the error is on the side of the library?

nn.Conv2d and F.conv2d with groups == input_channels (DepthWise) generates PartitionedCall in tensorflow frozen_graph

When I define a model with depthwise convolutions (groups == input_channels), the model is converted successfully, but the tensorflow frozen_graph of this model cannot be converted to tensorflow.js. The problem is that keras.layers.Conv2D generates a PartitionedCall in the frozen_graph that cannot be converted to tensorflow.js.

I provide the python code to reproduce the problem:

import torch.nn as nn
import torch

from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2


import tensorflow as tf
import nobuco
from nobuco.commons import ChannelOrder, ChannelOrderingStrategy

class ExampleModel(nn.Module):
    def __init__(self, **kwargs):
        super(ExampleModel, self).__init__()
        self.layer1 = nn.Conv2d(16, 16, (3,3), (1,1), (0,0), (1,1), 16)
        self.layer2 = nn.ReLU()


    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        return x


model = ExampleModel()

# Put model in inference mode
model.eval()

x = torch.randn(1, 16, 113, 113, requires_grad=False)

keras_model = nobuco.pytorch_to_keras(
    model,
    args=[x], kwargs=None)

# Assuming 'model' is your Keras model
full_model = tf.function(lambda x: keras_model(x))
full_model = full_model.get_concrete_function(
    tf.TensorSpec(keras_model.inputs[0].shape, keras_model.inputs[0].dtype))

# Convert Keras model to frozen ConcreteFunction
frozen_func = convert_variables_to_constants_v2(full_model)
frozen_func.graph.as_graph_def()

# Print the input and output tensors
print("Frozen model inputs: ", frozen_func.inputs)
print("Frozen model outputs: ", frozen_func.outputs)

# Save frozen graph to disk
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                logdir='.',
                name='ExampleModel.pb',
                as_text=False)

Inspecting the ExampleModel.pb with Netron, this is what happens:

[Screenshot from 2024-03-26 14-17-34: Netron graph view showing the PartitionedCall node]

In order to fix this error, I made a custom nn.Conv2d converter:

import numbers

import keras
import tensorflow as tf
import torch.nn as nn
from torch import Tensor

import nobuco

@nobuco.converter(nn.Conv2d)
def converter_Conv2d(self, input: Tensor):
    weight = self.weight
    bias = self.bias
    groups = self.groups
    padding = self.padding
    stride = self.stride
    dilation = self.dilation

    out_filters, in_filters, kh, kw = weight.shape

    weights = weight.cpu().detach().numpy()

    if groups == 1:
        weights = tf.transpose(weights, (2, 3, 1, 0))
    else:
        weights = tf.transpose(weights, (2, 3, 0, 1))

    if bias is not None:
        biases = bias.cpu().detach().numpy()
        params = [weights, biases]
        use_bias = True
    else:
        params = [weights]
        use_bias = False

    if isinstance(dilation, numbers.Number):
        dilation = (dilation, dilation)

    if isinstance(padding, numbers.Number):
        padding = (padding, padding)

    pad_str = 'valid'
    pad_layer = None

    if padding == 'same':
        pad_str = 'same'
    elif padding != (0, 0):
        pad_layer = keras.layers.ZeroPadding2D(padding)

    if groups == 1:
        conv = keras.layers.Conv2D(filters=out_filters,
                                kernel_size=(kh, kw),
                                strides=stride,
                                padding=pad_str,
                                dilation_rate=dilation,
                                groups=groups,
                                use_bias=use_bias,
                                weights=params
                                )
    else:
        conv = keras.layers.DepthwiseConv2D(
                    kernel_size=(kh, kw),
                    strides=stride,
                    padding=pad_str,
                    use_bias=use_bias,
                    activation=None,
                    depth_multiplier=1,
                    weights=params,
                    dilation_rate=dilation,
                )

    def func(input):
        if pad_layer is not None:
            input = pad_layer(input)
        output = conv(input)
        return output
    return func

But I think it is probably better to fix this in the source code.

Much slower training after convert torch code to tf code

First of all, this is an amazing project; I found it converts and runs inference very well.
Then I tried to convert and train using TF. It works, but the Keras model still seems to have a static batch-size shape, like 1 or 128 depending on my dummy input for the conversion. I can train by setting trace_shape=True; if it is not set, training fails.
The training process works, but it is much slower than the original torch or TF code.
Could you give some suggestions on how I could speed up the training?

Custom initializer for tf weights

Hello there!

I tried to convert a model from PyTorch to Keras and got this:
An initializer for variable weight of type <dtype: 'complex64'> is required for layer weight_layer. Received: None.
So this is a common TF problem. And I see

const_layer = WeightLayer(weight.shape, weight.dtype)

self.weight = self.add_weight('weight', shape=weight_shape, dtype=weight_dtype)

here.
We have no way to affect weight creation.

Could we somehow make possible to push code for custom initializer like this:

def complex_initializer(base_initializer):
    f = base_initializer()

    def initializer(*args, dtype=tf.complex64, **kwargs):
        real = f(*args, **kwargs)
        imag = f(*args, **kwargs)
        return tf.complex(real, imag)

    return initializer
    
initializer = complex_initializer(tf.random_normal_initializer)

in nobuco.pytorch_to_keras(..., tf_weight_initializer=initializer, ...) call?

self.add_weight('weight', shape=weight_shape, dtype=weight_dtype, initializer=tf_weight_initializer)

Kind regards

ModuleNotFoundError: No module named 'keras.src.engine'

Traceback (most recent call last):
  File "/p/i/proj/waternet_tf.py", line 5, in <module>
    import nobuco
  File "/p/i/proj/.venv/lib/python3.12/site-packages/nobuco/__init__.py", line 1, in <module>
    from nobuco.converters.channel_ordering import t_pytorch2keras, t_keras2pytorch
  File "/p/i/proj/.venv/lib/python3.12/site-packages/nobuco/converters/channel_ordering.py", line 6, in <module>
    from nobuco.commons import ChannelOrder, TF_TENSOR_CLASSES
  File "/p/i/proj/.venv/lib/python3.12/site-packages/nobuco/commons.py", line 3, in <module>
    from keras.src.engine.keras_tensor import KerasTensor
ModuleNotFoundError: No module named 'keras.src.engine'

Both tensorflow and keras are installed.

Python 3.12.2
nobuco 0.12.0
keras 3.0.5
tensorflow 2.16.0rc0
torch 2.2.0
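
For context (a note of mine, not from the thread): keras.src.engine belongs to the Keras 2 package layout, and Keras 3 (installed here as 3.0.5) removed it. Until nobuco supports Keras 3, pinning the Keras 2 line will likely work, e.g.:

pip install "tensorflow==2.15.*" "keras==2.15.*"

(Note that TF 2.15 wheels may also require Python 3.11 or older, so the Python 3.12 environment above might need downgrading too.)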

Custom Softplus Layer and Einsum Error when load_model with Keras

Hi! Thanks so much for the amazing conversion tool! The readme doc is so well written 🎉

I am not an expert at all in Keras/TensorFlow, but I'm trying to convert a model from PyTorch to TF that uses a softplus activation. I've read in the TF docs that its implementation differs slightly from the PyTorch one, but I want to give it a try anyway. So I implemented a converter as you suggest:

import tensorflow as tf
import torch.nn.functional as F

import nobuco
from nobuco import ChannelOrderingStrategy

@nobuco.converter(F.softplus, channel_ordering_strategy=ChannelOrderingStrategy.MINIMUM_TRANSPOSITIONS)
def converter_softplus(input):
    def func(input):
        return tf.keras.activations.softplus(input)
    return func

with this custom converter, pytorch_to_keras(...) goes straight green on the whole model. Then I save the model with keras_model.save(f'{export_name}.h5').

So far so good!

Now I want to test the model by making an inference, so I use the following code:

# prompt
dummy_prompt_keras = "Harry Potter"
input_ids_keras = tokenizer(dummy_prompt_keras, return_tensors='tf').input_ids

# loading model
keras_model = tf.keras.models.load_model(f'{export_name}.h5')
#keras_model.summary()

# inference
out = keras_model.predict(input_ids_keras)

# output
print(out)

but I get an error on the line:

keras_model = tf.keras.models.load_model(f'{export_name}.h5')

the error stacktrace is:

"stack": "---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
...
...
File ~/miniconda3/envs/deepl/lib/python3.11/site-packages/nobuco/node_converters/linear.py:102, in converter_einsum.<locals>.func.<locals>.<lambda>(operands)
    100 equation = args[0]
    101 operands = args[1:]
--> 102 return keras.layers.Lambda(lambda operands: tf.einsum(equation, *operands))(operands)

AttributeError: Exception encountered when calling layer \"lambda_12\" (type Lambda).

'list' object has no attribute 'shape'

Call arguments received by layer \"lambda_12\" (type Lambda):
  • inputs=['tf.Tensor(shape=(None, None, 10), dtype=float32)', [['tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', ... ]]]
  • mask=None
  • training=None"
}

The lambda_12 layer corresponds to an einsum operation from einops, which I guess uses torch.einsum under the hood. I also tested with native torch.einsum, but got the same error. My custom softplus layer is supposed to output a tensor, which is then picked up by the einsum layer; however, that does not seem to be the case, as indicated by the stacktrace.

I'm using the following code for conversion:

keras_model = nobuco.pytorch_to_keras(
    model,
    args=[input_ids], kwargs=None,
    input_shapes={input_ids: (None, None)},
    inputs_channel_order=ChannelOrder.TENSORFLOW,
    outputs_channel_order=ChannelOrder.TENSORFLOW,
    constants_to_variables=False,
    trace_shape=True,
)

These are the library versions I'm using:

nobuco                       0.11.5
keras                        2.15.0
torch                        2.2.0+cu118
tensorflow                   2.15.0.post1

Do you have any idea or tips on how to get past this error? I'm sure I made some dumb mistake, but I'm struggling to spot it. If you could help, it would be very much appreciated.

Thanks again for the amazing work on this project!
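
One thing that might be worth trying (my suggestion, not from the thread): the failure occurs while deserializing a Lambda layer from HDF5, so saving in TensorFlow's SavedModel format, which stores the traced graph instead of re-building Lambda closures, may sidestep the load error:

keras_model.save(export_name)  # no .h5 suffix: writes a SavedModel directory
keras_model = tf.keras.models.load_model(export_name)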

LogSoftMax conversion

Hi,

I am getting an error due to the use of torch.nn.functional.log_softmax.
So I wrote the following converter for it:

import nobuco
import tensorflow as tf
import torch

@nobuco.converter(torch.nn.functional.log_softmax, channel_ordering_strategy=nobuco.ChannelOrderingStrategy.MINIMUM_TRANSPOSITIONS)
def log_softmax(input: torch.Tensor, dim: int = -1):
    return lambda input, dim=dim: tf.nn.log_softmax(input, axis=dim)

But I am still getting an error from the nobuco converter:
Exception: Failed conversion: <function log_softmax at 0x7fa24fdf9820>

This is the class where it fails:

class OutBlock(nn.Module):
    def __init__(self):
        super(OutBlock, self).__init__()
        self.conv = nn.Conv2d(256, 1, kernel_size=1)  # value head
        self.bn = nn.BatchNorm2d(1)
        self.fc1 = nn.Linear(10 * 10, 128)
        self.fc2 = nn.Linear(128, 1)

        self.conv1 = nn.Conv2d(256, 128, kernel_size=1)  # policy head
        self.bn1 = nn.BatchNorm2d(128)
        self.logsoftmax = nn.LogSoftmax(dim=1)
        self.fc = nn.Linear(10 * 10 * 128, 10 * 10 * 36)

    def forward(self, s):
        v = F.relu(self.bn(self.conv(s)))  # value head
        v = v.view(-1, 10 * 10)  # batch_size X channel X height X width
        v = F.relu(self.fc1(v))
        v = torch.tanh(self.fc2(v))

        p = F.relu(self.bn1(self.conv1(s)))  # policy head
        p = p.view(-1, 10 * 10 * 128)
        p = self.fc(p)
        p = self.logsoftmax(p).exp()
        return p, v

Thanks in advance!
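
A hedged guess (mine, not from the thread): the model invokes log_softmax through the nn.LogSoftmax module, whose forward passes extra internal arguments (such as _stacklevel) to F.log_softmax that the lambda above does not accept. Registering a converter for the module itself, following the nn.Conv2d module-converter pattern from the issue above, would sidestep that:

import nobuco
import tensorflow as tf
import torch.nn as nn

@nobuco.converter(nn.LogSoftmax, channel_ordering_strategy=nobuco.ChannelOrderingStrategy.MINIMUM_TRANSPOSITIONS)
def converter_LogSoftmax(self, input):
    dim = self.dim  # captured from the module at conversion time
    def func(input):
        return tf.nn.log_softmax(input, axis=dim)
    return func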
