alexanderlutsenko / nobuco
PyTorch to Keras/TensorFlow conversion made intuitive
License: MIT License
Do we have any instructions for Llama 2 conversion from PyTorch to TensorFlow?
Hey @AlexanderLutsenko,
my apologies for bugging you so soon again after you resolved my other request.
I noticed that when I'm converting an nn.TransformerEncoderLayer, the parameter counts are mismatched, despite the fact that the conversion proceeds without issue (all green according to nobuco).
The problem seems to come from linear1, which for some reason doesn't get constructed with the right dimensions (or as a normal Dense layer). The correct number of parameters would be 128 x 256 + 256 = 33,024. However, the resulting TensorFlow model constructs a layer of size 512 x 256 = 131,072.
torchinfo's summary:
==========================================================================================
Layer (type:depth-idx) Output Shape Param #
==========================================================================================
TransformerEncoderLayer [1, 512, 128] --
├─MultiheadAttention: 1-1 [1, 512, 128] 66,048
├─Dropout: 1-2 [1, 512, 128] --
├─LayerNorm: 1-3 [1, 512, 128] 256
├─Linear: 1-4 [1, 512, 256] 33,024
├─Dropout: 1-5 [1, 512, 256] --
├─Linear: 1-6 [1, 512, 128] 32,896
├─Dropout: 1-7 [1, 512, 128] --
├─LayerNorm: 1-8 [1, 512, 128] 256
==========================================================================================
Total params: 132,480
Trainable params: 132,480
Non-trainable params: 0
Total mult-adds (M): 0.07
==========================================================================================
Input size (MB): 0.26
Forward/backward pass size (MB): 2.62
Params size (MB): 0.27
Estimated Total Size (MB): 3.15
==========================================================================================
Keras's summary:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(1, 128, 512)] 0 []
tf.compat.v1.transpose_2 ( (1, 512, 128) 0 ['input_1[0][0]']
TFOpLambda)
multi_head_attention_2 (Mu (1, 512, 128) 66048 ['tf.compat.v1.transpose_2[0][
ltiHeadAttention) 0]',
'tf.compat.v1.transpose_2[0][
0]',
'tf.compat.v1.transpose_2[0][
0]']
dropout_1 (Dropout) (1, 512, 128) 0 ['multi_head_attention_2[1][0]
']
tf.compat.v1.transpose_3 ( (1, 128, 512) 0 ['dropout_1[0][0]']
TFOpLambda)
weight_layer_4 (WeightLaye (1, 512, 256) 131072 ['input_1[0][0]']
r)
tf.__operators__.add (TFOp (1, 128, 512) 0 ['input_1[0][0]',
Lambda) 'tf.compat.v1.transpose_3[0][
0]']
dropout_2 (Dropout) (1, 512, 256) 0 ['weight_layer_4[0][0]']
tf.compat.v1.transpose_4 ( (1, 512, 128) 0 ['tf.__operators__.add[0][0]']
TFOpLambda)
dense_1 (Dense) (1, 512, 128) 32896 ['dropout_2[0][0]']
layer_normalization (Layer (1, 512, 128) 256 ['tf.compat.v1.transpose_4[0][
Normalization) 0]']
dropout_3 (Dropout) (1, 512, 128) 0 ['dense_1[0][0]']
tf.__operators__.add_1 (TF (1, 512, 128) 0 ['layer_normalization[0][0]',
OpLambda) 'dropout_3[0][0]']
layer_normalization_1 (Lay (1, 512, 128) 256 ['tf.__operators__.add_1[0][0]
erNormalization) ']
tf.compat.v1.transpose_5 ( (1, 128, 512) 0 ['layer_normalization_1[0][0]'
TFOpLambda) ]
tf.identity (TFOpLambda) (1, 128, 512) 0 ['tf.compat.v1.transpose_5[0][
0]']
==================================================================================================
Total params: 230528 (900.50 KB)
Trainable params: 230528 (900.50 KB)
Non-trainable params: 0 (0.00 Byte)
__________________________________________________________________________________________________
To reproduce:
import torch
import torch.nn as nn
import nobuco
from nobuco import ChannelOrder
from torchinfo import summary

pytorch_module = nn.TransformerEncoderLayer(128, 4, dim_feedforward=256, batch_first=True).eval()
# pytorch_module = nn.TransformerEncoderLayer(128, 4, dim_feedforward=256, batch_first=True).linear1.eval()

dummy_image = torch.rand(size=(1, 512, 128))
print(pytorch_module(dummy_image).mean())
print(summary(pytorch_module, dummy_image.shape))

keras_model = nobuco.pytorch_to_keras(
    pytorch_module,
    args=[dummy_image], kwargs=None,
    inputs_channel_order=ChannelOrder.TENSORFLOW,
    outputs_channel_order=ChannelOrder.TENSORFLOW,
)
print(keras_model.summary())
Any idea why this could happen?
Thanks for your very clear insight and support on my repository issue. 😄
I just wanted to post this issue to say thanks. Sorry if this is a nuisance to you.
The purpose of the repository I was creating (onnx2tf) was also to generate TensorFlow models from PyTorch, but your tool is much more complete. The source code is very clean, and I am very impressed.
So, I have a question: I will of course consider issuing a pull request to this repository someday, but may I use your clean source code as a reference and borrow from your implementation in my repository? Since the overall design of the tools is very different, it is not possible to reuse it in exactly the same way, but clean op conversion patterns are very helpful, for example the implementations of While-Loop and GridSample.
However, I would like to contribute to this repository someday, as I cannot rely on you all the time.
I am sorry for taking up so much of your valuable time.
Again, thanks.
First of all, thank you for developing this great project!
I found a case where the convolution conversion fails when the padding argument is 'valid' (nn.Conv*) or left at its default (F.conv*).
UserWarning: Conversion exception on node 'Conv1d': The `padding` argument must be a tuple of 2 integers. Received: v
raise Exception(f'Failed conversion: {self.original_node}')
Exception: Failed conversion: Conv1d(128, 128, kernel_size=(1,), stride=(1,), padding=valid)
UserWarning: Validation exception on node 'conv1d': Failed conversion: <built-in method conv1d of type object at 0x105101780>
Exception: Failed conversion: <built-in method conv1d of type object at 0x122109780>
ValueError: `padding` should have two elements. Received: valid.
raise Exception(f'Failed conversion: {self.original_node}')
Exception: Failed conversion: Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), padding=valid)
UserWarning: Validation exception on node 'ErrorModel': Failed conversion: <built-in method conv2d of type object at 0x122101980>
Exception: Failed conversion: <built-in method conv2d of type object at 0x122101510>
The code to be reproduced is as follows:
import nobuco
import torch
import torch.nn as nn
import torch.nn.functional as F
from nobuco import ChannelOrder


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv0_weight = nn.Parameter(torch.randn(128, 3, 3, 3))
        self.conv0_bias = nn.Parameter(torch.randn(128))
        self.conv1 = nn.Conv2d(128, 128, 1, 1, padding=0)
        self.conv2_weight = nn.Parameter(torch.randn(128, 128, 3))
        self.conv2_bias = nn.Parameter(torch.randn(128))
        self.conv3 = nn.Conv1d(128, 128, 1, 1, padding=0)

    def forward(self, x):
        x = F.conv2d(x, self.conv0_weight, self.conv0_bias, padding="same")
        x = F.relu(x)
        x = self.conv1(x)
        x = F.relu(x)
        x = x.reshape(x.shape[0], x.shape[1], -1)
        x = F.conv1d(x, self.conv2_weight, self.conv2_bias, padding="same")
        x = F.relu(x)
        x = self.conv3(x)
        x = F.relu(x)
        return x


class ErrorModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv0_weight = nn.Parameter(torch.randn(128, 3, 3, 3))
        self.conv0_bias = nn.Parameter(torch.randn(128))
        self.conv1 = nn.Conv2d(128, 128, 1, 1, "valid")
        self.conv2_weight = nn.Parameter(torch.randn(128, 128, 3))
        self.conv2_bias = nn.Parameter(torch.randn(128))
        self.conv3 = nn.Conv1d(128, 128, 1, 1, "valid")

    def forward(self, x):
        x = F.conv2d(x, self.conv0_weight, self.conv0_bias)
        x = F.relu(x)
        x = self.conv1(x)
        x = F.relu(x)
        x = x.reshape(x.shape[0], x.shape[1], -1)
        x = F.conv1d(x, self.conv2_weight, self.conv2_bias)
        x = F.relu(x)
        x = self.conv3(x)
        x = F.relu(x)
        return x


def main():
    dummy_image = torch.rand(size=(1, 3, 64, 64))

    model = Model().eval()
    _ = nobuco.pytorch_to_keras(
        model,
        args=[dummy_image],
        kwargs=None,
        inputs_channel_order=ChannelOrder.TENSORFLOW,
        outputs_channel_order=ChannelOrder.TENSORFLOW,
    )

    error_model = ErrorModel().eval()
    _ = nobuco.pytorch_to_keras(
        error_model,
        args=[dummy_image],
        kwargs=None,
        inputs_channel_order=ChannelOrder.TENSORFLOW,
        outputs_channel_order=ChannelOrder.TENSORFLOW,
    )  # This will raise an error


if __name__ == "__main__":
    main()
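Until this is handled in the library, one possible user-side workaround (a hedged sketch, not nobuco's official fix; it only covers the nn.Conv* module case, not the bare F.conv* calls) is to normalize string paddings to explicit zeros before conversion, since 'valid' is equivalent to padding=0 in PyTorch:

import torch.nn as nn

def normalize_valid_padding(module: nn.Module) -> nn.Module:
    # Replace padding='valid' with an explicit all-zeros tuple on conv modules,
    # which the converter already handles.
    for m in module.modules():
        if isinstance(m, (nn.Conv1d, nn.Conv2d, nn.Conv3d)) and m.padding == 'valid':
            m.padding = (0,) * len(m.kernel_size)
    return module

# Usage: normalize before handing the model to nobuco
error_model = normalize_valid_padding(ErrorModel().eval())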
Hi! your library is amazing, thank you so much!
I'm trying to convert this LLM https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b from PyTorch to TensorFlow. As usual, the input is dynamic. When I run:
keras_model = nobuco.pytorch_to_keras(
    model,
    args=[padded_input], kwargs=None,
    inputs_channel_order=ChannelOrder.TENSORFLOW,
    outputs_channel_order=ChannelOrder.TENSORFLOW
)
the conversion works flawlessly and the resulting keras model produces the same result as the original model, except that the input size of the keras model is fixed to whatever the size of the padded_input was.
If instead I run the conversion like so:
keras_model = nobuco.pytorch_to_keras(
    model,
    args=[padded_input], kwargs=None,
    input_shapes={padded_input: (1, None)},
    trace_shape=True,
    inputs_channel_order=ChannelOrder.TENSORFLOW,
    outputs_channel_order=ChannelOrder.TENSORFLOW
)
then it crashes towards the end of the conversion with error:
TypeError: Keras symbolic inputs/outputs do not implement __len__. You may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model. This error will also get raised if you try asserting a symbolic input/output directly.
Any pointers of what the problem might be?
I am trying to create a Keras_yolov5s.h5 file using nobuco/example/yolo5.py and then convert it to spiking neural networks. At this point, the evaluation of keras_yolov5s.h5 is necessary. When I try to run the example, it seems like it doesn't compile. How can I solve this issue?
Conversion complete. Elapsed time: 4.07 sec.
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. model.compile_metrics will be empty until you train or evaluate the model.
Model saved
WARNING:tensorflow:No training configuration found in the save file, so the model was not compiled. Compile it manually.
Model loaded
This code raises an error:
x = x.mean(0,keepdims=True)
File "/home/titanx/hengck/opt/anaconda3.9/lib/python3.9/site-packages/nobuco/converters/node_converter.py", line 48, in decorator
converter_result_func = converter_func(*args, **kwargs)
TypeError: converter_mean() got an unexpected keyword argument 'keepdims'
But I can get rid of it using:
x = x.mean(0).unsqueeze(0)
Hi!
First of all, thank you so much for this work! Hats off :)
Then, I noticed that nobuco/node_converters/convolution.py contains:
is_depthwise = groups == in_filters
which is then used to select the keras.layers.DepthwiseConvXD version of the ConvXD layers in Keras. However, this creates problems when the stride is not the same in height and width. For example, in PyTorch one can do:
import torch

torch.nn.Conv2d(
    in_channels=4,
    out_channels=4,
    kernel_size=(1, 3),
    stride=(1, 2),
    groups=4,
)
but the above, because of the is_depthwise = groups == in_filters check, will be translated to
import keras, numpy as np

x = np.random.rand(4, 10, 10, 12)
keras.layers.DepthwiseConv2D(
    (1, 3),
    strides=(1, 2),
)(x)
Which then will give:
InvalidArgumentError: Exception encountered when calling layer 'depthwise_conv2d_1' (type DepthwiseConv2D).
{{function_node __wrapped__DepthwiseConv2dNative_device_/job:localhost/replica:0/task:0/device:CPU:0}} Current implementation only supports equal length strides in the row and column dimensions. [Op:DepthwiseConv2dNative] name:
Call arguments received by layer 'depthwise_conv2d_1' (type DepthwiseConv2D):
• inputs=tf.Tensor(shape=(4, 10, 10, 12), dtype=float32)
But if a regular Conv2D is used instead of DepthwiseConv2D, like:
import keras, numpy as np

x = np.random.rand(4, 10, 10, 12)
keras.layers.Conv2D(
    12,  # filters; must be divisible by groups
    (1, 3),
    strides=(1, 2),
    groups=12,
)(x)
then it will work.
I quit using Keras before Theano was deprecated (was that 2017?), and I'm not familiar with the actual difference between keras.layers.DepthwiseConv2D and keras.layers.Conv2D. But if it is just about setting the groups, then maybe either add one more check (e.g. is_stride_ok = all([stride[0] == x for x in stride]), as sketched below) or maybe not use keras.layers.DepthwiseConv at all?
Any insights?
Thank you!
First of all, thank you for your amazing job on this project; it comes in so handy for my projects.
Secondly, I found out that the current default math conversion operations like sum, mean, etc. do not support multidimensional aggregation. For example, if the PyTorch code contains something like my_tensor.mean((2, 3)), nobuco throws an error during conversion:
TypeError: '<' not supported between instances of 'tuple' and 'int'
Code to reproduce:
class DummyModel(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return x.mean((2, 3), keepdim=True)

model = DummyModel()
dummy_image = torch.randn(1, 3, 100, 100)
keras_model = nobuco.pytorch_to_keras(
    model,
    args=[dummy_image]
)
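For anyone hitting the same thing, a hedged converter sketch that tolerates tuple dims, assuming a user-registered converter can override nobuco's built-in handler for torch.Tensor.mean and ignoring channel-order subtleties:

import tensorflow as tf
import torch
import nobuco

@nobuco.converter(torch.Tensor.mean, channel_ordering_strategy=nobuco.ChannelOrderingStrategy.MINIMUM_TRANSPOSITIONS)
def converter_mean_multi_dim(input, dim=None, keepdim=False, dtype=None):
    def func(input, dim=None, keepdim=False, dtype=None):
        # tf.reduce_mean accepts a list of axes, so a tuple dim just needs passing through
        axis = list(dim) if isinstance(dim, (tuple, list)) else dim
        return tf.reduce_mean(input, axis=axis, keepdims=keepdim)
    return func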
Hi,
str.removesuffix was introduced in Python 3.9 and adds an unnecessary version constraint.
Please consider switching:
__all__ = [basename(f).removesuffix('.py') for f in modules if isfile(f) and not f.endswith('__init__.py')]
to:
__all__ = [basename(f)[:-3] if f.endswith('.py') else basename(f) for f in modules if isfile(f) and not f.endswith('__init__.py')]
In addition, the library is not intuitively compatible with all torch versions (linalg comes starting > 1.17, and _six is deprecated in newer torch versions). It would be great if the user got an explicit requirements.txt plus instructions for installing the library in developer mode (pip install -e .).
Best regards
Rayen
(Edit)
Hello, I am working to convert LightGlue to Tensorflow (I ultimately want to get to TFlite) using nobuco + some help from ChatGPT to create some of the conversion functions ;)
I am still in the process of performing the conversion, but had a question. I'm seeing very imprecise conversion, and am not sure why this would be the case. I'm trying to rule out any issues in my implementation.
Here is the conversion function I am using for log_softmax:
@converter(torch.nn.functional.log_softmax, channel_ordering_strategy=ChannelOrderingStrategy.MINIMUM_TRANSPOSITIONS)
def converter_log_softmax(input, dim, dtype=None):
    def func(input, dim, dtype=None):
        # Adjust 'dim' if it's negative to handle PyTorch's negative indexing
        if dim < 0:
            dim += len(input.shape)
        # Apply TensorFlow's log_softmax
        # If dtype is specified, cast the input tensor to this dtype first
        if dtype is not None:
            input = tf.cast(input, dtype)
        return tf.nn.log_softmax(input, axis=dim)
    return func
Nobuco is indicating a significant discrepancy for log_softmax in the log:
/usr/local/lib/python3.10/dist-packages/nobuco/converters/validation.py:55: RuntimeWarning: [<class 'lightglue.lightglue.TransformerLayer'>|LightGlue] conversion procedure might be incorrect: max. discrepancy for output #1 is 0.00010 (0.004%)
warnings.warn(warn_string, category=RuntimeWarning)
/usr/local/lib/python3.10/dist-packages/nobuco/converters/validation.py:55: RuntimeWarning: [<class 'lightglue.lightglue.TransformerLayer'>|LightGlue] conversion procedure might be incorrect: max. discrepancy for output #0 is 0.00012 (0.005%)
warnings.warn(warn_string, category=RuntimeWarning)
/usr/local/lib/python3.10/dist-packages/nobuco/converters/validation.py:55: RuntimeWarning: [<function log_softmax at 0x7bc0c800de10>|LightGlue->MatchAssignment] conversion procedure might be incorrect: max. discrepancy for output #0 is 38.75780 (103.477%)
warnings.warn(warn_string, category=RuntimeWarning)
/usr/local/lib/python3.10/dist-packages/nobuco/converters/validation.py:55: RuntimeWarning: [<class 'lightglue.lightglue.MatchAssignment'>|LightGlue] conversion procedure might be incorrect: max. discrepancy for output #0 is 38.75780 (43.686%)
warnings.warn(warn_string, category=RuntimeWarning)
Here is the code snippet calling log_softmax:
# Original implementation
# def sigmoid_log_double_softmax(
# sim: torch.Tensor, z0: torch.Tensor, z1: torch.Tensor
# ) -> torch.Tensor:
# """create the log assignment matrix from logits and similarity"""
# b, m, n = sim.shape
# certainties = F.logsigmoid(z0) + F.logsigmoid(z1).transpose(1, 2)
# scores0 = F.log_softmax(sim, 2)
# scores1 = F.log_softmax(sim.transpose(-1, -2).contiguous(), 2).transpose(-1, -2)
# scores = sim.new_full((b, m + 1, n + 1), 0)
# scores[:, :m, :n] = scores0 + scores1 + certainties
# scores[:, :-1, -1] = F.logsigmoid(-z0.squeeze(-1))
# scores[:, -1, :-1] = F.logsigmoid(-z1.squeeze(-1))
# return scores
# My implementation with some modifications to eliminate slicing
def sigmoid_log_double_softmax(sim: torch.Tensor, z0: torch.Tensor, z1: torch.Tensor) -> torch.Tensor:
"""create the log assignment matrix from logits and similarity"""
b, m, n = sim.shape
# Calculate certainties and scores0, scores1 as before
certainties = F.logsigmoid(z0) + F.logsigmoid(z1).transpose(1, 2)
scores0 = F.log_softmax(sim, 2)
scores1 = F.log_softmax(sim.transpose(-1, -2).contiguous(), 2).transpose(-1, -2)
# Create scores tensor
scores = sim.new_full((b, m + 1, n + 1), 0)
# Merge the scores0, scores1, and certainties into scores without slice assignment
scores_main = scores0 + scores1 + certainties
scores[:, :m, :n] = scores_main
# Compute the scores for the last column and row
last_col_scores = F.logsigmoid(-z0.squeeze(-1)).unsqueeze(2)
last_row_scores = F.logsigmoid(-z1.squeeze(-1)).unsqueeze(1)
# Update last column and row in scores
scores[:, :-1, -1:] = last_col_scores
scores[:, -1:, :-1] = last_row_scores
return scores
```
I also have a colab notebook with my progress so far:
https://colab.research.google.com/gist/coxep/65ac46a1edc6d262c302efa1813625df/demo.ipynb
Thank you for any assistance :)
First of all, I want to thank you for this absolutely wonderful converter 🙏. The process to get started, the documentation and the entire approach are just wonderful. Converting the unsupported layers with the clear error messages works really, really well, and the precision evaluation is simply fantastic.
There's one thing I stumble over that I can't quite figure out. When calling keras_model.predict(x) the first time, TensorFlow initializes the model and complains:
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.linalg.matmul), but are not present in its tracked objects: <tf.Variable 'weight:0' shape=(513, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.linalg.matmul), but are not present in its tracked objects: <tf.Variable 'weight:0' shape=(513, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.reshape_3), but are not present in its tracked objects: <tf.Variable 'weight:0' shape=(2, 1, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.reshape_3), but are not present in its tracked objects: <tf.Variable 'weight:0' shape=(2, 1, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.reshape_2), but are not present in its tracked objects: <tf.Variable 'weight:0' shape=(2, 1, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.reshape_2), but are not present in its tracked objects: <tf.Variable 'weight:0' shape=(2, 1, 128) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.identity_1), but are not present in its tracked objects: <tf.Variable 'weight:0' shape=(1, 27, 1) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.identity_1), but are not present in its tracked objects: <tf.Variable 'weight:0' shape=(1, 27, 1) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.math.multiply_51), but are not present in its tracked objects: <tf.Variable 'weight:0' shape=(1,) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
WARNING:tensorflow:The following Variables were used in a Lambda layer's call (tf.math.multiply_51), but are not present in its tracked objects: <tf.Variable 'weight:0' shape=(1,) dtype=float32>. This is a strong indication that the Lambda layer should be rewritten as a subclassed Layer.
Since I'm only running my TF model for inference, this shouldn't make a difference from my understanding. But I'd still love to get rid of these warnings if possible. Do you have any advice on how to fix these?
Consider the following model graph, a simple MHSA layer. It is, however, dependent on the length of the passed sequence. This code produces a Keras model that is capable of running fixed-shape input with a fixed length L. However, when I try to run it with a new sequence length, the model forward pass fails with the error shown below the code.
import torch
import torch.nn as nn
import torch.nn.functional as F
from nobuco.convert.converter import pytorch_to_keras
import numpy as np


class MHSA(nn.Module):
    def __init__(self,
                 embed_dim,
                 out_dim,
                 qk_dim,
                 v_dim,
                 num_head,
                 ):
        super().__init__()
        self.embed_dim = embed_dim
        self.num_head = num_head
        self.qk_dim = qk_dim
        self.v_dim = v_dim
        self.q = nn.Linear(embed_dim, qk_dim*num_head)
        self.k = nn.Linear(embed_dim, qk_dim*num_head)
        self.v = nn.Linear(embed_dim, v_dim*num_head)
        self.out = nn.Linear(v_dim*num_head, out_dim)
        self.scale = 1/(qk_dim**0.5)

    def forward(self, x):
        L, dim = x.shape
        num_head = self.num_head
        qk_dim = self.qk_dim
        v_dim = self.v_dim
        q = self.q(x)
        k = self.k(x)
        v = self.v(x)
        q = q.reshape(L, num_head, qk_dim).permute(1, 0, 2).contiguous()
        k = k.reshape(L, num_head, qk_dim).permute(1, 2, 0).contiguous()
        v = v.reshape(L, num_head, v_dim).permute(1, 0, 2).contiguous()
        dot = q * self.scale @ k   # H L L
        attn = F.softmax(dot, -1)  # L L
        v = torch.matmul(attn, v)  # L H dim
        v = v.permute(1, 0, 2).reshape(L, v_dim*num_head).contiguous()
        out = self.out(v)
        return out


if __name__ == "__main__":
    emb_dim = 128
    out_dim = 200
    num_heads = 4
    qk_dim = emb_dim // num_heads
    L = 33

    model = MHSA(embed_dim=emb_dim, out_dim=out_dim, qk_dim=qk_dim, v_dim=qk_dim, num_head=num_heads)
    keras_model = pytorch_to_keras(
        model, args=[torch.rand(L, emb_dim)],
    )

    inp = np.random.rand(33, 128)
    print(keras_model(inp))  # runs fine

    inp = np.random.rand(15, 128)
    keras_model(inp)  # fails
Error:
InvalidArgumentError: Exception encountered when calling layer 'tf.reshape' (type TFOpLambda).
{{function_node __wrapped__Reshape_device_/job:localhost/replica:0/task:0/device:GPU:0}} Input to reshape is a tensor with 1920 values, but the requested shape has 4224 [Op:Reshape]
Call arguments received by layer 'tf.reshape' (type TFOpLambda):
• tensor=tf.Tensor(shape=(15, 128), dtype=float32)
• shape=('33', '4', '32')
• name=None
It looks like the model recorded the static shape of the input and doesn't support variable-length input. I'm new to Keras, and I want to ask if there is any possible solution?
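Not a definitive answer, but a sketch of one thing that may help, under the assumption that the hard-coded reshape is what bakes the length into the graph: using -1 for the sequence axis keeps the reshape length-agnostic, so the traced graph should not pin L to 33 (possibly combined with trace_shape=True / input_shapes, which other issues here use for dynamic dimensions):

import torch

# Hypothetical values mirroring MHSA.forward above
num_head, qk_dim = 4, 32
x = torch.rand(15, num_head * qk_dim)  # sequence length no longer fixed to 33

# -1 lets the framework infer the sequence length at run time
q = x.reshape(-1, num_head, qk_dim).permute(1, 0, 2).contiguous()
print(q.shape)  # torch.Size([4, 15, 32])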
First of all, thanks for the package! It's amazing that something like this exists.
While trying to convert a model, I noticed that using same padding breaks the conversion of convolution layers. Given that same is supported as an argument by both the torch and the TF versions of convolution layers, I would assume this should be a relatively easy fix.
import torch
import torch.nn as nn
import nobuco
from nobuco import ChannelOrder

dummy_image = torch.rand(size=(1, 3, 2048))

for padding in [0, 'same']:
    pytorch_module = nn.Conv1d(3, 10, 15, padding=padding)
    keras_model = nobuco.pytorch_to_keras(
        pytorch_module,
        args=[dummy_image], kwargs=None,
        inputs_channel_order=ChannelOrder.TENSORFLOW,
        outputs_channel_order=ChannelOrder.TENSORFLOW
    )
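Until 'same' is handled, a hedged workaround for this specific case (stride 1, odd kernel) is to pass the equivalent explicit padding, which already converts fine:

import torch.nn as nn

kernel_size = 15
# For a stride-1 convolution with an odd kernel, padding='same' is equivalent
# to a symmetric padding of kernel_size // 2.
pytorch_module = nn.Conv1d(3, 10, kernel_size, padding=kernel_size // 2)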
Hi, I am new to your library and would quickly like to know whether I can convert a pre-trained torch model to Keras. From what I see in the README, what gets converted is the model code, but if we have the weights in the .torch model, how do we put them into the new Keras model?
Maybe this question is very obvious, but I don't see the direct way to do it.
Thanks in advance
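For what it's worth, a minimal sketch of the usual workflow, assuming a hypothetical MyModel class and a checkpoint path 'weights.torch': nobuco traces the instantiated module, so whatever weights are loaded into it are carried over to the Keras model.

import torch
import nobuco

model = MyModel()  # hypothetical model class
state_dict = torch.load('weights.torch', map_location='cpu')  # hypothetical checkpoint path
model.load_state_dict(state_dict)
model.eval()

dummy_input = torch.rand(1, 3, 224, 224)  # hypothetical input shape
keras_model = nobuco.pytorch_to_keras(model, args=[dummy_input])
keras_model.save('converted_model.h5')  # the Keras model now holds the pre-trained weights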
I am trying to convert my DETR model, but I am getting this error:
Traceback (most recent call last):
File "===/DETR/nobuco_script.py", line 32, in
keras_model = nobuco.pytorch_to_keras(
File "==/DETR/env/lib/python3.10/site-packages/nobuco/convert.py", line 331, in pytorch_to_keras
raise Exception('Unimplemented nodes')
Exception: Unimplemented nodes
using these checkpoints to convert:
CHECKPOINT = "TahaDouaji/detr-doc-table-detection"
I really appreciate your great work.
Secondly, I found out that the current conversion does not support torch.repeat_interleave. It seems that tf.repeat matches repeat_interleave; see the sketch after this note. I would be grateful if repeat_interleave conversion were supported officially.
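Until it is supported officially, a hedged converter sketch (assuming the plain case of an integer repeats and no output_size; tensor-valued repeats may need extra care):

import tensorflow as tf
import torch
import nobuco

@nobuco.converter(torch.repeat_interleave, channel_ordering_strategy=nobuco.ChannelOrderingStrategy.MINIMUM_TRANSPOSITIONS)
def converter_repeat_interleave(input, repeats, dim=None):
    def func(input, repeats, dim=None):
        # Like torch.repeat_interleave, tf.repeat flattens the tensor when no axis is given
        return tf.repeat(input, repeats, axis=dim)
    return func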
Thanks for your work.
When trying to run dynamic_shape.py, I got errors like the following:
AttributeError: module 'torch.nn.functional' has no attribute '_canonical_mask'
AttributeError: module 'torch' has no attribute 'fill'
etc...
I think this may be caused by the PyTorch version. Which version are you using?
Hello, first of all thank you for this fantastic tool.
I am using it to convert the YOLO-World model from PyTorch to Tensorflow, but I am encountering a strange issue.
The conversion process completes without any issue, but when I change the size of the input tensor, inference crashes because of a shape mismatch somewhere in the operation graph.
When I do the same thing with the original PyTorch model, it works; when I convert the model using static input shapes, it works. It is only when using a dynamic axis and changing the dimension in question that it fails.
After some investigation, it appears that a tensor has been "doubled" in size along this axis somewhere in the TF graph, for some reason.
I was lucky that the error happened near the end of the graph, so I managed to cut the conversion short just before the operation that crashes because of the shape mismatch. I can then see that some columns are repeated. The issue is that I don't know where the columns get doubled, and thus I can't really find or fix the issue.
As I am not able to create a minimal reproducible example for now, in this issue I am asking for ways to debug the output computational graph, for instance by having a way to get the computed shapes even when using dynamic shapes. Or perhaps a way to insert "fake nodes" in order to step into the graph more easily with the python debugger.
If I can find the source of the issue, I will open another issue for it specifically.
Hi! This looks like a nice framework for pytorch->tflite conversion; however, I face a conversion issue when I try to convert a depthwise convolution to Keras, code below.
Env:
slime@slime:~$ pip freeze | grep -E "torch=|nobuco"
nobuco==0.1.1
torch==2.1.0.dev20230317+cu118
Error:
ValueError: Layer depthwise_conv1d weight shape (3, 32, 1) is not compatible with provided weight shape (3, 1, 32).
Reproducible example:
import torch.nn as nn
import torch
from nobuco.convert.converter import pytorch_to_keras
from nobuco.commons import ChannelOrder


class SimpleDepthwiseConvExample(nn.Module):
    def __init__(self, in_channels=32, out_channels=32, kernel_size=3, stride=1, n_groups=32):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size, stride, kernel_size//2, groups=n_groups)

    def forward(self, x):
        return self.conv(x)


if __name__ == '__main__':
    bs = 1
    n_channels = 32
    n_groups = n_channels
    # n_groups = 1
    features = 144
    inputs = [
        torch.rand(bs, n_channels, features)
    ]
    model = SimpleDepthwiseConvExample(
        in_channels=n_channels,
        out_channels=n_channels,
        n_groups=n_groups
    )
    keras_model = pytorch_to_keras(
        model, inputs, inputs_channel_order=ChannelOrder.PYTORCH
    )
I think I was able to fix it in nobuco/converters/impl.py: change weights = weights.transpose((2, 1, 0)) to weights = weights.transpose((2, 0, 1)); after the change, the checker seems to pass.
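A quick shape check supporting the proposed fix (the expected shape comes from the error message above):

import numpy as np

# PyTorch depthwise Conv1d weight: (out_channels, in_channels // groups, kernel) = (32, 1, 3)
w = np.zeros((32, 1, 3))
print(w.transpose((2, 1, 0)).shape)  # (3, 1, 32) -> rejected by the Keras depthwise layer
print(w.transpose((2, 0, 1)).shape)  # (3, 32, 1) -> matches the expected (kernel, channels, 1)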
Can you confirm that the error is on the side of the library?
When I define a model with depthwise convolutions (groups == input_channels), the model is converted successfully, but the TensorFlow frozen graph of this model cannot be converted to TensorFlow.js. The problem is that keras.layers.Conv2D generates a PartitionedCall in the frozen graph that cannot be converted to TensorFlow.js.
I provide the Python code to reproduce the problem:
import torch.nn as nn
import torch
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
import tensorflow as tf
import nobuco
from nobuco.commons import ChannelOrder, ChannelOrderingStrategy


class ExampleModel(nn.Module):
    def __init__(self, **kwargs):
        super(ExampleModel, self).__init__()
        self.layer1 = nn.Conv2d(16, 16, (3, 3), (1, 1), (0, 0), (1, 1), 16)
        self.layer2 = nn.ReLU()

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        return x


model = ExampleModel()
# Put model in inference mode
model.eval()

x = torch.randn(1, 16, 113, 113, requires_grad=False)
keras_model = nobuco.pytorch_to_keras(
    model,
    args=[x], kwargs=None)

# Assuming 'model' is your Keras model
full_model = tf.function(lambda x: keras_model(x))
full_model = full_model.get_concrete_function(
    tf.TensorSpec(keras_model.inputs[0].shape, keras_model.inputs[0].dtype))

# Convert Keras model to frozen ConcreteFunction
frozen_func = convert_variables_to_constants_v2(full_model)
frozen_func.graph.as_graph_def()

# Print the input and output tensors
print("Frozen model inputs: ", frozen_func.inputs)
print("Frozen model outputs: ", frozen_func.outputs)

# Save frozen graph to disk
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir='.',
                  name='ExampleModel.pb',
                  as_text=False)
Inspecting the ExampleModel.pb with Netron shows the problem (screenshot omitted).
In order to fix this error, I made a custom nn.Conv2d converter:
import numbers

import tensorflow as tf
from tensorflow import keras
import torch.nn as nn
from torch import Tensor
import nobuco


@nobuco.converter(nn.Conv2d)
def converter_Conv2d(self, input: Tensor):
    weight = self.weight
    bias = self.bias
    groups = self.groups
    padding = self.padding
    stride = self.stride
    dilation = self.dilation

    out_filters, in_filters, kh, kw = weight.shape
    weights = weight.cpu().detach().numpy()
    if groups == 1:
        weights = tf.transpose(weights, (2, 3, 1, 0))
    else:
        weights = tf.transpose(weights, (2, 3, 0, 1))

    if bias is not None:
        biases = bias.cpu().detach().numpy()
        params = [weights, biases]
        use_bias = True
    else:
        params = [weights]
        use_bias = False

    if isinstance(dilation, numbers.Number):
        dilation = (dilation, dilation)
    if isinstance(padding, numbers.Number):
        padding = (padding, padding)

    pad_str = 'valid'
    pad_layer = None
    if padding == 'same':
        pad_str = 'same'
    elif padding != (0, 0):
        pad_layer = keras.layers.ZeroPadding2D(padding)

    if groups == 1:
        conv = keras.layers.Conv2D(filters=out_filters,
                                   kernel_size=(kh, kw),
                                   strides=stride,
                                   padding=pad_str,
                                   dilation_rate=dilation,
                                   groups=groups,
                                   use_bias=use_bias,
                                   weights=params)
    else:
        conv = keras.layers.DepthwiseConv2D(kernel_size=(kh, kw),
                                            strides=stride,
                                            padding=pad_str,
                                            use_bias=use_bias,
                                            activation=None,
                                            depth_multiplier=1,
                                            weights=params,
                                            dilation_rate=dilation)

    def func(input):
        if pad_layer is not None:
            input = pad_layer(input)
        output = conv(input)
        return output
    return func
But I think it would probably be better to fix this in the source code.
First, this is an amazing project; I found it converts and infers very well.
Then I tried to convert and train using TF. It works, but the Keras model still shows a static batch-size shape, like 1 or 128, depending on my dummy input for conversion. I can train by setting trace_shape=True; if it is not set, training fails.
The training process works, but it is much slower than the original torch or TF code.
Could you give some suggestions on how I could speed up the training?
Hello there!
I tried to convert model from pytorch to keras and got such thing:
An initializer for variable weight of type <dtype: 'complex64'> is required for layer weight_layer. Received: None.
So this is a common TF problem. And I see these lines:
nobuco/nobuco/layers/weight.py, line 24 (at commit f0f08d9)
nobuco/nobuco/layers/weight.py, line 12 (at commit f0f08d9)
Could we somehow make it possible to pass a custom initializer like this:
def complex_initializer(base_initializer):
    f = base_initializer()

    def initializer(*args, dtype=tf.complex64, **kwargs):
        real = f(*args, **kwargs)
        imag = f(*args, **kwargs)
        return tf.complex(real, imag)

    return initializer

initializer = complex_initializer(tf.random_normal_initializer)
in a nobuco.pytorch_to_keras(..., tf_weight_initializer=initializer, ...) call? The weight layer could then do:
self.add_weight('weight', shape=weight_shape, dtype=weight_dtype, initializer=tf_weight_initializer)
Kind regards
Traceback (most recent call last):
File "/p/i/proj/waternet_tf.py", line 5, in <module>
import nobuco
File "/p/i/proj/.venv/lib/python3.12/site-packages/nobuco/__init__.py", line 1, in <module>
from nobuco.converters.channel_ordering import t_pytorch2keras, t_keras2pytorch
File "/p/i/proj/.venv/lib/python3.12/site-packages/nobuco/converters/channel_ordering.py", line 6, in <module>
from nobuco.commons import ChannelOrder, TF_TENSOR_CLASSES
File "/p/i/proj/.venv/lib/python3.12/site-packages/nobuco/commons.py", line 3, in <module>
from keras.src.engine.keras_tensor import KerasTensor
ModuleNotFoundError: No module named 'keras.src.engine'
Both tensorflow and keras are installed.
Python 3.12.2
nobuco 0.12.0
keras 3.0.5
tensorflow 2.16.0rc0
torch 2.2.0
Hi! Thanks so much for the amazing conversion tool! The readme doc is so well written 🎉
I am not an expert at all in Keras/Tensorflow, but I'm trying to convert a model from pytorch to tf that is using a softplus activation. I've read in tf docs that there is a slightly different implementation from the pytorch one, but I want to give it a try anyway. So I implemented a converter as you suggest:
@nobuco.converter(F.softplus, channel_ordering_strategy=ChannelOrderingStrategy.MINIMUM_TRANSPOSITIONS)
def converter_softplus(input):
    def func(input):
        return tf.keras.activations.softplus(input)
    return func
With this custom converter, pytorch_to_keras(...) goes straight green on the whole model. Then I save the model with keras_model.save(f'{export_name}.h5').
So far so good!
Now I want to test the model by making an inference, so I use the following code:
# prompt
dummy_prompt_keras = "Harry Potter"
input_ids_keras = tokenizer(dummy_prompt_keras, return_tensors='tf').input_ids

# loading model
keras_model = tf.keras.models.load_model(f'{export_name}.h5')
# keras_model.summary()

# inference
out = keras_model.predict(input_ids_keras)

# output
print(out)
but I get an error on the line:
keras_model = tf.keras.models.load_model(f'{export_name}.h5')
the error stacktrace is:
"stack": "---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
...
...
File ~/miniconda3/envs/deepl/lib/python3.11/site-packages/nobuco/node_converters/linear.py:102, in converter_einsum.<locals>.func.<locals>.<lambda>(operands)
100 equation = args[0]
101 operands = args[1:]
--> 102 return keras.layers.Lambda(lambda operands: tf.einsum(equation, *operands))(operands)
AttributeError: Exception encountered when calling layer \"lambda_12\" (type Lambda).
'list' object has no attribute 'shape'
Call arguments received by layer \"lambda_12\" (type Lambda):
• inputs=['tf.Tensor(shape=(None, None, 10), dtype=float32)', [['tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', ... ]]]
• mask=None
• training=None"
}
The lambda_12 layer relates to an einsum operation from einops, which I guess uses torch.einsum under the hood. I also tested with native torch.einsum but got the same error. So, my custom softplus layer is supposed to output a Tensor, which is then picked up by the einsum layer. However, this does not seem to be the case, as indicated by the stacktrace.
I'm using the following code for conversion:
keras_model = nobuco.pytorch_to_keras(
    model,
    args=[input_ids], kwargs=None,
    input_shapes={input_ids: (None, None)},
    inputs_channel_order=ChannelOrder.TENSORFLOW,
    outputs_channel_order=ChannelOrder.TENSORFLOW,
    constants_to_variables=False,
    trace_shape=True,
)
This is the libs version I'm using:
nobuco 0.11.5
keras 2.15.0
torch 2.2.0+cu118
tensorflow 2.15.0.post1
Do you have any idea or any tips to get along with this error? I'm sure there is some dumb mistake I made, but I'm struggling to spot it. If you could help It would be very much appreciated.
Thanks again for the amazing work on this project!
Hi,
I am getting the following error due to the use of torch.nn.functional.log_softmax.
So I wrote the following conversion for it:
@nobuco.converter(torch.nn.functional.log_softmax, channel_ordering_strategy=nobuco.ChannelOrderingStrategy.MINIMUM_TRANSPOSITIONS)
def log_softmax(input: torch.Tensor, dim: int = -1):
    return lambda input, dim=dim: tf.nn.log_softmax(input, axis=dim)
But I am still getting an error from the nobuco converter:
Exception: Failed conversion: <function log_softmax at 0x7fa24fdf9820>
This is the class where it fails:
class OutBlock(nn.Module):
    def __init__(self):
        super(OutBlock, self).__init__()
        self.conv = nn.Conv2d(256, 1, kernel_size=1)  # value head
        self.bn = nn.BatchNorm2d(1)
        self.fc1 = nn.Linear(10 * 10, 128)
        self.fc2 = nn.Linear(128, 1)

        self.conv1 = nn.Conv2d(256, 128, kernel_size=1)  # policy head
        self.bn1 = nn.BatchNorm2d(128)
        self.logsoftmax = nn.LogSoftmax(dim=1)
        self.fc = nn.Linear(10 * 10 * 128, 10 * 10 * 36)

    def forward(self, s):
        v = F.relu(self.bn(self.conv(s)))  # value head
        v = v.view(-1, 10 * 10)  # batch_size X channel X height X width
        v = F.relu(self.fc1(v))
        v = torch.tanh(self.fc2(v))

        p = F.relu(self.bn1(self.conv1(s)))  # policy head
        p = p.view(-1, 10 * 10 * 128)
        p = self.fc(p)
        p = self.logsoftmax(p).exp()
        return p, v
Thanks in advance!