Comments (14)
Hi @JindongJiang, could you please tell me how to convert a .npz model into a .caffemodel? Thanks in advance.
from tensorlayer.
Yes, by using caffe-tensorflow.
First, follow the instructions in caffe-tensorflow, e.g.:
convert.py VGG_ILSVRC_19_layers_deploy.prototxt --caffemodel VGG_ILSVRC_19_layers.caffemodel --data-output-path=vgg19.npy
You can get the prototxt and caffemodel from the Model Zoo.
This gives you vgg19.npy, which stores the pre-trained weights. Since this npy file has a different structure from the npz files commonly used in TensorLayer, you still need another step to extract the weights (Python 3):
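import numpy as np
import tensorlayer as tl

# vgg_weights_dir is the path to vgg19.npy; sess and network come from your TensorLayer graph.
# NumPy >= 1.16.3 also needs allow_pickle=True to load this pickled dict.
npy = np.load(vgg_weights_dir, encoding='latin1', allow_pickle=True)
params = []
# Layers are visited in sorted-name order (conv1_1, conv1_2, ...).
for val in sorted(npy.item().items()):
    if val[0] == 'conv5_2':
        break
    params.append(val[1]['weights'])
    params.append(val[1]['biases'])
tl.files.assign_params(sess, params, network)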
The if val[0] == 'conv5_2': break line stops the loop at conv5_2, so the weights of conv5_2 and all later layers are skipped. That is only for my own project.
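As a rough illustration of the flattening step above, here is a minimal sketch that runs without TensorFlow. It assumes the loaded dict maps layer names to {'weights': ..., 'biases': ...} as in the snippet; the helper name flatten_layer_params and the tiny fake layers are hypothetical, made up for the example.

```python
import numpy as np


def flatten_layer_params(layer_dict, stop_before=None):
    """Return [weights, biases, weights, biases, ...] in sorted layer-name order.

    Hypothetical helper: stops before the layer named `stop_before`, if given.
    """
    params = []
    for name, layer in sorted(layer_dict.items()):
        if name == stop_before:
            break
        params.append(layer['weights'])
        params.append(layer['biases'])
    return params


# Fake stand-in for npy.item(); real VGG19 arrays are much larger.
fake_layers = {
    'conv1_2': {'weights': np.ones((3, 3, 4, 4)), 'biases': np.ones(4)},
    'conv1_1': {'weights': np.zeros((3, 3, 3, 4)), 'biases': np.zeros(4)},
    'conv2_1': {'weights': np.ones((3, 3, 4, 8)), 'biases': np.ones(8)},
}

params = flatten_layer_params(fake_layers, stop_before='conv2_1')
print([p.shape for p in params])
```

Sorting by name matches the layer order for VGG-style names, but a name such as conv10_1 would sort before conv2_1, so the order is worth checking for other networks.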
from tensorlayer.
@JindongJiang What network graph for VGG19 do you use with that loading code?
I guess one might also need to transpose the Caffe weights, since TensorFlow/TensorLayer networks often use a (height, width, channels) layout instead of (channels, height, width) as in Caffe?
from tensorlayer.
@joekr552 The graph I use comes from one of the TensorLayer examples.
Without going deeper into caffe-tensorflow's code, I found by experimenting in a Jupyter notebook that caffe-tensorflow already does that for us.
from tensorlayer.
Aha, ok! thanks!
Did you check the BGR vs. RGB channel ordering as well?
from tensorlayer.
@joekr552 I just checked the caffe-tensorflow code for you. The good news is that they do transpose the channel, height and width axes.
def transform_data(self):
    if self.params is None:
        transformers = [
            # Reshape the parameters to TensorFlow's ordering
            DataReshaper({
                # (c_o, c_i, h, w) -> (h, w, c_i, c_o)
                NodeKind.Convolution: (2, 3, 1, 0),
                # (c_o, c_i) -> (c_i, c_o)
                NodeKind.InnerProduct: (1, 0)
            }),
            # Pre-process batch normalization data
            BatchNormPreprocessor(),
            # Convert parameters to dictionaries
            ParameterNamer(),
        ]
        self.graph = self.graph.transformed(transformers)
        self.params = {node.name: node.data for node in self.graph.nodes if node.data}
    return self.params
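The two permutations in the DataReshaper mapping above can be tried out with plain NumPy. This is only a sketch of what those axis tuples mean, not caffe-tensorflow's own code; the function names and the toy shapes are made up.

```python
import numpy as np


def conv_caffe_to_tf(kernel):
    # (c_o, c_i, h, w) -> (h, w, c_i, c_o)
    return np.transpose(kernel, (2, 3, 1, 0))


def fc_caffe_to_tf(weights):
    # (c_o, c_i) -> (c_i, c_o)
    return np.transpose(weights, (1, 0))


# Toy Caffe-style conv kernel: 6 output channels, 3 input channels, 5x7 window.
caffe_kernel = np.arange(6 * 3 * 5 * 7).reshape(6, 3, 5, 7)
tf_kernel = conv_caffe_to_tf(caffe_kernel)

print(tf_kernel.shape)  # (5, 7, 3, 6)
# The same value is now addressed as [h, w, c_i, c_o] instead of [c_o, c_i, h, w].
print(tf_kernel[4, 2, 1, 5] == caffe_kernel[5, 1, 4, 2])  # True
```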
But I can't find any code where they convert BGR to RGB, so I think they do not. That also explains why we have to do the conversion by hand in the TL vgg19 example:
if tf.__version__ <= '0.11':
    red, green, blue = tf.split(3, 3, rgb_scaled)
else:  # TF 1.0
    print(rgb_scaled)
    red, green, blue = tf.split(rgb_scaled, 3, 3)
assert red.get_shape().as_list()[1:] == [224, 224, 1]
assert green.get_shape().as_list()[1:] == [224, 224, 1]
assert blue.get_shape().as_list()[1:] == [224, 224, 1]
if tf.__version__ <= '0.11':
    bgr = tf.concat(3, [
        blue - VGG_MEAN[0],
        green - VGG_MEAN[1],
        red - VGG_MEAN[2],
    ])
else:
    bgr = tf.concat([
        blue - VGG_MEAN[0],
        green - VGG_MEAN[1],
        red - VGG_MEAN[2],
    ], axis=3)
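For reference, the same RGB-to-BGR swap with mean subtraction can be sketched in NumPy, outside any TensorFlow graph. The VGG_MEAN values below are the commonly used per-channel BGR means for VGG and are an assumption here, since the snippet above does not show them; the function name is hypothetical.

```python
import numpy as np

# Assumed BGR channel means (not shown in the thread).
VGG_MEAN = (103.939, 116.779, 123.68)


def rgb_to_centered_bgr(rgb_scaled, mean_bgr=VGG_MEAN):
    """Convert a batch of shape (N, H, W, 3) from RGB to mean-centered BGR.

    `rgb_scaled` is expected in the 0..255 range, like rgb_scaled above.
    """
    rgb_scaled = np.asarray(rgb_scaled, dtype=np.float32)
    bgr = rgb_scaled[..., ::-1]  # reverse the channel axis: RGB -> BGR
    return bgr - np.asarray(mean_bgr, dtype=np.float32)


# One 2x2 image whose pixels are all (R, G, B) = (10, 20, 30).
image = np.tile(np.array([10.0, 20.0, 30.0]), (1, 2, 2, 1))
out = rgb_to_centered_bgr(image, mean_bgr=(1.0, 2.0, 3.0))
print(out[0, 0, 0].tolist())  # [29.0, 18.0, 7.0]
```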
from tensorlayer.
thanks!
from tensorlayer.
Actually, I am a bit unsure whether the approach used to get the weights for the vgg16 example in TL is a good one. I don't think it is valid, whereas the approach you mentioned above does seem correct.
See my discussion post here: https://www.cs.toronto.edu/~frossard/post/vgg16/
from tensorlayer.
You're welcome. But the post in your link seems more like a tutorial than an issue. What exactly is your confusion?
from tensorlayer.
I am referring to my comment at the bottom of the page :
"Joel Kronander to Davi Frossard • a day ago
To me it seems weird/wrong to just reorganize the filter channels in the first layers?
Later layers assume that they were assigned in that order to specific color channels?!
Ie if a higher order filter wants to get high activity for say a red edge, by re-ordering first layer filter weights this is effectively turned into a blue edge"
My issue was that in the blog post the author said he post-processed the weights after running the caffe-tensorflow code, which included a step explained as: "..reorder the filters in the first layer so that they are in RGB order (instead of the default BGR. "
To me that is a strange thing to do, as subsequent higher-level filters will assume that the first-layer filters were in BGR order. An extreme example is that a "red cat" detector would become all confused.
The approach you mentioned above, where the input is first converted from RGB to BGR, should be correct though.
from tensorlayer.
@joekr552 That's a tricky yet reasonable concern, thank you so much for mentioning it here. But on second thought, I think the author's method may be sound. Let me put it this way:
Just look at the input layer I_rgb in RGB format, the weights W_rgb (conv kernel), and the first hidden layer H.
H = I_rgb * W_rgb
That is, every neuron in H receives some particular area of I convolved with W. So to get the same neuron values when the channel order of I changes, all we need to do is change the channel order of W in the same way.
Once we make sure that
I_bgr * W_bgr == I_rgb * W_rgb
in the first hidden layer, we know the later hidden layers won't face the problem you mentioned.
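This identity is easy to check numerically. A minimal sketch, assuming the TensorFlow kernel layout (h, w, c_i, c_o) and looking at a single receptive field; the helper name and the random data are made up for the example.

```python
import numpy as np


def first_layer_response(patch, kernel):
    """Response of every output filter to one image patch.

    patch:  (h, w, c_i)        one receptive field of the input image
    kernel: (h, w, c_i, c_o)   first-layer weights in TensorFlow layout
    """
    return np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2]))


rng = np.random.default_rng(0)
patch_rgb = rng.random((3, 3, 3))
kernel_rgb = rng.random((3, 3, 3, 4))

# Reverse the channel axis of the input and the input-channel axis of the kernel.
patch_bgr = patch_rgb[:, :, ::-1]
kernel_bgr = kernel_rgb[:, :, ::-1, :]

h_rgb = first_layer_response(patch_rgb, kernel_rgb)
h_bgr = first_layer_response(patch_bgr, kernel_bgr)
h_mismatch = first_layer_response(patch_bgr, kernel_rgb)

print(np.allclose(h_rgb, h_bgr))       # True: consistent reordering changes nothing
print(np.allclose(h_rgb, h_mismatch))  # False: reordering only the input does
```

Because H is unchanged when both are reordered together, every later layer sees exactly the same activations.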
from tensorlayer.
Hmm, yes, you are right. Thanks for the clarification.
from tensorlayer.
Hi @prabhjot-singh-gogana, sorry for my late response; my graduate applications are taking up much of my time.
I might not be so familiar with TensorLayer now since I have switched to TensorFlow and PyTorch. But I think ONNX could help with converting the model; you may want to try it.
from tensorlayer.
Hi @prabhjot-singh-gogana, have you solved your conversion problem yet? I am facing a similar issue converting a NumPy .npz model to .caffemodel. Can I discuss this problem with you? Many thanks!
from tensorlayer.