tscohen / groupy
Group Equivariant Convolutional Neural Networks
Home Page: http://ta.co.nl
License: Other
Hi Dr. Cohen - thanks so much for providing the GrouPy and gconv_experiments repos.
I was wondering about the correct way to implement a coset max-pool on the output of y in your Tensorflow example:
# Construct graph
x = tf.placeholder(tf.float32, [None, 9, 9, 3])
gconv_indices, gconv_shape_info, w_shape = gconv2d_util(
h_input='Z2', h_output='D4', in_channels=3, out_channels=64, ksize=3)
w = tf.Variable(tf.truncated_normal(w_shape, stddev=1.))
y = gconv2d(input=x, filter=w, strides=[1, 1, 1, 1], padding='SAME',
gconv_indices=gconv_indices, gconv_shape_info=gconv_shape_info)
gconv_indices, gconv_shape_info, w_shape = gconv2d_util(
h_input='D4', h_output='D4', in_channels=64, out_channels=64, ksize=3)
w = tf.Variable(tf.truncated_normal(w_shape, stddev=1.))
y = gconv2d(input=y, filter=w, strides=[1, 1, 1, 1], padding='SAME',
gconv_indices=gconv_indices, gconv_shape_info=gconv_shape_info)
...
print(y.shape) # (10, 9, 9, 512)
My understanding is that the last dimension, 512, comes from the 64 output channels of the last layer multiplied by 8 (the number of output transformations: 4 rotations, each with and without a flip, so 8 in total for D4). I assume I'd want to apply a coset max-pool to this output, so that the dimensions are (10, 9, 9, 64) before feeding it to the next layer. (Is this assumption correct?)
I'm not very familiar with Chainer and I'm having a bit of trouble translating the Chainer code in gconv_experiments to TensorFlow. I'd appreciate any guidance on recreating the following lines from your paper:
"Next, we replaced each convolution by a p4-convolution (eq. 10 and 11...and added max-pooling over rotations after the last convolution layer."
and
"We took the Z2CNN, replaced each convolution layer by a p4-convolution (eq. 10) followed by a coset max-pooling over rotations."
Is there a distinction between max-pooling over rotations, vs coset max-pooling over rotations?
My best guess would be to do something like the following - would this be correct?
y = gconv2d(input=y, filter=w, strides=[1, 1, 1, 1], padding='SAME',
gconv_indices=gconv_indices, gconv_shape_info=gconv_shape_info) # (10, 9, 9, 512)
y_reshaped = tf.reshape(y, [-1, 9, 9, 64, 8]) # break the flat 512 into 64 x 8
y_maxpooled = tf.reduce_max(y_reshaped, reduction_indices=[4]) # take max along the last dimension
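The reshape-then-reduce idea in the snippet above can be checked in plain NumPy. This is only a minimal sketch: it assumes the 512 channels are laid out as 64 feature maps times 8 transformations with the transformation index varying fastest, which is an assumption about the output layout of gconv2d and should be verified against the library. The helper name coset_max_pool is hypothetical and not part of GrouPy.

```python
import numpy as np

def coset_max_pool(y, n_transforms):
    """Max over the transformation axis of an (N, H, W, C * n_transforms) array.

    Assumes the transformation index is the fastest-varying part of the
    channel axis (an assumption, not confirmed by the source).
    """
    n, h, w, flat = y.shape
    if flat % n_transforms != 0:
        raise ValueError("channel axis is not divisible by n_transforms")
    channels = flat // n_transforms
    # Split the flat channel axis into (feature map, transformation)
    y = y.reshape(n, h, w, channels, n_transforms)
    return y.max(axis=-1)

y = np.random.randn(10, 9, 9, 512)
pooled = coset_max_pool(y, n_transforms=8)
print(pooled.shape)  # (10, 9, 9, 64)
```

In TensorFlow the same two steps would be tf.reshape followed by tf.reduce_max over the last axis, under the same layout assumption.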
Thank you so much!
Is G-Pooling included in this implementation?
I'm puzzled by the number of trainable parameters in networks using gconv2d.
The script below creates a network using gconv2ds from Z2 to C4 to C4 and counts the number of learnable parameters in the network.
For the groups C4 and D4, the result is S times larger than what I expected, where S is the number of non-translation transformations, i.e. roto-flips (so S=4 for C4 and S=8 for D4).
Specifically, I'd expect there to be the same number of learnable parameters in a gconv2d layer as in a normal 2D conv layer (namely n_feat_maps_in*n_feat_maps_out*kernel_size**2, when both have no biases, as is the case for this repository).
So, for C4, the number of parameters I would expect to be learnable in the example below would be 135 + 315, when it turns out to instead be 135 + 315*4. Similarly for D4, we get 135 + 315*8.
I understand how the total number of parameters should be 135 + 315*4 for C4 and 135 + 315*8 for D4, since the filters are practically speaking different (in that they have been roto-flipped).
However, I don't think that they should all be individually learnable (since the roto-flip transformations are not learnable), and I'm worried that there may be a problem in the implementation.
It could also very well be that I have misunderstood something fundamental, but isn't the whole point of gconvs related to a group G that they are equivariant to the transformations in G without an increase in the number of trainable parameters?
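The counts quoted above are consistent with a weight tensor whose input axis carries one slice per transformation of the input group. In the G-CNN formulation, a filter in a layer after the first is itself a function on the group, so it has one spatial kernel per input orientation channel, and the S transformed copies that produce the S output orientation channels share those weights. The sketch below only reproduces the arithmetic; the weight shape (ksize, ksize, in_channels * S_in, out_channels) is an assumption inferred from the reported numbers, and gconv_param_count is a hypothetical helper.

```python
# Number of non-translation transformations per group
N_TRANSFORMS = {'Z2': 1, 'C4': 4, 'D4': 8}

def gconv_param_count(ksize, in_channels, out_channels, h_input):
    """Weights in a kernel of assumed shape
    (ksize, ksize, in_channels * S_in, out_channels), without biases."""
    return ksize * ksize * in_channels * N_TRANSFORMS[h_input] * out_channels

first = gconv_param_count(3, 3, 5, 'Z2')       # Z2 -> C4 (or Z2 -> D4)
second_c4 = gconv_param_count(3, 5, 7, 'C4')   # C4 -> C4
second_d4 = gconv_param_count(3, 5, 7, 'D4')   # D4 -> D4

print(first, second_c4, first + second_c4)  # 135 1260 1395
print(first, second_d4, first + second_d4)  # 135 2520 2655
```

Under this reading the factor S comes from the input group axis, not from learning each roto-flipped copy separately.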
Finally, the test for equivariance at the end of the below script also fails. Is this related, or am I testing the wrong thing?
For the record, I'm finding the same when using the keras_gcnn Keras implementation, i.e., I get the same (and higher than expected) number of trainable parameters when using the model.summary() method of Keras.
Thank you for your time, and for this awesome work!
import numpy as np
import tensorflow as tf
from groupy.gconv.tensorflow_gconv.splitgconv2d import gconv2d, gconv2d_util
# Model parameters
kernel_size = 3
n_feat_maps_0 = 3
n_feat_maps_1 = 5
n_feat_maps_2 = 7
group_0 = 'Z2'
group_1 = 'C4'
group_2 = 'C4' # Not currently implemented for C4 --> D4
# Construct graph
x = tf.placeholder(tf.float32, [None, 9, 9, n_feat_maps_0])
# Z2 --> C4 convolution
gconv_indices, gconv_shape_info, w_shape = gconv2d_util(
h_input=group_0, h_output=group_1, in_channels=n_feat_maps_0, out_channels=n_feat_maps_1, ksize=kernel_size)
w = tf.Variable(tf.truncated_normal(w_shape, stddev=1.))
y = gconv2d(input=x, filter=w, strides=[1, 1, 1, 1], padding='SAME',
gconv_indices=gconv_indices, gconv_shape_info=gconv_shape_info)
# C4 --> C4 convolution
gconv_indices, gconv_shape_info, w_shape = gconv2d_util(
h_input=group_1, h_output=group_2, in_channels=n_feat_maps_1, out_channels=n_feat_maps_2, ksize=kernel_size)
w = tf.Variable(tf.truncated_normal(w_shape, stddev=1.))
y = gconv2d(input=y, filter=w, strides=[1, 1, 1, 1], padding='SAME',
gconv_indices=gconv_indices, gconv_shape_info=gconv_shape_info)
# Compute
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
output = sess.run(y, feed_dict={x: np.random.randn(10, 9, 9, 3)})
print(output.shape) # (10, 9, 9, 28)
# Count the number of trainable parameters
print(np.sum([np.prod(v.shape) for v in tf.trainable_variables()])) # 1395 (135 + 315*4)
# Test equivariance by comparing outputs for rotated versions of same datapoint
datapoint = np.random.randn(9, 9, 3)
input = np.stack([datapoint, np.rot90(datapoint)])
output = sess.run(y, feed_dict={x: input})
print(np.allclose(output[0], np.rot90(output[1]))) # False
sess.close()
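One possible reason the check at the end of the script prints False: for a feature map on a rotation group, rotating the input is expected to rotate each spatial plane and also cyclically shift the orientation channels, so comparing against a purely spatial np.rot90 is not the right test. The NumPy sketch below illustrates this for a single Z2 to p4 layer. It is independent of GrouPy, and the helpers correlate_valid and lift_z2_to_p4 are hypothetical names.

```python
import numpy as np

def correlate_valid(f, psi):
    """Unpadded 2D cross-correlation of f (H, W) with psi (k, k)."""
    k = psi.shape[0]
    h, w = f.shape[0] - k + 1, f.shape[1] - k + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(f[i:i + k, j:j + k] * psi)
    return out

def lift_z2_to_p4(f, psi):
    """Correlate f with the four rotated copies of psi: shape (4, H', W')."""
    return np.stack([correlate_valid(f, np.rot90(psi, r)) for r in range(4)])

rng = np.random.RandomState(0)
f = rng.randn(9, 9)
psi = rng.randn(3, 3)

out = lift_z2_to_p4(f, psi)
out_rot = lift_z2_to_p4(np.rot90(f), psi)

# Expected law: rotate every plane AND shift the orientation axis by one.
expected = np.rot90(np.roll(out, 1, axis=0), axes=(1, 2))
print(np.allclose(out_rot, expected))                    # True
# A purely spatial rotation ignores the shift and does not match.
print(np.allclose(out_rot, np.rot90(out, axes=(1, 2))))  # False
# After a max over orientations, a plain spatial rotation is enough.
print(np.allclose(out_rot.max(axis=0), np.rot90(out.max(axis=0))))  # True
```

Note also that the script compares output[0] with a further rotation of output[1], where output[1] already comes from the rotated input, so the direction of the rotation may need to be reversed as well.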
Hey Taco,
The paper 3D G-CNNs for Pulmonary Nodule Detection refers to this repository for an open-source implementation of the GConv3D function.
I found the 2D variant, but no 3D GConv.
Is there a plan to release the code for the GConv3D function?
Thank you!
Hey Dr. Cohen,
Thank you for your amazing work on group convolutions and steerable convolutions. I was wondering whether these gconvs can directly replace the 2D convs in models like AlexNet or GoogleNet. If so, can pretrained weights be used, or would we have to train the network from scratch?
Thanks!
Hi Dr. Cohen,
Thank you for releasing GrouPy.
I'd like to build an architecture similar to VGG16. Ignoring the computational cost and keeping the number of filters unchanged, can I achieve p4m properties by simply inserting the following layers right after vgg5_3? (All of the CNN layers in VGG16 are retained.)
gconv_indices, gconv_shape_info, w_shape = gconv2d_util(
h_input='Z2', h_output='D4', in_channels=3, out_channels=64, ksize=3)
w = tf.Variable(tf.truncated_normal(w_shape, stddev=1.))
y = gconv2d(input=x, filter=w, strides=[1, 1, 1, 1], padding='SAME',
gconv_indices=gconv_indices, gconv_shape_info=gconv_shape_info)
gconv_indices, gconv_shape_info, w_shape = gconv2d_util(
h_input='D4', h_output='D4', in_channels=64, out_channels=64, ksize=3)
w = tf.Variable(tf.truncated_normal(w_shape, stddev=1.))
y = gconv2d(input=y, filter=w, strides=[1, 1, 1, 1], padding='SAME',
gconv_indices=gconv_indices, gconv_shape_info=gconv_shape_info)
Or can I achieve p4m properties by training the dense layers and the layers above first, and then fine-tuning the whole network?
After training, can I say they have properties such as symmetry and orthogonality?
Suppose we obtain a feature of size (batch_size, height, width, channel*8). Can we get a transformation-invariant feature by reducing it to (batch_size, height, width, channel)? If so, which is better: reduce_mean or reduce_max?
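On the last question, a small NumPy check may help frame it. If a roto-flip of the input acts on the feature by permuting the 8 transformation channels (in addition to transforming the spatial grid), then any reduction that ignores the order of those channels removes the permutation. The sketch below shows that both mean and max have this property. It assumes the (channel, 8) split of the last axis and says nothing about which reduction trains better, which is an empirical question.

```python
import numpy as np

rng = np.random.RandomState(0)
channels, n_transforms = 4, 8
# Feature with the transformation axis split out: (N, H, W, C, 8)
feat = rng.randn(2, 5, 5, channels, n_transforms)

# Any reordering of the transformation channels
perm = rng.permutation(n_transforms)
feat_perm = feat[..., perm]

reductions = {'max': np.max, 'mean': np.mean}
results = {}
for name, reduce_fn in reductions.items():
    results[name] = np.allclose(reduce_fn(feat, axis=-1),
                                reduce_fn(feat_perm, axis=-1))
print(results)
```

Both entries come out equal up to floating-point rounding, so the choice between the two is not settled by invariance alone.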
Thanks!
Hi! I was wondering whether the TensorFlow code was written for Python 2.7 or Python 3.
Running the sample code with Python 2.7 gives no error, but with Python 3 I get "ValueError: ("Size of label '%s' for operand %d does not match previous terms.", 'G', 1)".
Thank you!
I ran into some trouble installing GrouPy according to README.md.
After installing chainer with cupy and tensorflow-gpu, I checked "chainer.backends.cuda.available" and "chainer.backends.cuda.cudnn_enabled"; both return True. However, when I run "nosetests -v", I get the output below.
Failure: CompileException (/tmp/tmpezKDyD/kern.cu(14): error: a value of type "const ptrdiff_t *" cannot be used to initialize an entity of type "const int *"
/tmp/tmpezKDyD/kern.cu(15): error: a value of type "const ptrdiff_t *" cannot be used to initialize an entity of type "const int *"
I skipped that and ran the examples in "Getting Started". The TensorFlow example runs without any error, but the Chainer example fails as shown below.
File "/data/mbqiu/anaconda3/envs/gconv-python2.7/lib/python2.7/site-packages/groupy/gconv/chainer_gconv/transform_filter.py", line 8, in
from groupy.gconv.chainer_gconv.kernels.integer_indexing_cuda_kernel import grad_index_group_func_kernel
ImportError: No module named kernels.integer_indexing_cuda_kernel
I checked carefully and found that two subdirectories, kernels and pooling, are missing from gconv/chainer_gconv/.
My versions are ubuntu==16.04, cuda==8.0, chainer==4.5.0, cupy-cuda80==4.5.0, tensorflow-gpu==1.4.0.
What's wrong?
Group equivariant CNNs are just one example of fully steerable CNNs. Is this framework easily extensible to continuous, fully steerable CNN representations?
(we met at ICLR a month ago and this seemed like the easiest way to start a conversation)
Hello Dr. Cohen,
Your group convolution does a great job of preserving rotation and translation equivariance. I wonder whether there is any work on size equivariance, assuming the feature maps and filters are stored in infinite arrays.
Thank you for giving a fascinating tutorial on equivariance at NeurIPS 2020. I am new to this field and very interested in it. After reading your paper, I have a question about merging the input-channel dimension and the group dimension. If we rotate the whole original image, then as I understand it the equivariant transformation permutes the order along the group dimension of the second feature representation (ignoring the translation). But if this dimension is merged into the input channels, how is the permutation equivariance preserved in the following features? In short, which operation on the last feature representation corresponds to a rotation of the whole original image?
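A small NumPy experiment can make the question concrete for the group p4. In the sketch below, a rotation acts on a p4 feature map of shape (4, H, W) by rotating every plane and cyclically shifting the orientation axis, and a p4 to p4 correlation built from rotated-and-shifted copies of one filter commutes with that action. The code is independent of GrouPy, handles one input and one output feature map, and all helper names are hypothetical.

```python
import numpy as np

def correlate_valid(f, psi):
    """Unpadded 2D cross-correlation of f (H, W) with psi (k, k)."""
    k = psi.shape[0]
    h, w = f.shape[0] - k + 1, f.shape[1] - k + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(f[i:i + k, j:j + k] * psi)
    return out

def rotate_p4(f, times=1):
    """Rotate a p4 feature map (4, H, W) by `times` quarter turns:
    shift the orientation axis cyclically and rotate every plane."""
    return np.rot90(np.roll(f, times, axis=0), k=times, axes=(1, 2))

def p4_conv(f, psi):
    """p4 -> p4 correlation. f: (4, H, W) feature map, psi: (4, k, k) filter."""
    out = []
    for s in range(4):
        psi_s = rotate_p4(psi, times=s)  # filter transformed by rotation s
        out.append(sum(correlate_valid(f[r], psi_s[r]) for r in range(4)))
    return np.stack(out)

rng = np.random.RandomState(0)
f = rng.randn(4, 8, 8)
psi = rng.randn(4, 3, 3)

lhs = p4_conv(rotate_p4(f), psi)   # rotate first, then convolve
rhs = rotate_p4(p4_conv(f, psi))   # convolve first, then rotate
print(np.allclose(lhs, rhs))  # True

# Merging the orientation axis into channels only changes the bookkeeping.
flat = rhs.reshape(-1, rhs.shape[1], rhs.shape[2])
print(flat.shape)  # (4, 6, 6)
```

With several feature maps the orientation axis is flattened into the channel axis, and the same action becomes a spatial rotation combined with a fixed permutation of channels, so no information about the group structure is lost by the merge.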
CuPy (cupy) version 6.0.0 may not be compatible with this version of Chainer.
Please consider installing the supported version by running:
$ pip install 'cupy>=6.3.0,<7.0.0'
requirement=requirement, help=help))
groupy.garray.test_garray.test_p4_array ... ok
groupy.garray.test_garray.test_p4m_array ... ok
groupy.garray.test_garray.test_z2_array ... ok
groupy.garray.test_garray.test_c4_array ... ok
groupy.garray.test_garray.test_d4_array ... ok
Failure: TypeError (Argument 'source' has incorrect type (expected unicode, got str)) ... ERROR
groupy.gfunc.test_gfuncarray.test_p4_func ... /home/shenzy/lsd_temp_file/GrouPy/groupy/gfunc/gfuncarray.py:78: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use arr[tuple(seq)] instead of arr[seq]. In the future this will be interpreted as an array index, arr[np.array(seq)], which will result either in an error or a different result.
vi = self.v[inds]
/home/shenzy/lsd_temp_file/GrouPy/groupy/garray/garray.py:144: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use arr[tuple(seq)] instead of arr[seq]. In the future this will be interpreted as an array index, arr[np.array(seq)], which will result either in an error or a different result.
return self.factory(data=self.data[key], p=self.p)
ok
groupy.gfunc.test_gfuncarray.test_p4m_func ... ok
groupy.gfunc.test_gfuncarray.test_z2_func ... ok
Traceback (most recent call last):
File "/home/shenzy/anaconda3/envs/gconv/lib/python2.7/site-packages/nose/loader.py", line 418, in loadTestsFromName
addr.filename, addr.module)
File "/home/shenzy/anaconda3/envs/gconv/lib/python2.7/site-packages/nose/importer.py", line 47, in importFromPath
return self.importFromDir(dir_path, fqname)
File "/home/shenzy/anaconda3/envs/gconv/lib/python2.7/site-packages/nose/importer.py", line 94, in importFromDir
mod = load_module(part_fqname, fh, filename, desc)
File "/home/shenzy/lsd_temp_file/GrouPy/groupy/gconv/chainer_gconv/__init__.py", line 2, in
from groupy.gconv.chainer_gconv.p4_conv import P4ConvZ2, P4ConvP4
File "/home/shenzy/lsd_temp_file/GrouPy/groupy/gconv/chainer_gconv/p4_conv.py", line 1, in
from groupy.gconv.chainer_gconv.splitgconv2d import SplitGConv2D
File "/home/shenzy/lsd_temp_file/GrouPy/groupy/gconv/chainer_gconv/splitgconv2d.py", line 10, in
from groupy.gconv.chainer_gconv.transform_filter import TransformGFilter
File "/home/shenzy/lsd_temp_file/GrouPy/groupy/gconv/chainer_gconv/transform_filter.py", line 8, in
from groupy.gconv.chainer_gconv.kernels.integer_indexing_cuda_kernel import grad_index_group_func_kernel
File "/home/shenzy/lsd_temp_file/GrouPy/groupy/gconv/chainer_gconv/kernels/integer_indexing_cuda_kernel.py", line 61, in
_index_group_func_kernel32 = compile_with_cache(_index_group_func_str.format('float')).get_function('indexing_kernel')
TypeError: Argument 'source' has incorrect type (expected unicode, got str)
Ran 9 tests in 1.018s
Hi,
I have a question regarding the implementation of equivariant pooling. In your experiments I found the function <plane_group_spatial_max_pooling>, imported from <groupy.gconv.chainer_gconv.pooling.plane_group_spatial_max_pooling>. In this implementation, the rotation axis and the channel axis are folded together, 2D spatial pooling is performed, and the result is unfolded again. I used the same implementation in PyTorch and tested whether my P4CNN with pooling is equivariant using the code below. When I add <plane_group_spatial_max_pooling> after each GConv, the network is not equivariant and the test <test_p4_net_equivariance> fails. When I do not use <plane_group_spatial_max_pooling>, the GCNN is equivariant and the test <test_p4_net_equivariance> passes. I can't tell whether a pooling layer implemented like this is equivariant, or whether I am testing the equivariance of the network incorrectly. Can you help me, please?
import numpy as np
import torch
from torch.autograd import Variable
# P4ConvZ2, P4ConvP4 and plane_group_spatial_max_pooling come from the
# poster's own PyTorch port; their import path is not given in the post.

def test_p4_net_equivariance():
    from groupy.gfunc import Z2FuncArray, P4FuncArray
    import groupy.garray.C4_array as c4a
    im = np.random.randn(1, 1, 96, 96).astype('float32')
    check_equivariance(
        im=im,
        layers=[
            P4ConvZ2(in_channels=1, out_channels=2, kernel_size=3),
            P4ConvP4(in_channels=2, out_channels=3, kernel_size=3)
        ],
        input_array=Z2FuncArray,
        output_array=P4FuncArray,
        point_group=c4a,
    )

def check_equivariance(im, layers, input_array, output_array, point_group):
    # Transform the image
    f = input_array(im)
    g = point_group.rand()
    gf = g * f
    im1 = gf.v
    # Apply layers to both images
    im = Variable(torch.Tensor(im))
    im1 = Variable(torch.Tensor(im1))
    fmap = im
    fmap1 = im1
    for layer in layers:
        fmap = layer(fmap)
        fmap = plane_group_spatial_max_pooling(fmap, ksize=2, stride=1)
        fmap1 = layer(fmap1)
        fmap1 = plane_group_spatial_max_pooling(fmap1, ksize=2, stride=1)
    # Transform the computed feature maps
    fmap1_garray = output_array(fmap1.data.numpy())
    r_fmap1_data = (g.inv() * fmap1_garray).v
    fmap_data = fmap.data.numpy()
    assert np.allclose(fmap_data, r_fmap1_data, rtol=1e-5, atol=1e-3)

if __name__ == '__main__':
    test_p4_net_equivariance()
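Whether spatial max pooling commutes with a 90-degree rotation depends on how the pooling windows sit on the grid. The NumPy sketch below, which is independent of GrouPy and PyTorch, checks a single 2D plane: unpadded pooling with ksize=2 and stride=1 commutes with np.rot90, and stride=2 does so on an even-sized input but not on an odd-sized one, where the dropped border row and column end up on different sides after rotation. The helper names are hypothetical, and the sketch does not reproduce padding or the fold and unfold steps of the port in the post, so it cannot say where that particular test fails.

```python
import numpy as np

def max_pool2d(f, ksize, stride):
    """Valid (unpadded) max pooling of a 2D array."""
    h = (f.shape[0] - ksize) // stride + 1
    w = (f.shape[1] - ksize) // stride + 1
    out = np.empty((h, w), dtype=f.dtype)
    for i in range(h):
        for j in range(w):
            window = f[i * stride:i * stride + ksize,
                       j * stride:j * stride + ksize]
            out[i, j] = window.max()
    return out

def commutes_with_rot90(f, ksize, stride):
    """True if pooling the rotated input equals rotating the pooled output."""
    return np.array_equal(max_pool2d(np.rot90(f), ksize, stride),
                          np.rot90(max_pool2d(f, ksize, stride)))

even = np.arange(36, dtype=float).reshape(6, 6)
odd = np.arange(49, dtype=float).reshape(7, 7)

print(commutes_with_rot90(odd, ksize=2, stride=1))   # True
print(commutes_with_rot90(even, ksize=2, stride=2))  # True
print(commutes_with_rot90(odd, ksize=2, stride=2))   # False
```

If the port pads the input or uses a different stride than the one shown in the test, checking the pooling layer alone in this way may help isolate the problem.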
Hello Dr. Cohen, I have another question.
I didn't catch the meaning of GArray and GFuncArray:
A GFuncArray is an array of functions on a group G.
A GArray represents an array that contains transformations instead of scalars.
Can you explain them more clearly?
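The two quoted definitions can be illustrated with a toy version for the rotation group C4, written in plain NumPy. This is not GrouPy's API. An "array of transformations" stores group elements (here the number of quarter turns, composed by addition mod 4), while an "array holding a function on the group" stores one value per group element and is transformed by a group element g through (g . f)(h) = f(g^{-1} h). All function names below are hypothetical.

```python
import numpy as np

# Array of transformations: each entry is an element of C4, stored as the
# number of quarter turns r in {0, 1, 2, 3}. Composition is addition mod 4.
def c4_compose(g, h):
    return (np.asarray(g) + np.asarray(h)) % 4

def c4_inverse(g):
    return (-np.asarray(g)) % 4

# Function on the group: v[r] holds the value f(r) for each rotation r.
# A group element g acts on it by (g . f)(h) = f(g^{-1} h).
def c4_act_on_function(g, v):
    h = np.arange(4)
    return v[c4_compose(c4_inverse(g), h)]

g = np.array([1, 3, 2])               # an array of three rotations
print(c4_compose(g, c4_inverse(g)))   # [0 0 0]

v = np.array([10., 20., 30., 40.])    # a function on C4
print(c4_act_on_function(1, v))       # [40. 10. 20. 30.]
```

In this picture the first kind of object is what gets multiplied and inverted, and the second kind is what a feature map or filter is: values indexed by group elements that move around when a transformation is applied.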
Comment says "only NCHW supported" but error says the opposite in the tensorflow implementation.
Hello @tscohen
I'm trying to use the TensorFlow API in your GrouPy library and ran into a problem. In GrouPy/groupy/gconv/tensorflow_gconv/splitgconv2d.py I found that the axes of the returned tensor are (batch, out channels, height, width), while the input axes are (batch, height, width, in channels).
However, in your TensorFlow sample code, you simply feed the output of the previous conv layer into the next conv layer without any reshape. Does that make sense?
Thanks a lot!
Hi Dr. Cohen,
I am a newcomer who has just learned about G-CNNs. I'm trying to run the test file "test_gfuncarray.py" but I hit the error in the title. The error is raised when I compute h*f. It says that the dimensions of self.v in the call method of GFuncArray may not match the dimensions of the index inds. The shapes I printed afterwards were: inds shapes: ['Ellipsis', (4, 5, 5), (4, 5, 5), (4, 5, 5)], self.v shape: (2, 6, 4, 5, 5). I don't yet understand G-CNNs well enough to know how to make these two shapes match or why the error occurs. Is it because of a NumPy version update? Could you please help?
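If the failure comes from indexing an array with a list that mixes Ellipsis and index arrays, which the FutureWarning quoted in another report on this page suggests but this post does not confirm, then converting the list to a tuple before indexing is the change NumPy itself recommends. The sketch below uses the shapes quoted above; the index arrays are made up for illustration and simply select every element.

```python
import numpy as np

v = np.random.randn(2, 6, 4, 5, 5)

# One integer index array per trailing axis, each of shape (4, 5, 5).
i, j, k = np.meshgrid(np.arange(4), np.arange(5), np.arange(5),
                      indexing='ij')
inds = [Ellipsis, i, j, k]   # a list, as in the shapes quoted above

# Indexing with a tuple treats each entry as the index for one axis.
vi = v[tuple(inds)]
print(vi.shape)              # (2, 6, 4, 5, 5)
```

With these identity indices the result equals v itself, which makes it easy to confirm that the shapes line up before trying the real index arrays.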
Sorry, I did not quite catch the meaning of the function name P4ConvZ2, which is not mentioned in the paper. Is there a one-sentence explanation of what it means?