
Comments (10)

drewm1980 commented on July 20, 2024

Thanks Gabriele, it does!

RE the question of whether we're numerically testing the equivariance of our model, I have an anecdote for you... we once had a customer rotate the object we were analyzing and report the variation in the output of our algorithm as a bug. They didn't have a way of testing if our numbers were actually correct but it was really easy for them to test if our algorithm was rotation invariant!
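That kind of customer test is easy to automate. A minimal numpy sketch of it, with a hypothetical `measure` function standing in for the real analysis algorithm, checking invariance under the exact 90-degree grid rotations:

```python
import numpy as np

def measure(image):
    # Hypothetical stand-in for the analysis algorithm: the mean
    # intensity is a statistic that should be exactly invariant
    # under 90-degree rotations of the input image.
    return image.mean()

rng = np.random.default_rng(0)
img = rng.random((32, 32))

# The customer's test: rotate the input, check the output is unchanged.
baseline = measure(img)
for k in (1, 2, 3):
    assert np.isclose(measure(np.rot90(img, k)), baseline)
```

For rotations other than multiples of 90 degrees the check can only hold approximately, which is exactly the equivariance error discussed in the rest of this thread.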

from e2cnn.

Gabri95 commented on July 20, 2024

Hi @drewm1980

What you say is right. I think you are looking for these:

- Upsampling supports different options, though in practice we found bilinear interpolation to work better.
- We implemented different downsampling algorithms, since operations like max-pooling are not compatible with every type of representation.
- If you use scalar fields, regular fields, or quotient fields, you can use (pointwise) max-pooling, which acts as usual in PyTorch.
- Average pooling is compatible with any representation.
- For better stability, I would suggest using the antialiased versions of the downsampling methods.
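The distinction above can be checked numerically. A minimal numpy sketch (independent of e2cnn, using hypothetical 2x2 pooling helpers): average pooling is linear, so it commutes with any matrix mixing the channels of a feature field, while max pooling does not:

```python
import numpy as np

def avg_pool2x2(x):
    # x has shape (channels, H, W); pool each channel independently.
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def max_pool2x2(x):
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

# A 2-channel vector field on a single 2x2 block, and a representation
# matrix mixing the two channels (a 90-degree rotation of the vectors).
x = np.zeros((2, 2, 2))
x[0, 0, 0] = 1.0
x[1, 1, 1] = 1.0
rho = np.array([[0.0, -1.0], [1.0, 0.0]])
mix = lambda m, f: np.einsum('ij,jhw->ihw', m, f)

# Average pooling is linear, so it commutes with any channel mixing:
assert np.allclose(avg_pool2x2(mix(rho, x)), mix(rho, avg_pool2x2(x)))
# Max pooling is nonlinear, so it fails for this (non-pointwise) mixing:
assert not np.allclose(max_pool2x2(mix(rho, x)), mix(rho, max_pool2x2(x)))
```

This is why max pooling is restricted to fields whose representations support pointwise nonlinearities (scalar, regular, quotient), while average pooling works for all of them.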

Is this what you were looking for?

Best,
Gabriele


drewm1980 commented on July 20, 2024

Hi Gabriele,

Thanks for the response. I read more of the code... R2Conv seems closely related, though it uses trainable weights instead of a fixed anti-alias filter.

My first forays with this will indeed be with scalar fields, although I already have applications in mind for vector fields (as outputs).

I would like to only use equivariant operators unless there is a good reason not to. Equivariance is why I'm here, after all :) My understanding is that max pooling and average pooling (presumably over rectangular regions) break equivariance, but I will look further into the antialiased versions of the operators you mention.

Do you know of published networks doing down and up sampling based on this toolbox? I would not at all be surprised if someone got to this before me.


drewm1980 commented on July 20, 2024

PointwiseAvgPoolAntialiased looks like the go-to operator for correct (antialiased) downsampling: "Antialiased channel-wise average-pooling: each channel is treated independently. It performs strided convolution with a Gaussian blur filter." I am curious why you compute a Gaussian blur kernel rather than using the "Tri-3" or "Bin-5" filters from the "Making Convolutional Networks Shift-Invariant Again" paper. Is it closer to being a radial function at small filter sizes, since it's not constrained to integers by construction? It's certainly more generic. To do 2X downsampling (stride=2), is sigma=1 a good starting point? That results in a 7x7 kernel.
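For intuition, a strided Gaussian blur of the kind the docstring describes can be sketched in plain numpy. This is an illustrative stand-in, not e2cnn's implementation; the 3-sigma truncation is an assumption chosen to reproduce the 7x7 kernel for sigma = 1:

```python
import numpy as np

def gaussian_kernel(sigma):
    # Truncate at ceil(3 * sigma) pixels per side (an assumption; the
    # library's exact rule may differ), so sigma = 1 gives a 7x7 kernel.
    r = int(np.ceil(3 * sigma))
    ax = np.arange(-r, r + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def blur_downsample(x, sigma=1.0, stride=2):
    # Antialiased downsampling: Gaussian blur, then keep every
    # `stride`-th sample. Borders are zero-padded for brevity.
    k = gaussian_kernel(sigma)
    r = k.shape[0] // 2
    padded = np.pad(x, r)
    h, w = x.shape
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (padded[i:i + 2 * r + 1, j:j + 2 * r + 1] * k).sum()
    return out[::stride, ::stride]

assert gaussian_kernel(1.0).shape == (7, 7)        # the 7x7 case above
assert blur_downsample(np.ones((8, 8))).shape == (4, 4)
```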


Gabri95 commented on July 20, 2024

Hi @drewm1980

Indeed, I think antialiased average pooling is what you are looking for.
In many cases, though, using a stride > 1 in the previous convolutional layer seems good enough, since the learnable convolutional filters are already quite smooth (we use a band-limited basis). This, of course, depends on the specific case and on how perfectly equivariant you want the model to be.

In the anti-aliased pooling, we use Gaussian filters for simplicity since they are perfectly rotation invariant (analytically, i.e. in the continuous domain). I guess that after discretization, there is little difference between using their filters and a Gaussian blur, though we did not experiment with them.

Regarding the downsampling, it does depend on your task. Using larger filters (larger sigma) of course gives more stable results, but also less expressive networks (you will smooth your features too much and drop too much high-frequency information). I think you should experiment with different sizes. I often found 5x5 filters (so sigma = 2/3) to be good enough to give acceptable results. If you do not care about numerically testing the equivariance of your model but only aim for higher performance, I think you can often just use strided convolution (as suggested above), as it requires fewer computations.
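The stability/expressivity trade-off can be made concrete. A small numpy sketch (assuming the Gaussian is truncated at ceil(3 * sigma) taps per side, which matches the 5x5 kernel for sigma = 2/3) measuring how much of the highest representable frequency survives blurs of increasing sigma:

```python
import numpy as np

def gaussian_1d(sigma):
    # Separable 1D Gaussian, truncated at ceil(3 * sigma) taps per side.
    r = int(np.ceil(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def nyquist_gain(sigma):
    # Response of the blur at the highest representable frequency:
    # how much of a +1/-1 alternating signal survives the filter.
    k = gaussian_1d(sigma)
    return abs((k * (-1.0) ** np.arange(k.size)).sum())

assert gaussian_1d(2 / 3).size == 5  # the 5x5 case suggested above
# Larger sigma = more stable downsampling, but more high-frequency loss.
assert nyquist_gain(2.0) < nyquist_gain(2 / 3) < nyquist_gain(1 / 3)
```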

Hope this answers your question!

Best,
Gabriele


page200 commented on July 20, 2024

Some info from this thread might be missing from the documentation of the pooling layers. From their documentation, it is not obvious what kind of equivariance they have. Or is that implied by something?


Gabri95 commented on July 20, 2024

Hi @page200

Are you referring to my first message? In particular:

We implemented different downsampling algorithms, since operations like max-pooling are not compatible with every type of representation.
If you use scalar fields, regular fields, or quotient fields, you can use (pointwise) max-pooling, which acts as usual in PyTorch.
Average pooling is compatible with any representation.

The documentation of PointwiseMaxPooling mentions this:

Notice that not all representations support this kind of pooling. In general, only representations which support pointwise non-linearities do.

If you are referring to the comments on anti-aliasing, I agree these are not really discussed in the docs.
I will update the documentation with some additional notes; thanks for pointing this out!

Best,
Gabriele


page200 commented on July 20, 2024

On one hand yes, about anti-aliasing. On the other hand, the docstrings don't make it obvious which layers have what kind of equivariance. Maybe instead of "max-pooling" in the first sentence of each layer's description you could write something like "G-equivariant max-pooling, where G is ...". Thanks!


Gabri95 commented on July 20, 2024

In that sense, max pooling is supposed to be equivariant to any group G.
In practice, of course, this is not exactly true, since max pooling breaks equivariance to continuous rotations.
Indeed, max pooling can be perfectly equivariant only to 90-degree rotations and reflections (like all operations in the library), since these are the only exact symmetries of the pixel grid.
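That last point is easy to verify numerically. A minimal numpy sketch (a hypothetical 2x2 pooling helper, independent of e2cnn) showing that max pooling commutes exactly with the symmetries of the grid:

```python
import numpy as np

def max_pool2x2(x):
    # Plain 2x2 max pooling on a single-channel feature map.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
x = rng.random((8, 8))

# Pooling a map rotated by 90 degrees equals rotating the pooled map...
assert np.allclose(max_pool2x2(np.rot90(x)), np.rot90(max_pool2x2(x)))
# ...and likewise for a reflection (horizontal flip).
assert np.allclose(max_pool2x2(np.fliplr(x)), np.fliplr(max_pool2x2(x)))
```

For rotations by other angles the input grid has to be resampled, and no such exact identity holds, which is the source of the equivariance error.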
Is this what you meant?

Best,
Gabriele


page200 commented on July 20, 2024

I meant that, and also another thing: the docstring of the layer doesn't yet state whether the layer is equivariant. And if it is equivariant under some group G, which input variable contains the info (and in what format) about what the current G is? The docstring should start with something like "G-equivariant max-pooling, where G is given by ...".

