Comments (10)
Thanks Gabriele, it does!
Re: the question of whether we're numerically testing the equivariance of our model, I have an anecdote for you... we once had a customer rotate the object we were analyzing and report the resulting variation in our algorithm's output as a bug. They had no way of testing whether our numbers were actually correct, but it was very easy for them to test whether our algorithm was rotation invariant!
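That kind of check is cheap to automate. Here is a minimal NumPy sketch (a toy example of the idea, not using e2cnn itself): apply the operation to a rotated input and compare against rotating the operation's output. Exact 90-degree rotations avoid any interpolation error.

```python
import numpy as np

def avg_pool_2x2(x):
    # 2x2 average pooling with stride 2 on a square single-channel image.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def check_rot90_equivariance(f, x, atol=1e-6):
    # f is equivariant to 90-degree rotations iff f(rot(x)) == rot(f(x)).
    return np.allclose(f(np.rot90(x)), np.rot90(f(x)), atol=atol)

x = np.random.default_rng(0).standard_normal((8, 8))
print(check_rot90_equivariance(avg_pool_2x2, x))  # True
```

The same pattern works for invariance (compare `f(rot(x))` against `f(x)` directly), which is essentially the test the customer ran.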
from e2cnn.
Hi @drewm1980
What you say is right. I think you are looking for these:
- Downsampling: https://quva-lab.github.io/e2cnn/api/e2cnn.nn.html#pooling
- Upsampling: https://quva-lab.github.io/e2cnn/api/e2cnn.nn.html#upsampling
Upsampling supports different options; in practice, we found bilinear interpolation to work best.
We implemented several downsampling algorithms, since operations like max-pooling are not compatible with every type of representation.
If you use scalar fields, regular fields, or quotient fields, you can use (pointwise) max-pooling, which acts as it usually does in PyTorch.
Average pooling is compatible with any representation.
For better equivariance stability, I would suggest using the antialiased versions of the downsampling methods.
Is this what you were looking for?
Best,
Gabriele
Hi Gabriele,
Thanks for the response. I read more of the code... R2Conv seems closely related, though it uses trainable weights instead of a fixed anti-aliasing filter.
My first forays with this will indeed be with scalar fields, although I already have applications in mind for vector fields (as outputs).
I would like to use only equivariant operators unless there is a good reason not to. Equivariance is why I'm here, after all :) My understanding is that max pooling and average pooling (presumably over rectangular regions) break equivariance, but I will look further into the antialiased versions of the operators you mention.
Do you know of published networks that do down- and upsampling based on this toolbox? I would not be at all surprised if someone got to this before me.
PointwiseAvgPoolAntialiased looks like the go-to operator for correct (antialiased) downsampling: "Antialiased channel-wise average-pooling: each channel is treated independently. It performs strided convolution with a Gaussian blur filter." I am curious why you compute a Gaussian blur kernel rather than using the "Tri-3" or "Bin-5" filters from the "Making Convolutional Networks Shift-Invariant Again" paper. Is it closer to being a radial function at small filter sizes, since it is not constrained to integer coefficients by construction? It is certainly more generic. For 2x downsampling (stride=2), is sigma=1 a good starting point? That results in a 7x7 kernel.
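For reference, the kernel sizes quoted in this thread (sigma=1 giving 7x7, and sigma=2/3 giving 5x5 below) are consistent with truncating the Gaussian at roughly three standard deviations per side. I am not certain this is exactly the formula e2cnn uses internally, but as a rule of thumb:

```python
def gaussian_filter_size(sigma):
    # Truncate the Gaussian at about 3 standard deviations on each side;
    # the kernel then spans 2 * round(3 * sigma) + 1 pixels.
    return 2 * int(round(3 * sigma)) + 1

print(gaussian_filter_size(1.0))    # 7 -> the 7x7 kernel mentioned above
print(gaussian_filter_size(2 / 3))  # 5
```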
Hi @drewm1980
Indeed, I think antialiased average pooling is what you are looking for.
In many cases, though, it seems like using a stride > 1 in your previous convolutional layer is already good enough since the learnable convolutional filters are already quite smooth (we use a band-limited basis). This, of course, depends on the specific case and on how perfectly equivariant you want the model to be.
In the anti-aliased pooling, we use Gaussian filters for simplicity, since they are perfectly rotation invariant analytically (i.e. in the continuous domain). I guess that after discretization there is little difference between those filters and a Gaussian blur, though we did not experiment with them.
Regarding the downsampling, it does depend on your task. Using larger filters (larger sigma) of course gives more stable results, but it also makes the network less expressive (you smooth your features too much and drop too much high-frequency information). I think you should experiment with different sizes. I often found 5x5 filters (so sigma = 2/3) good enough to give acceptable results. If you do not care about numerically testing the equivariance of your model but only aim for higher performance, you can often just use strided convolution (as suggested above), as it requires fewer computations.
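The blur-then-subsample recipe is easy to prototype outside the library. Below is a hedged NumPy sketch (my own toy implementation, not e2cnn's code): build an isotropic Gaussian kernel from sigma, blur each spatial location, then keep every stride-th sample.

```python
import numpy as np

def gaussian_kernel_2d(sigma):
    # Truncate at ~3 sigma per side; sigma = 2/3 gives a 5x5 kernel.
    r = int(round(3 * sigma))
    ax = np.arange(-r, r + 1)
    g = np.exp(-ax**2 / (2 * sigma**2))
    k = np.outer(g, g)
    return k / k.sum()  # normalize so the blur preserves the mean

def antialiased_downsample(x, sigma=2 / 3, stride=2):
    # Blur a single-channel image with an isotropic Gaussian, then subsample.
    k = gaussian_kernel_2d(sigma)
    r = k.shape[0] // 2
    xp = np.pad(x, r, mode="reflect")
    h, w = x.shape
    blurred = np.empty_like(x, dtype=float)
    for i in range(h):
        for j in range(w):
            blurred[i, j] = (xp[i:i + 2 * r + 1, j:j + 2 * r + 1] * k).sum()
    return blurred[::stride, ::stride]

x = np.random.default_rng(0).standard_normal((16, 16))
y = antialiased_downsample(x)
print(y.shape)  # (8, 8)
```

Note that the final strided subsampling still picks a fixed grid of samples, so the result is only approximately equivariant; the Gaussian blur is what keeps that approximation error small.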
Hope this answers your question!
Best,
Gabriele
Some info from this thread might be missing from the documentation of the pooling layers. From their documentation, it is not obvious what kind of equivariance they have. Or is that implied by something?
Hi @page200
Are you referring to my first message? In particular:
We implemented several downsampling algorithms, since operations like max-pooling are not compatible with every type of representation.
If you use scalar fields, regular fields, or quotient fields, you can use (pointwise) max-pooling, which acts as it usually does in PyTorch.
Average pooling is compatible with any representation.
The documentation of PointwiseMaxPool mentions this:
Notice that not all representations support this kind of pooling. In general, only representations which support pointwise non-linearities do.
If you are referring to the comments on anti-aliasing, I agree they are not really discussed enough in the docs.
I will update the documentation with some additional notes; thanks for pointing this out!
Best,
Gabriele
On one hand, yes, about anti-aliasing. On the other hand, the docstrings don't make it obvious which layers have what kind of equivariance. Maybe instead of "max-pooling" in the first sentence of each layer's description you could write something like "G-equivariant max-pooling, where G is ...". Thanks!
In that sense, max pooling is supposed to be equivariant to any group G. Of course, in practice this is not true, since max pooling breaks equivariance to continuous rotations.
Indeed, max pooling can be perfectly equivariant only to 90 degrees rotations and reflections (like all operations in the library) since these are the only perfect symmetries of the grid.
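That grid symmetry is easy to verify numerically. A small NumPy check (toy code, independent of e2cnn): 90-degree rotations and reflections map 2x2 pooling cells onto pooling cells, and the max within a cell does not depend on the order of its elements, so 2x2 max pooling commutes with them exactly.

```python
import numpy as np

def max_pool_2x2(x):
    # 2x2 max pooling with stride 2 on a square single-channel image.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.random.default_rng(0).standard_normal((8, 8))

# Rotation by 90 degrees and vertical flip are exact symmetries of the grid,
# so max pooling commutes with them to machine precision (in fact, exactly).
assert np.array_equal(max_pool_2x2(np.rot90(x)), np.rot90(max_pool_2x2(x)))
assert np.array_equal(max_pool_2x2(np.flipud(x)), np.flipud(max_pool_2x2(x)))
print("90-degree rotation and reflection equivariance hold exactly")
```

Checking an intermediate rotation (say, 45 degrees) requires interpolation, and the same test then shows a nonzero equivariance error; that is the breakage to continuous rotations mentioned above.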
Is this what you meant?
Best,
Gabriele
I meant that, and also another thing: the docstring of the layer doesn't yet state whether the layer is equivariant. And if it is equivariant under some group G, which input argument contains the information (and in what format) about what the current G is? The docstring should start with something like "G-equivariant max-pooling, where G is given by ...".