mkocabas / coordconv-pytorch

PyTorch implementation of CoordConv, introduced in the paper 'An intriguing failing of convolutional neural networks and the CoordConv solution' (https://arxiv.org/pdf/1807.03247.pdf).

Python 100.00%

coordconv-pytorch's Issues

Bug for non-square input tensor

There is a bug in the original implementation: it uses x_dim to tile the x_range tensor.
This produces an xx_channel of shape [x_dim, x_dim] and a yy_channel of shape [y_dim, y_dim],
so the concatenation fails for non-square inputs.

For a corrected implementation, please refer to
https://gist.github.com/leVirve/0377a8fbac455bfd44e374e5cf8b1260
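To illustrate the shape requirement, here is a minimal sketch (not the code from the gist) that builds both coordinate channels with the full [x_dim, y_dim] shape. The helper name `coord_channels` is hypothetical, and using `torch.linspace` for the [-1, 1] normalization is an assumption made for brevity.

```python
import torch


def coord_channels(x_dim, y_dim):
    """Return (xx, yy), each of shape (x_dim, y_dim), with values in [-1, 1].

    Hypothetical helper for illustration; not part of the repository.
    """
    xs = torch.linspace(-1.0, 1.0, steps=x_dim)
    ys = torch.linspace(-1.0, 1.0, steps=y_dim)
    # Broadcast each 1-D range across the *other* axis so both grids share one shape.
    xx = xs.view(x_dim, 1).expand(x_dim, y_dim)
    yy = ys.view(1, y_dim).expand(x_dim, y_dim)
    return xx, yy


# Non-square input: x_dim=4, y_dim=6.
x = torch.zeros(2, 3, 4, 6)
xx, yy = coord_channels(4, 6)
out = torch.cat([x, xx.expand(2, 1, 4, 6), yy.expand(2, 1, 4, 6)], dim=1)
```

Because both grids have shape (4, 6), the concatenation along the channel dimension works for non-square inputs as well as square ones.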

I found an error?

You implement an additional channel, but I think it is wrong, or perhaps I don't understand it.
At line 104 you add a new channel r (the radius),
but the xx_channel used there is not in the range 0-1.
So I changed it as shown below, and also changed a few places to reduce the number of transposes:

import torch
import torch.nn as nn


class AddCoords(nn.Module):

    def __init__(self, with_r=False):
        super().__init__()
        self.with_r = with_r

    def forward(self, input_tensor):
        """
        Args:
            input_tensor: shape(batch, channel, x_dim, y_dim)
        """
        batch_size, _, x_dim, y_dim = input_tensor.size()

        # Index grids of shape (1, x_dim, y_dim):
        # xx varies along x_dim, yy varies along y_dim.
        xx_channel = torch.arange(x_dim).repeat(1, y_dim, 1).transpose(1, 2)
        yy_channel = torch.arange(y_dim).repeat(1, x_dim, 1)

        # Add the batch dimension: (batch, 1, x_dim, y_dim).
        xx_channel = xx_channel.repeat(batch_size, 1, 1, 1)
        yy_channel = yy_channel.repeat(batch_size, 1, 1, 1)

        # Normalize to [0, 1]; this assumes x_dim > 1 and y_dim > 1.
        xx_channel_01 = xx_channel.float() / (x_dim - 1)
        yy_channel_01 = yy_channel.float() / (y_dim - 1)

        # Rescale to [-1, 1] for the coordinate channels.
        xx_channel = xx_channel_01 * 2 - 1
        yy_channel = yy_channel_01 * 2 - 1

        ret = torch.cat([
            input_tensor,
            xx_channel.type_as(input_tensor),
            yy_channel.type_as(input_tensor)], dim=1)

        if self.with_r:
            # Radius measured from the center (0.5, 0.5) of the [0, 1] grid.
            rr = torch.sqrt(
                torch.pow(xx_channel_01.type_as(input_tensor) - 0.5, 2)
                + torch.pow(yy_channel_01.type_as(input_tensor) - 0.5, 2))
            ret = torch.cat([ret, rr], dim=1)

        return ret

Another way to implement it is to use "torch.meshgrid".
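A minimal sketch of the `torch.meshgrid` variant might look like the following. The function name `add_coords_meshgrid` is hypothetical, the input is assumed to have a floating-point dtype, and the `indexing="ij"` keyword assumes PyTorch 1.10 or newer.

```python
import torch


def add_coords_meshgrid(input_tensor):
    """Append xx and yy channels in [-1, 1] to a (batch, channel, x_dim, y_dim) tensor.

    Hypothetical helper for illustration; assumes a floating-point input.
    """
    batch_size, _, x_dim, y_dim = input_tensor.shape
    xs = torch.linspace(-1.0, 1.0, x_dim,
                        device=input_tensor.device, dtype=input_tensor.dtype)
    ys = torch.linspace(-1.0, 1.0, y_dim,
                        device=input_tensor.device, dtype=input_tensor.dtype)
    # "ij" indexing gives grids of shape (x_dim, y_dim), matching the input layout.
    xx, yy = torch.meshgrid(xs, ys, indexing="ij")
    # Stack into (2, x_dim, y_dim), then broadcast over the batch without copying.
    coords = torch.stack([xx, yy]).unsqueeze(0).expand(batch_size, -1, -1, -1)
    return torch.cat([input_tensor, coords], dim=1)


out = add_coords_meshgrid(torch.zeros(2, 3, 4, 6))
```

This version needs no transposes, and `expand` avoids materializing a separate copy of the grid for each batch element until the final concatenation.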

Error with code notation.

Hi @mkocabas,

There is a slight mismatch in the notation used for the dimensions extracted from the shape of the input tensor, here:

batch_size, _, x_dim, y_dim = input_tensor.size()

You extract x_dim from dimension 2 and y_dim from dimension 3. In PyTorch's 4D image tensors, height comes before width. Please refer to the documentation of PyTorch's Conv2d layer and note the input and output tensor shapes.

The code runs fine, since the extracted dimensions end up in the right place when the output tensor is constructed. But to someone reading the code to check the implementation, as I was, it looks like a mistake at first glance. Perhaps the notation could be changed to follow PyTorch's convention.
Thanks

Cheers!
@akanimax
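For readers checking the convention, a short sketch of the (N, C, H, W) layout that Conv2d expects is shown below. The tensor sizes and variable names are arbitrary choices for illustration.

```python
import torch
from torch import nn

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)

# Conv2d input layout is (N, C, H, W): here height is 32 and width is 48.
images = torch.zeros(2, 3, 32, 48)
batch_size, channels, height, width = images.shape

# With kernel_size=3 and padding=1 the spatial size is preserved: (2, 8, 32, 48).
features = conv(images)
```

Naming the unpacked dimensions `height` and `width` (rather than `x_dim` and `y_dim`) keeps the code aligned with this layout.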

When will you offer the Caffe code?

Training time increases linearly for each epoch.

I am currently using this implementation in one of my object detection models.
As the model trains, the batch processing time increases linearly. I still haven't figured out why.

Has anyone seen similar behavior?

Object Detection Example

Is there example code integrating this with Faster R-CNN, as mentioned in the paper?

It seems to me that this solution could have an impact on localization tasks applied to OCR. I'd be curious to see what effect it would have on the ICDAR 2013 Table Competition dataset.

See https://arxiv.org/pdf/1804.06236.pdf for reference.

channel errors

Hello,
In the code at

CoordConv-pytorch/CoordConv.py

Line 115 in d833f1e

self.conv = nn.Conv2d(in_channels + 2, out_channels, **kwargs)

if with_r == True, this should be in_channels + 3 instead of in_channels + 2.
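One way the channel count could be made to depend on with_r is sketched below. The class names `AddCoordsSketch` and `CoordConvSketch` are hypothetical stand-ins, not the repository's classes, and the radius formula here is only a placeholder; the point is the `in_channels + extra` computation.

```python
import torch
from torch import nn


class AddCoordsSketch(nn.Module):
    """Hypothetical stand-in that appends 2 (or 3, with_r=True) coordinate channels."""

    def __init__(self, with_r=False):
        super().__init__()
        self.with_r = with_r

    def forward(self, x):
        n, _, h, w = x.shape
        rows = torch.linspace(-1.0, 1.0, h, device=x.device, dtype=x.dtype)
        cols = torch.linspace(-1.0, 1.0, w, device=x.device, dtype=x.dtype)
        rows = rows.view(1, 1, h, 1).expand(n, 1, h, w)
        cols = cols.view(1, 1, 1, w).expand(n, 1, h, w)
        channels = [x, rows, cols]
        if self.with_r:
            # Placeholder radius from the grid center; the exact definition may differ.
            channels.append(torch.sqrt(rows ** 2 + cols ** 2))
        return torch.cat(channels, dim=1)


class CoordConvSketch(nn.Module):
    def __init__(self, in_channels, out_channels, with_r=False, **kwargs):
        super().__init__()
        self.addcoords = AddCoordsSketch(with_r=with_r)
        # Two coordinate channels, plus one more when the radius channel is enabled.
        extra = 3 if with_r else 2
        self.conv = nn.Conv2d(in_channels + extra, out_channels, **kwargs)

    def forward(self, x):
        return self.conv(self.addcoords(x))


layer = CoordConvSketch(3, 8, with_r=True, kernel_size=1)
out = layer(torch.zeros(2, 3, 4, 6))
```

With a fixed `in_channels + 2`, the same call with `with_r=True` would fail at the convolution because the input would carry one more channel than the layer expects.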

About the position of coord conv layers in the regression network?

Why is the range [-1, 1]?

Shifted radius channel?
