Comments (17)

L0SG avatar L0SG commented on May 28, 2024 4

@irexyc Yes, you're right. For the net to use the "valid" padding strategy of its convolutions, you may want to tile the 388x388 image up to a shape of 572x572, as in Fig. 2 of the paper (the word "resize" in my previous comment was something of a misnomer; I use the model with "tiled" CT scan images). This shows an example with mirror-padding. This may further clarify the I/O.
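As a minimal sketch of the mirror-padding idea (not code from pytorch-semseg or the paper; `reflect_index` and `mirror_pad` are hypothetical helper names, and the image is assumed to be a plain list of rows), a 388x388 tile can be grown to 572x572 by reflecting 92 pixels on every side, since (572 - 388) / 2 = 92:

```python
def reflect_index(i, n):
    """Mirror an out-of-range index back into [0, n) without repeating the edge pixel."""
    period = 2 * (n - 1)  # assumes n >= 2
    i = abs(i) % period
    return period - i if i >= n else i


def mirror_pad(image, pad):
    """Return `image` (a list of equal-length rows) mirror-padded by `pad` pixels per side."""
    h, w = len(image), len(image[0])
    return [
        [image[reflect_index(r, h)][reflect_index(c, w)] for c in range(-pad, w + pad)]
        for r in range(-pad, h + pad)
    ]


# Dummy 388x388 tile whose pixel values encode their own position.
tile = [[r * 388 + c for c in range(388)] for r in range(388)]
padded = mirror_pad(tile, pad=92)
print(len(padded), len(padded[0]))  # 572 572
```

With NumPy arrays, `np.pad(tile, 92, mode="reflect")` performs the same kind of reflection in one call.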

I think "bilinear for the input image and nearest-neighbor for the binary segmentation mask" is general practice: bilinear gives a smoother, more natural interpolation for images, whereas we want the mask to stay binary, so it should not be interpolated.
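To illustrate why the mask is usually resized with nearest-neighbor, here is a small one-dimensional sketch (the 2-D case applies the same idea along each axis). The helpers `resize_nearest` and `resize_linear` are hypothetical and written only for this illustration; they are not the resize routines of any particular library:

```python
def resize_nearest(values, new_len):
    """Nearest-neighbour resampling: every output is copied from one input sample."""
    n = len(values)
    return [values[min(n - 1, i * n // new_len)] for i in range(new_len)]


def resize_linear(values, new_len):
    """Linear resampling with aligned endpoints (assumes new_len >= 2)."""
    n = len(values)
    out = []
    for i in range(new_len):
        x = i * (n - 1) / (new_len - 1)  # fractional source position
        lo = int(x)
        hi = min(lo + 1, n - 1)
        t = x - lo
        out.append((1 - t) * values[lo] + t * values[hi])
    return out


mask = [0, 0, 1, 1]  # one row of a binary mask
print(resize_nearest(mask, 7))  # labels stay in {0, 1}
print(resize_linear(mask, 7))   # a fractional value appears at the class boundary
```

The linear version produces a value of 0.5 at the boundary between the two classes, which is no longer a valid label; the nearest-neighbor version only ever copies existing labels.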

from pytorch-semseg.

L0SG avatar L0SG commented on May 28, 2024 3

Maybe I'm late to the discussion, but since I've PR'd the U-Net fix (#35, see issue #21), here are my comments.

A strict U-Net implementation does not use padding (Fig. 1 in https://arxiv.org/pdf/1505.04597.pdf), which is why padding=0 rather than 1. Several other implementations follow this (TF#1, TF#2; note the "valid" padding). So the input size should be 572x572, and the output size should be 388x388.

So the easiest method would be to resize the input and output images to match their respective sizes.

Using padding wouldn't hurt, since it conveniently preserves the size, but it is not the exact architecture from the paper, so use it at your own risk where proper benchmarks are concerned.

A quick "fix" would be to raise a readable error so that users match the I/O size, or to provide an on/off switch for the padding.
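The size arithmetic and the "readable error" idea can be sketched as follows. This is only an illustration under stated assumptions, not the pytorch-semseg code: `unet_output_size` is a hypothetical helper, the layout (two 3x3 convolutions per level, four 2x2 max-pools, 2x up-convolutions) is taken from the paper, and the rule that every pooled map must have an even size follows the paper's advice on choosing tile sizes. A real framework may floor odd sizes instead of failing at that point.

```python
def unet_output_size(size, depth=4, padding=0):
    """Side length of the output map for a square input of side `size`.

    `padding` is the per-convolution padding: 0 is the "valid" setting
    from the paper, 1 keeps the size unchanged through each 3x3 conv.
    """
    loss = 2 * (2 - 2 * padding)  # pixels lost per level by its two 3x3 convs
    original = size
    for level in range(depth):  # contracting path
        size -= loss
        if size <= 0 or size % 2:
            raise ValueError(
                f"input {original}x{original} with padding={padding} is not usable: "
                f"pooling at level {level} would see a {size}x{size} map, "
                "but a positive, even size is required"
            )
        size //= 2
    size -= loss  # two convs at the bottom of the U
    if size <= 0:
        raise ValueError(f"input {original}x{original} is too small")
    for _ in range(depth):  # expanding path: upsample by 2, then two convs
        size = size * 2 - loss
    return size


print(unet_output_size(572))             # 388
print(unet_output_size(256, padding=1))  # 256
```

Under these assumptions a 256x256 input is rejected when padding=0 (the third pooling step would see a 57x57 map), while with padding=1 any side length divisible by 16 passes through unchanged.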

masahi avatar masahi commented on May 28, 2024 2

setting padding to 1 instead of 0 worked for me.

shehabk avatar shehabk commented on May 28, 2024 1

I have exactly the same issue. Can anyone help me out?

alar0330 avatar alar0330 commented on May 28, 2024 1

TL;DR: Size inconsistency is NOT an issue of the U-Net implementation for the original version from the paper referenced above. The original paper used a mirror-tile strategy for input images to yield a desired output dimension.

Source: https://arxiv.org/pdf/1505.04597.pdf

ckolluru avatar ckolluru commented on May 28, 2024 1

@lfdeep

Change the padding in lines 174-183 of utils.py, in the unetConv2 function:

if is_batchnorm:
    self.conv1 = nn.Sequential(
        nn.Conv2d(in_size, out_size, 3, 1, 1), nn.BatchNorm2d(out_size), nn.ReLU()
    )
    self.conv2 = nn.Sequential(
        nn.Conv2d(out_size, out_size, 3, 1, 1), nn.BatchNorm2d(out_size), nn.ReLU()
    )
else:
    self.conv1 = nn.Sequential(nn.Conv2d(in_size, out_size, 3, 1, 1), nn.ReLU())
    self.conv2 = nn.Sequential(nn.Conv2d(out_size, out_size, 3, 1, 1), nn.ReLU())

Make sure to check with the summary function that this is what you want to do.
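For reference, in `nn.Conv2d(in_size, out_size, 3, 1, 1)` the three trailing positional arguments are `kernel_size`, `stride` and `padding`. The effect of the padding argument on one spatial dimension can be checked with the output-size formula from the PyTorch documentation; `conv2d_out` below is a hypothetical helper written for this illustration, not part of PyTorch:

```python
def conv2d_out(size, kernel_size=3, stride=1, padding=0, dilation=1):
    """Output length of a 2-D convolution along one spatial dimension."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


print(conv2d_out(572, 3, 1, 0))  # 570: each unpadded 3x3 conv trims 2 pixels
print(conv2d_out(572, 3, 1, 1))  # 572: padding=1 keeps the size
```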

hexiangquan avatar hexiangquan commented on May 28, 2024

def __getitem__(self, index):
    # Assumes `m` is scipy.misc and `np` is numpy, as imported in the loader module.
    img_name = self.files[self.split][index]
    img_path = self.root + '/' + self.split + '/' + img_name
    lbl_path = self.root + '/' + self.split + 'annot/' + img_name
    print(img_path)
    print(lbl_path)
    img = m.imread(img_path)
    img = m.imresize(img, [360, 480], interp='nearest')  # add this line
    img = np.array(img, dtype=np.uint8)

    lbl = m.imread(lbl_path)
    lbl = m.imresize(lbl, [360, 480], interp='nearest')  # add this line
    lbl = np.array(lbl, dtype=np.int32)
    print(lbl.shape)

shehabk avatar shehabk commented on May 28, 2024

Resizing the image this way did not work for me; I still get the same error. Does the current U-Net implementation work with (256, 256) inputs? If not, what image size should be used?

bobbqe avatar bobbqe commented on May 28, 2024

I have the same problem. Did anyone find the solution?

mileyan avatar mileyan commented on May 28, 2024

The problem is that the U-Net has no padding in its convolution layers, so the output size is not equal to the input size, whereas the label size is equal to the input size.
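If the network is left unpadded, one common way to reconcile the two sizes is to center-crop the label to the output size. The sketch below is an assumption for illustration only (it is not what pytorch-semseg does, and `center_crop` is a hypothetical helper operating on a plain list of rows):

```python
def center_crop(image, out_h, out_w):
    """Return the central out_h x out_w region of `image` (a list of equal-length rows)."""
    h, w = len(image), len(image[0])
    if out_h > h or out_w > w:
        raise ValueError(f"cannot crop {h}x{w} to the larger size {out_h}x{out_w}")
    top = (h - out_h) // 2
    left = (w - out_w) // 2
    return [row[left:left + out_w] for row in image[top:top + out_h]]


# Dummy 572x572 label map whose values encode their own position.
label = [[r * 572 + c for c in range(572)] for r in range(572)]
cropped = center_crop(label, 388, 388)
print(len(cropped), len(cropped[0]))  # 388 388
```

The crop drops a 92-pixel border on each side, which matches the region a 572x572 unpadded U-Net actually predicts.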

meetps avatar meetps commented on May 28, 2024

I'm aware of this issue: the U-Net implementation doesn't support all resolutions. I need to fix this.

JustWon avatar JustWon commented on May 28, 2024

@masahi OMG.. you are the winner..
It works fine, but I still need to check the result images after training.

JustWon avatar JustWon commented on May 28, 2024

@masahi
After training the U-Net, I ran validate.py, but the following error occurred.

[image of the error, not preserved]

masahi avatar masahi commented on May 28, 2024

@JustWon That error is not related to your change in padding. Look elsewhere.

irexyc avatar irexyc commented on May 28, 2024

@L0SG Hi, thanks for your explanation.

I am confused about the input size and output size. According to the paper, it uses the overlap-tile strategy to segment arbitrarily large images. Does that mean we shouldn't resize the label image, but instead select a part of the label image (388 x 388) and mirror the real image (388 x 388 -> 572 x 572)?

I am new to segmentation. Does changing the label size have little effect on the final accuracy? By the way, when we do data augmentation, should we use different resize methods for the input image and the label? (pytorch/vision#9 (comment) said the input image uses bilinear while the label uses nearest-neighbour.)

lfdeep avatar lfdeep commented on May 28, 2024

setting padding to 1 instead of 0 worked for me.

Hello, I'm running into the same problem! How do I set padding to 1?

shariq-ali avatar shariq-ali commented on May 28, 2024

Has anyone found a solution? Please help me; I'm new to machine learning and am getting the same error.
