Comments (17)
@irexyc Yes you're right. For the net to utilize the "valid" padding strategy of convolutions, you may want to tile the (388x388) image to have a shape of 572x572 like Fig.2 from the paper (the word "resize" of my previous comment is kind of a misnomer here, and I use the model with the "tiled" CT scan images). This shows an example with mirror-padding. This may further clarify the I/O.
I think "bilinear for the input image & nearest-neighbor for the binary segmentation mask" is general practice, since bilinear gives a smoother, more natural interpolation for images, while the mask has to stay binary and so should not be interpolated.
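A minimal sketch of that practice, assuming PyTorch tensors in (N, C, H, W) layout. The helper name resize_pair is hypothetical and is not part of pytorch-semseg:

```python
import torch
import torch.nn.functional as F


def resize_pair(image, mask, size):
    """Resize an image/mask pair to `size` = (height, width).

    image: float tensor of shape (N, C, H, W)
    mask:  tensor of shape (N, 1, H, W) holding class ids (e.g. 0/1)
    """
    # Bilinear interpolation blends neighbouring pixels, which is fine for images.
    image_r = F.interpolate(image, size=size, mode="bilinear", align_corners=False)
    # Nearest-neighbour only copies existing values, so no new label values appear.
    mask_r = F.interpolate(mask.float(), size=size, mode="nearest")
    return image_r, mask_r.to(mask.dtype)


image = torch.rand(1, 3, 10, 10)
mask = (torch.rand(1, 1, 10, 10) > 0.5).long()
image_r, mask_r = resize_pair(image, mask, (20, 30))
```

After the resize, the mask still contains only the label values it started with, whereas a bilinear resize of the mask would produce fractional values along object borders.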
from pytorch-semseg.
Maybe late to the discussion, but since I've PR'd the u-net fix (#35, see issue #21), here are my comments.
A strict U-net implementation does not use padding (Fig. 1 in https://arxiv.org/pdf/1505.04597.pdf), which is why padding=0 is used instead of 1. Several other implementations follow this (TF#1, TF#2, note the "valid" padding). So the input size should be 572x572, and the output size should be 388x388.
So the easiest method would be to resize the input & output images to match the respective sizes.
Using padding wouldn't hurt since it nicely keeps the size, but it is not the exact architecture from the paper, so use it at your own risk when it comes to proper benchmarks.
A quick "fix" would be raising a readable error so as to match the I/O size, or giving an on/off switch for the padding.
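Both ideas could look roughly like the sketch below. It assumes PyTorch; the names ConvBlock, use_padding and check_label_size are hypothetical and do not come from the repository:

```python
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Two 3x3 convolutions with an on/off switch for padding (hypothetical)."""

    def __init__(self, in_size, out_size, use_padding=False):
        super().__init__()
        # padding=0 is the "valid" convolution from the paper (each conv loses 2 px);
        # padding=1 keeps the spatial size but is not the exact paper architecture.
        pad = 1 if use_padding else 0
        self.block = nn.Sequential(
            nn.Conv2d(in_size, out_size, kernel_size=3, stride=1, padding=pad),
            nn.ReLU(),
            nn.Conv2d(out_size, out_size, kernel_size=3, stride=1, padding=pad),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.block(x)


def check_label_size(output, label):
    """Raise a readable error when prediction and label sizes differ."""
    if tuple(output.shape[-2:]) != tuple(label.shape[-2:]):
        raise ValueError(
            "U-Net output is {}x{} but the label is {}x{}; crop or resize the "
            "label, or enable padding.".format(
                output.shape[-2], output.shape[-1], label.shape[-2], label.shape[-1]
            )
        )


x = torch.zeros(1, 3, 32, 32)
valid_out = ConvBlock(3, 8, use_padding=False)(x)   # spatial size shrinks by 4
padded_out = ConvBlock(3, 8, use_padding=True)(x)   # spatial size is preserved
```

Calling check_label_size(valid_out, x) raises the ValueError, while check_label_size(padded_out, x) passes silently.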
from pytorch-semseg.
setting padding to 1 instead of 0 worked for me.
from pytorch-semseg.
I also have the exact same issue. Can anyone help me out?
from pytorch-semseg.
TL;DR: Size inconsistency is NOT an issue of the U-Net implementation for the original version from the paper referenced above. The original paper used a mirror-tile strategy for input images to yield a desired output dimension.
Source: https://arxiv.org/pdf/1505.04597.pdf
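As a rough illustration of the mirroring step, here is a sketch that pads a 388x388 tile up to 572x572 by reflection. It assumes NumPy, and the helper name mirror_pad_tile is hypothetical. In the paper the mirroring only supplies missing context at the image border; interior tiles take real neighbouring pixels instead.

```python
import numpy as np


def mirror_pad_tile(tile, target=572):
    """Mirror-pad an (H, W) or (H, W, C) tile symmetrically to target x target."""
    h, w = tile.shape[:2]
    pad_h, pad_w = target - h, target - w
    if pad_h < 0 or pad_w < 0 or pad_h % 2 or pad_w % 2:
        raise ValueError(
            "cannot pad {}x{} symmetrically to {}x{}".format(h, w, target, target)
        )
    # Pad only the two spatial axes; leave any channel axis untouched.
    pads = [(pad_h // 2, pad_h // 2), (pad_w // 2, pad_w // 2)]
    pads += [(0, 0)] * (tile.ndim - 2)
    # mode="reflect" mirrors the content about the edge pixel without repeating it.
    return np.pad(tile, pads, mode="reflect")


tile = np.arange(388 * 388, dtype=np.float32).reshape(388, 388)
padded = mirror_pad_tile(tile)  # 92 mirrored pixels are added on every side
```

The network then sees the 572x572 array, and its 388x388 output lines up with the original, unpadded tile and its label.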
from pytorch-semseg.
Change padding in lines 174-183 in utils.py, unetConv2 function:
    if is_batchnorm:
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_size, out_size, 3, 1, 1), nn.BatchNorm2d(out_size), nn.ReLU()
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(out_size, out_size, 3, 1, 1), nn.BatchNorm2d(out_size), nn.ReLU()
        )
    else:
        self.conv1 = nn.Sequential(nn.Conv2d(in_size, out_size, 3, 1, 1), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(out_size, out_size, 3, 1, 1), nn.ReLU())
Make sure to check with the summary function that this is what you want to do.
from pytorch-semseg.
    def __getitem__(self, index):
        img_name = self.files[self.split][index]
        img_path = self.root + '/' + self.split + '/' + img_name
        lbl_path = self.root + '/' + self.split + 'annot/' + img_name
        print(img_path)
        print(lbl_path)
        # m is assumed to be scipy.misc, as imported by the loader
        img = m.imread(img_path)
        img = m.imresize(img, [360, 480], interp='nearest')  # add this line
        img = np.array(img, dtype=np.uint8)
        lbl = m.imread(lbl_path)
        lbl = m.imresize(lbl, [360, 480], interp='nearest')  # add this line
        lbl = np.array(lbl, dtype=np.int32)
        print(lbl.shape)
from pytorch-semseg.
This resizing of the image did not work for me; I still get the same error. Does the current implementation of unet work with (256,256)? If not, what image size should be used?
from pytorch-semseg.
I have the same problem. Did anyone find the solution?
from pytorch-semseg.
The problem is that unet does not have any padding in the convolution layers. So output size is not equal to input size. But the label size = input size.
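The size bookkeeping can be sketched in plain Python. This assumes the layout from the paper (four 2x2 poolings, two 3x3 convolutions per block, 2x up-convolutions); the function name unet_output_size is hypothetical:

```python
def unet_output_size(input_size, depth=4, padding=0):
    """Return the output side length of a paper-style U-Net for a square input.

    Raises ValueError when a pooling layer would see an odd or non-positive size.
    """
    # Each 3x3 convolution loses (2 - 2 * padding) pixels; a block has two of them.
    shrink = 2 * (2 - 2 * padding)
    size = input_size
    # Contracting path: conv block, then 2x2 max pooling.
    for level in range(depth):
        size -= shrink
        if size <= 0 or size % 2 != 0:
            raise ValueError(
                "input size {} gives size {} before pooling at level {}; "
                "it must be positive and even".format(input_size, size, level)
            )
        size //= 2
    # Bottleneck conv block.
    size -= shrink
    # Expanding path: 2x up-convolution, then conv block.
    for _ in range(depth):
        size = size * 2 - shrink
    if size <= 0:
        raise ValueError("input size {} is too small".format(input_size))
    return size


print(unet_output_size(572))             # 388, as in Fig. 1 of the paper
print(unet_output_size(256, padding=1))  # 256, padding keeps the size
```

With padding=0, an input of 256 raises the ValueError because it reaches a pooling layer with an odd size (57), so under these assumptions it is not a valid tile size for the unpadded network.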
from pytorch-semseg.
I'm aware of this issue: the U-net implementation doesn't support all resolutions. I need to fix this.
from pytorch-semseg.
@masahi OMG.. you are the winner..
It works fine, but I still need to look at the result images after training.
from pytorch-semseg.
@masahi
After training the unet, I ran validate.py, but the following error occurred.
from pytorch-semseg.
@JustWon that error is not related to your change in padding. look elsewhere.
from pytorch-semseg.
@L0SG Hi, thanks for your explanation.
I am confused about the input size and output size. According to the paper, it uses the overlap-tile strategy for segmentation of arbitrarily large images. Does it mean that we shouldn't resize the label image, but instead select part of the label image (388 x 388) and mirror the real image (388 x 388 -> 572 x 572)?
I am new to segmentation. Does changing the label size have little effect on the final accuracy? By the way, when we do data augmentation, should we use different resize methods for the input image and the label? (pytorch/vision#9 (comment) said the input image uses bilinear while the label uses nearest-neighbour)
from pytorch-semseg.
"setting padding to 1 instead of 0 worked for me."
Hello, I have the same problem! How do I set padding to 1?
from pytorch-semseg.
Can anyone find the solution? Please help me; I'm new to machine learning and am getting the same error.
from pytorch-semseg.