meetps / pytorch-semseg

Semantic Segmentation Architectures Implemented in PyTorch

Home Page: https://meetshah.dev/semantic-segmentation/deep-learning/pytorch/visdom/2017/06/01/semantic-segmentation-over-the-years.html

License: MIT License

Topics: pytorch, semantic-segmentation, deep-learning, fully-convolutional-networks

pytorch-semseg's Introduction

pytorch-semseg


Semantic Segmentation Algorithms Implemented in PyTorch

This repository aims at mirroring popular semantic segmentation architectures in PyTorch.

Networks implemented

  • PSPNet - With support for loading pretrained models w/o caffe dependency
  • ICNet - With optional batchnorm and pretrained models
  • FRRN - Model A and B
  • FCN - All 1 (FCN32s), 2 (FCN16s) and 3 (FCN8s) stream variants
  • U-Net - With optional deconvolution and batchnorm
  • Link-Net - With multiple resnet backends
  • Segnet - With Unpooling using Maxpool indices

Upcoming

DataLoaders implemented

  • Pascal VOC
  • CamVid
  • ADE20K
  • MIT Scene Parsing Benchmark
  • Cityscapes
  • NYUv2
  • SUN RGB-D
  • Mapillary Vistas

Requirements

  • pytorch >=0.4.0
  • torchvision ==0.2.0
  • scipy
  • tqdm
  • tensorboardX

One-line installation

pip install -r requirements.txt

Data

  • Download data for the desired dataset(s) from the list of URLs here.
  • Extract the zip / tar and modify the path appropriately in your config.yaml

Usage

Setup config file

# Model Configuration
model:
    arch: <name> [options: 'fcn[8,16,32]s, unet, segnet, pspnet, icnet, icnetBN, linknet, frrn[A,B]']
    <model_keyarg_1>:<value>

# Data Configuration
data:
    dataset: <name> [options: 'pascal, camvid, ade20k, mit_sceneparsing_benchmark, cityscapes, nyuv2, sunrgbd, vistas'] 
    train_split: <split_to_train_on>
    val_split: <split_to_validate_on>
    img_rows: 512
    img_cols: 1024
    path: <path/to/data>
    <dataset_keyarg1>:<value>

# Training Configuration
training:
    n_workers: 64
    train_iters: 35000
    batch_size: 16
    val_interval: 500
    print_interval: 25
    loss:
        name: <loss_type> [options: 'cross_entropy, bootstrapped_cross_entropy, multi_scale_crossentropy']
        <loss_keyarg1>:<value>

    # Optimizer Configuration
    optimizer:
        name: <optimizer_name> [options: 'sgd, adam, adamax, asgd, adadelta, adagrad, rmsprop']
        lr: 1.0e-3
        <optimizer_keyarg1>:<value>

        # Warmup LR Configuration
        warmup_iters: <iters for lr warmup>
        mode: <'constant' or 'linear' for warmup>
        gamma: <gamma for warm up>
       
    # Augmentations Configuration
    augmentations:
        gamma: x                                     #[gamma varied in 1 to 1+x]
        hue: x                                       #[hue varied in -x to x]
        brightness: x                                #[brightness varied in 1-x to 1+x]
        saturation: x                                #[saturation varied in 1-x to 1+x]
        contrast: x                                  #[contrast varied in 1-x to 1+x]
        rcrop: [h, w]                                #[crop of size (h,w)]
        translate: [dh, dw]                          #[reflective translation by (dh, dw)]
        rotate: d                                    #[rotate -d to d degrees]
        scale: [h,w]                                 #[scale to size (h,w)]
        ccrop: [h,w]                                 #[center crop of (h,w)]
        hflip: p                                     #[flip horizontally with chance p]
        vflip: p                                     #[flip vertically with chance p]

    # LR Schedule Configuration
    lr_schedule:
        name: <schedule_type> [options: 'constant_lr, poly_lr, multi_step, cosine_annealing, exp_lr']
        <scheduler_keyarg1>:<value>

    # Resume from checkpoint  
    resume: <path_to_checkpoint>
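
A concrete example with the pieces filled in (a hypothetical config; the split names, paths, and per-model keyword arguments here are illustrative, not the only valid values):

# config.yaml
model:
    arch: fcn8s
data:
    dataset: pascal
    train_split: train_aug
    val_split: val
    img_rows: 256
    img_cols: 256
    path: /datasets/VOCdevkit/VOC2012/
training:
    n_workers: 4
    train_iters: 35000
    batch_size: 8
    val_interval: 500
    print_interval: 25
    loss:
        name: cross_entropy
    optimizer:
        name: sgd
        lr: 1.0e-4
    lr_schedule:
        name: constant_lr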

To train the model:

python train.py [-h] [--config [CONFIG]] 

--config                Configuration file to use

To validate the model:

usage: validate.py [-h] [--config [CONFIG]] [--model_path [MODEL_PATH]]
                       [--eval_flip] [--measure_time]

  --config              Config file to be used
  --model_path          Path to the saved model
  --eval_flip           Enable evaluation with flipped image | True by default
  --measure_time        Enable evaluation with time (fps) measurement | True
                        by default

To test the model w.r.t. a dataset on custom image(s):

python test.py [-h] [--model_path [MODEL_PATH]] [--dataset [DATASET]]
               [--dcrf [DCRF]] [--img_path [IMG_PATH]] [--out_path [OUT_PATH]]
 
  --model_path          Path to the saved model
  --dataset             Dataset to use ['pascal, camvid, ade20k etc']
  --dcrf                Enable DenseCRF based post-processing
  --img_path            Path of the input image
  --out_path            Path of the output segmap
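
Putting the three scripts together, a typical session might look like this (the config filename, checkpoint name, and image paths are placeholders; the actual checkpoint filename depends on your run):

# train with a config file
python train.py --config configs/fcn8s_pascal.yml

# validate the checkpoint written during training
python validate.py --config configs/fcn8s_pascal.yml --model_path fcn8s_pascal_best_model.pkl

# segment a single custom image
python test.py --model_path fcn8s_pascal_best_model.pkl --dataset pascal --img_path street.jpg --out_path street_segmap.png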

If you find this code useful in your research, please consider citing:

@article{mshahsemseg,
    Author = {Meet P Shah},
    Title = {Semantic Segmentation Architectures Implemented in PyTorch.},
    Journal = {https://github.com/meetshah1995/pytorch-semseg},
    Year = {2017}
}

pytorch-semseg's People

Contributors

adam9500370, albanie, bombs-kim, helinwang, iacolippo, ibadami, josephreisinger, l0sg, lucasbrynte, lukasliebel, meetps, meetshah1995, vdevmcitylp, wdhorton, zzh8829


pytorch-semseg's Issues

SyntaxError: Missing parentheses in call to 'print' when running train.py

After data download and prerequisite installation, I ran commands as below:

root@2d5d0934e049:/home/yuanshuai/code/pytorch-semseg# python train.py \                            
> --arch fcn8s \
> --dataset pascal \
> --img_rows 224 \
> --img_cols 224 \
> --n_epoch 1 \
> --batch_size 1 \
> --l_rate 1 \
> --feature_scale 2
Traceback (most recent call last):
  File "train.py", line 14, in <module>
    from ptsemseg.loader import get_loader, get_data_path
  File "/home/yuanshuai/code/pytorch-semseg/ptsemseg/loader/__init__.py", line 3, in <module>
    from ptsemseg.loader.pascal_voc_loader import pascalVOCLoader
  File "/home/yuanshuai/code/pytorch-semseg/ptsemseg/loader/pascal_voc_loader.py", line 130
    print "Pre-encoding segmentation masks..."
                                             ^
SyntaxError: Missing parentheses in call to 'print'

Any advice or suggestions? Thanks in advance!

About ade20k_loader

Hi, thank you for your code!

I want to use ptsemseg/loader/ade20k_loader.py; in this function:

    def encode_segmap(self, mask):
        # Refer : http://groups.csail.mit.edu/vision/datasets/ADE20K/code/loadAde20K.m
        mask = mask.astype(int)
        label_mask = np.zeros((mask.shape[0], mask.shape[1]))
        label_mask = ( mask[:,:,0] / 10.0 ) * 256 + mask[:,:,1]
        return np.array(label_mask, dtype=np.uint8)

the label_mask contains elements larger than 256, so why do you use dtype=np.uint8? That means the ( mask[:,:,0] / 10.0 ) * 256 component will be discarded by the cast.
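
The wrap-around is easy to reproduce in isolation; a minimal sketch (plain numpy, independent of the loader):

    import numpy as np

    # class IDs above 255 do not survive a cast to uint8:
    # on typical platforms the values wrap around modulo 256
    label_mask = np.array([[300.0, 12.0], [256.0, 255.0]])
    print(np.array(label_mask, dtype=np.uint8))
    # [[ 44  12]
    #  [  0 255]]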

fcn input image padding

All three classes (fcn8s, fcn16s, fcn32s) have a padding of 100 on the first conv layer. Why use such a large padding?
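
For context, a back-of-the-envelope size calculation shows what goes wrong without the pad. A minimal sketch, assuming the standard VGG16 layout used by FCN (3x3 convs with pad 1 preserve size, five 2x2 ceil-mode poolings halve it, and the 7x7 fc6 conv with no padding removes 6 more pixels):

    def fc7_size(s, pad=0):
        # spatial size of the fc7 feature map for an s x s input
        s = s + 2 * pad
        for _ in range(5):
            s = (s + 1) // 2      # ceil-mode 2x2 pooling
        return s - 6              # 7x7 fc6 conv, no padding

    print(fc7_size(224))          # 1  -> barely survives
    print(fc7_size(192))          # 0  -> empty feature map, forward pass fails
    print(fc7_size(192, pad=100)) # 7  -> pad=100 keeps it valid for any input size

The score map is later upsampled and cropped back to the input size, so the oversized pad costs computation but not correctness.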

ade20k training error

File "/home/yakir/git/pytorch-semseg/train.py", line 107, in
train(args)

File "/home/yakir/git/pytorch-semseg/train.py", line 61, in train
loss = cross_entropy2d(outputs, labels, size_average=True)

File "ptsemseg/loss.py", line 18, in cross_entropy2d
loss /= mask.data.sum()

RuntimeError: cuda runtime error (59) : device-side assert triggered at /home/soumith/local/builder/wheel/pytorch-src/torch/lib/THC/generated/../generic/THCTensorMathReduce.cu:262

any idea what this is about?
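
For what it's worth, a device-side assert inside a cross-entropy loss is most often a label outside the valid range [0, n_classes) reaching the GPU kernel. A quick CPU-side check before the loss call can confirm (variable names here are illustrative):

    # labels: LongTensor of shape (n, h, w) from the ade20k loader
    assert labels.min() >= 0 and labels.max() < n_classes, \
        "label values out of range for the configured number of classes"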

Performance of frrnA on Cityscapes

Thank you for providing the code! I ran the frrnA model on Cityscapes, hoping to replicate the results reported in the paper (FRRN A: 0.630 mean IoU):
python train.py --arch frrnA --dataset cityscapes --img_rows 256 --img_cols 512 --n_epoch 35 --batch_size 3 --l_rate 0.001
I then continued training with learning rate 1e-4 for 10 epochs.

This is roughly the same training setup and hyperparameters as in the paper. However, I only got 0.50 mean IoU.
I'm wondering if I'm doing something wrong. Were you able to replicate the results?

AttributeError: 'torch.FloatTensor' object has no attribute 'ndim'

Howdy folks

relatively new to Python, numpy, PyTorch (come from C++ and Matlab).

I am using CUDA 9.1, Python 3.6, Torch 0.3.0.post4, running on Ubuntu 16.04 LTS and I am starting off as follows:

python train.py --arch segnet --dataset pascal --visdom True

And I get the error below when I want to invoke visdom (i.e. when I don't include --visdom True, I don't have this issue).

  File "/home/bart/anaconda3/envs/pytorch-semseg/lib/python3.6/site-packages/visdom/__init__.py", line 438, in line
    assert Y.ndim == 1 or Y.ndim == 2, 'Y should have 1 or 2 dim'
AttributeError: 'torch.FloatTensor' object has no attribute 'ndim'

Any suggestions anyone?

Thanks in advance

Galto

"pre_encoded" directory in Pascal dataset

Could you please give me a hint about what the pre_encoded folder/option in pascal_voc_loader.py refers to?
I do not have that folder or any .mat file that the loader wants to read from it.
Thanks.

Pretrained models?

Hi,
It would be super if pretrained models were provided!

Thank you!
Michael

Doesn't actually train.

Many thanks for a great open-source implementation of semantic segmentation in pytorch!

I'm trying to proceed through training 'segnet' model on 'pascal' dataset.
What I've done:

  1. installed pytorch: 0.2.0_4 and python: 2.7.13
  2. downloaded VOCtrainval_11-May-2012.tar from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#Development%20Kit
  3. downloaded "Semantic Boundaries Dataset and Benchmark" from http://home.bharathh.info/pubs/codes/SBD/download.html
  4. as stated in the Readme, extracted them and pointed to them in the config.json file
  5. started training process
  • started visdom server
python -m visdom.server

started training as:

python train.py --arch segnet --dataset pascal

training started successfully and looks to be going well.

after completion, it generated 100 segnet_pascal_1_%2d.pkl files

  6. So after that, I'm trying to test the newly trained model on simple pictures:
python test.py --model_path segnet_pascal_1_99.pkl --dataset pascal --img_path 2007_000033.jpg --out_path result_33.jpg

But the result is quite wrong: for some reason the output resolution differs, and the segmentation was not produced correctly.

Could you please give me some advice on what I'm doing wrong?

Many thanks,
Ivan

LinkNet implementation working?

Hello, was excited to see you reimplemented LinkNet in PyTorch.

Can you verify whether the model has been tested? I ran into a few issues, including that linknet is not included in the get_model() function in models/__init__.py and that a line in model/utils.py is broken. I seem to have fixed these two but am now running into CUDA bad-parameter errors. I wanted to ask first whether or not it's been tested (and if you could point me to a script that works with it) before diving in deeper. Thanks!

Pretrained models with ready to use demos

Let's list working pretrained models here, with YouTube demo videos if possible.

Please share download links on Google Drive or Dropbox; their download can be automated easily.

I would like to create live demos/tests like this one, with combinations of segmentation models and training datasets:

SSD Tensorflow based car detection and tracking demo for OSSDC.org VisionBasedACC PS3/PS4 simulator
https://youtu.be/dqnjHqwP68Y

Code ready to run, for free, in Google Colaboratory, with GPU acceleration at over 15 FPS on 720p YouTube videos:
https://github.com/OSSDC/OSSDC-VisionBasedACC/blob/master/object_detection/ossdc_vbacc_object_detection_notebook_colaboratory.ipynb

@meetshah1995 could you please share the pretrained model you used to generate the results in:
https://meetshah1995.github.io/semantic-segmentation/deep-learning/pytorch/visdom/2017/06/01/semantic-segmentation-over-the-years.html

Regarding segnet

Hi,
Regarding the segnet upsampling blocks segnetUp3() and segnetUp2(): shouldn't the code be

class segnetUp2(nn.Module):
    def __init__(self, in_size, out_size):
        super(segnetUp2, self).__init__()
        self.unpool = nn.MaxUnpool2d(2, 2)
        self.conv1 = conv2DBatchNormRelu(in_size, in_size, 3, 1, 1)
        self.conv2 = conv2DBatchNormRelu(in_size, out_size, 3, 1, 1)

    def forward(self, inputs, indices, output_shape):
        outputs = self.unpool(input=inputs, indices=indices, output_size=output_shape)
        outputs = self.conv1(outputs)
        outputs = self.conv2(outputs)
        return outputs

class segnetUp3(nn.Module):
    def __init__(self, in_size, out_size):
        super(segnetUp3, self).__init__()
        self.unpool = nn.MaxUnpool2d(2, 2)
        self.conv1 = conv2DBatchNormRelu(in_size, in_size, 3, 1, 1)
        self.conv2 = conv2DBatchNormRelu(in_size, in_size, 3, 1, 1)
        self.conv3 = conv2DBatchNormRelu(in_size, out_size, 3, 1, 1)

Because the Decoder Architecture according to the program is
Deconv3-512X3->Deconv3-256X3--->Deconv3-128X3--->Deconv3-64X2--->Deconv3-21x2

But the actual architecture is
Deconv3-512X3->Deconv3-512X3--->Deconv3-256X3--->Deconv3-128X2--->Deconv3-64x2

Bug in U-Net

Two things:

  • The layers being concatenated aren't as per the paper (concat before maxpooling).
  • Add support for deconvolution.

ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /update (Caused by <class 'socket.error'>

Exception in user code:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/visdom/__init__.py", line 262, in _send
    data=json.dumps(msg),
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 88, in post
    return request('post', url, data=data, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 455, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 558, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /update (Caused by <class 'socket.error'>: [Errno 111] Connection refused)

When I try to run train.py I get this error. Can you give me some suggestions?

weight initialization

Is it possible to initialize the network(s) with the nninit package, for example Unet? As of now the network is initialized with random weights, but xavier or he initialization may be a better choice.

Any idea if it is possible?

For simple networks:

    self.conv1 = nn.Conv2d(5, 10, (3, 3))
    nninit.xavier_uniform(self.conv1.weight, gain=np.sqrt(2))
    nninit.constant(self.conv1.bias, 0.1)

However, in the code you have shared, conv1 is defined as a string of operations:

    self.conv1 = nn.Sequential(nn.Conv2d(in_size, out_size, 3, 1, 0), nn.BatchNorm2d(out_size), nn.ReLU(),)

Would nninit work here?
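
One way to handle the nn.Sequential case is to walk the submodules and initialize by layer type. A minimal sketch using torch.nn.init (the in-tree successor of nninit; on pytorch older than 0.4 the functions have no trailing underscore):

    import math
    import torch.nn as nn
    import torch.nn.init as init

    def init_weights(model):
        # model.modules() recurses into nn.Sequential containers,
        # so every Conv2d is reached regardless of how it is wrapped
        for m in model.modules():
            if isinstance(m, nn.Conv2d):
                init.xavier_uniform_(m.weight, gain=math.sqrt(2))
                if m.bias is not None:
                    init.constant_(m.bias, 0.1)

    # usage: init_weights(unet)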

More faithful implementation of U-net?

Hi, thank you for the great work @meetshah1995! I discovered this repo while searching for U-net in pytorch.

While reviewing the code, I've found that there are some subtle differences from the reference U-net model in the paper (as mentioned in #1).

  1. Making the conv and maxpool outputs in the downsampling path separate variables would enable concatenating each upsampling layer with the tensor from before the corresponding maxpool:

    conv1 = self.conv1(inputs)
    maxpool1 = self.maxpool1(conv1)
    conv2 = self.conv2(maxpool1)
    maxpool2 = self.maxpool2(conv2)
    conv3 = self.conv3(maxpool2)
    maxpool3 = self.maxpool3(conv3)
    conv4 = self.conv4(maxpool3)
    maxpool4 = self.maxpool4(conv4)
    center = self.center(maxpool4)
    
    # upsampling with skip connection from downsampling layers
    up4 = self.up_concat4(conv4, center)
    up3 = self.up_concat3(conv3, up4)
    up2 = self.up_concat2(conv2, up3)
    up1 = self.up_concat1(conv1, up2)
    
  2. I think the paper doesn't use padding in conv (the unetConv2 class in the code). Changing the padding parameter from 1 to 0 would fix the concatenation dimension mismatch error too:

        if is_batchnorm:
            self.conv1 = nn.Sequential(nn.Conv2d(in_size, out_size, 3, 1, 0),
                                       nn.BatchNorm2d(out_size),
                                       nn.ReLU(),)
            self.conv2 = nn.Sequential(nn.Conv2d(out_size, out_size, 3, 1, 0),
                                       nn.BatchNorm2d(out_size),
                                       nn.ReLU(),)
        else:
            self.conv1 = nn.Sequential(nn.Conv2d(in_size, out_size, 3, 1, 0),
                                       nn.ReLU(),)
            self.conv2 = nn.Sequential(nn.Conv2d(out_size, out_size, 3, 1, 0),
                                       nn.ReLU(),)
  3. Assuming that the input image is square, cropping (i.e. F.pad with negative values for crop) would be
     padding = 2 * [offset // 2, offset // 2] ? (see the crop sketch after the code below)

  4. The stride size in the unetUp class is currently 1, but I think it should be 2 (which means upsample by factor 2):

        if is_deconv:
            # fix: add stride parameter of 2 (which is upsample by 2)
            self.up = nn.ConvTranspose2d(in_size, out_size, kernel_size=2, stride=2)
        else:
            self.up = nn.UpsamplingBilinear2d(scale_factor=2)
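
As an aside on point 3, F.pad does accept negative padding, which crops; a minimal sketch of the center crop needed before concatenation (square feature maps assumed, as above):

    import torch
    import torch.nn.functional as F

    down = torch.randn(1, 64, 64, 64)  # encoder tensor (skip connection)
    up = torch.randn(1, 64, 56, 56)    # upsampled decoder tensor

    offset = down.size(3) - up.size(3)                 # 8
    crop = [-(offset // 2), -(offset - offset // 2)] * 2
    down_cropped = F.pad(down, crop)                   # negative pad = crop
    print(down_cropped.shape)                          # torch.Size([1, 64, 56, 56])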

Currently testing the modified model in my repo and seeing the expected behavior with CT dataset.

What are your thoughts? I would PR if you think it's right. Thanks!

UserWarning: self and mask not broadcastable, but have the same number of elements

Hi,
I am using pytorch 0.2.0 to train my network. Due to the broadcasting change in this version, it usually emits warnings:

~/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py:468: UserWarning: self and mask not broadcastable, but have the same number of elements.  Falling back to deprecated pointwise behavior.
  return tensor.masked_select(mask)
~/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py:416: UserWarning: mask is not broadcastable to self, but they have the same number of elements.  Falling back to deprecated pointwise behavior.
  return tensor1.masked_scatter_(mask, tensor2)

Error: assert Y.ndim == 1 or Y.ndim == 2, 'Y should have 1 or 2 dim'

Hello there!

I am having a huge struggle installing pytorch-semseg. Is it supported on Windows x64?
I've installed everything on Windows with pytorch for python 3.6 and adapted every print statement to python 3 in the source code:

(pytorch) semseg>pip install -r requirements.txt
Requirement already satisfied: matplotlib==2.0.0 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from -r requirements.txt (line 1))
Requirement already satisfied: numpy==1.12.1 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from -r requirements.txt (line 2))
Requirement already satisfied: scipy in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from -r requirements.txt (line 3))
Requirement already satisfied: torchvision==0.1.7 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from -r requirements.txt (line 4))
Requirement already satisfied: tqdm==4.11.2 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from -r requirements.txt (line 5))
Requirement already satisfied: visdom==0.1.1 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from -r requirements.txt (line 6))
Requirement already satisfied: certifi in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from -r requirements.txt (line 7))
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=1.5.6 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from matplotlib==2.0.0->-r requirements.txt (line 1))
Requirement already satisfied: pytz in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from matplotlib==2.0.0->-r requirements.txt (line 1))
Requirement already satisfied: cycler>=0.10 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from matplotlib==2.0.0->-r requirements.txt (line 1))
Requirement already satisfied: six>=1.10 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from matplotlib==2.0.0->-r requirements.txt (line 1))
Requirement already satisfied: python-dateutil in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from matplotlib==2.0.0->-r requirements.txt (line 1))
Requirement already satisfied: torch in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from torchvision==0.1.7->-r requirements.txt (line 4))
Requirement already satisfied: pillow in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from torchvision==0.1.7->-r requirements.txt (line 4))
Requirement already satisfied: tornado in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from visdom==0.1.1->-r requirements.txt (line 6))
Requirement already satisfied: requests in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from visdom==0.1.1->-r requirements.txt (line 6))
Requirement already satisfied: pyzmq in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from visdom==0.1.1->-r requirements.txt (line 6))
Requirement already satisfied: pyyaml in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from torch->torchvision==0.1.7->-r requirements.txt (line 4))
Requirement already satisfied: olefile in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from pillow->torchvision==0.1.7->-r requirements.txt (line 4))
Requirement already satisfied: urllib3<1.23,>=1.21.1 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from requests->visdom==0.1.1->-r requirements.txt (line 6))
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from requests->visdom==0.1.1->-r requirements.txt (line 6))
Requirement already satisfied: idna<2.7,>=2.5 in c:\users\semseg\appdata\local\conda\conda\envs\pytorch\lib\site-packages (from requests->visdom==0.1.1->-r requirements.txt (line 6))

Seems good to me.

Started training and got error:

(pytorch) semseg>python train.py --arch segnet --dataset pascal --img_rows "512" --img_cols "512"
Traceback (most recent call last):
  File "train.py", line 109, in <module>
    train(args)
  File "train.py", line 36, in train
    legend=['Loss']))
  File "C:\Users\semseg\AppData\Local\conda\conda\envs\pytorch\lib\site-packages\visdom\__init__.py", line 438, in line
    assert Y.ndim == 1 or Y.ndim == 2, 'Y should have 1 or 2 dim'
AttributeError: 'torch.FloatTensor' object has no attribute 'ndim'

Can anyone tell me what I am doing wrong, and/or with which configuration (python27 or python36, CUDA 7/8, cuDNN 5/6) I should install pytorch-semseg?

model preprocessing

Thanks for sharing this code.

Just a brief note that the mean subtraction used in the dataset loaders (e.g. here) (which was similarly used in the original FCN model here) is good for training models from scratch. However, if the segmentation model is initialized from a pytorch pretrained model (such as for the vgg16 model in fcn training code here), it may be preferable to perform the preprocessing used to train that model. The pytorch pretrained models use preprocessing that involves loading the input image in the range [0,1], then performing both mean subtraction and division by rgb standard deviation:

from torchvision import transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

(This wouldn't have affected the original FCN caffe code, since they use a vgg16 model that was trained with a very similar mean subtraction, but may be more significant with the different pre-processing used by the pytorch models).
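
For completeness, a sketch of that preprocessing end to end (standard torchvision; the filename is a placeholder):

    from PIL import Image
    from torchvision import transforms

    preprocess = transforms.Compose([
        transforms.ToTensor(),  # PIL image -> float tensor in [0, 1], CxHxW
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    img = preprocess(Image.open('input.jpg').convert('RGB'))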

memory efficiency

I wonder what the memory efficiency of pytorch is compared to caffe for image segmentation.
Will pytorch be more memory efficient?

Module Not Present

Running train.py throws the following error:

Traceback (most recent call last):
  File "train.py", line 13, in <module>
    from ptsemseg.models import get_model
  File "/home/du4/14CS30025/sangeet/segmentation/pytorch-semseg/ptsemseg/models/__init__.py", line 8, in <module>
    from ptsemseg.models.frrn import *
ImportError: No module named frrn

Size inconsistency in U-Net implementation.

When I train the unet model, I get this error:
RuntimeError: inconsistent tensor sizes at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMath.cu:141

My input image size is 256x256.

Issue in testing script

I have trained segnet on the camvid dataset, and while testing I came across the following issue:

python test.py --model_path segnet_camvid_1_8.pkl --dataset camvid --img_path /home/ubuntu/workspace/SegNet-Tutorial/CamVid/val/0016E5_08105.png --out_path out.png
Read Input Image from : /home/ubuntu/workspace/SegNet-Tutorial/CamVid/val/0016E5_08105.png
Traceback (most recent call last):
  File "test.py", line 67, in <module>
    test(args)
  File "test.py", line 50, in test
    pred = np.squeeze(outputs.data.max(1)[1].cpu().numpy(), axis=1)
  File "/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 1198, in squeeze
    return squeeze(axis=axis)
ValueError: cannot select an axis to squeeze out which has size not equal to one

Could someone please provide a workaround for the issue? Thank you.

Possible missing function in camvid_loader.py

Dear,
Thank you so much for the wonderful code.
Are the following functions required by camvid_loader.py?
Kind Regards,
Donghao

def get_camvid_labels(self):
    return np.asarray([[128, 128, 128], [128, 0, 0], [192, 192, 128], [255, 69, 0],
                       [128, 64, 128], [60, 40, 222], [128, 128, 0], [192, 128, 128],
                       [64, 64, 128], [64, 0, 128], [64, 64, 0], [0, 128, 192], [0, 0, 0]])

def encode_segmap(self, mask):
    mask = mask.astype(int)
    label_mask = np.zeros((mask.shape[0], mask.shape[1]), dtype=np.int16)
    for i, label in enumerate(self.get_camvid_labels()):
        label_mask[np.where(np.all(mask == label, axis=-1))[:2]] = i
    label_mask = label_mask.astype(int)
    return label_mask

AttributeError: 'numpy.ndarray' object has no attribute 'unsqueeze'

I am having big-time trouble running this A.I. with CUDA 7.5 and cudnn 6 on Ubuntu 16.04 with python 2.7:

python train.py --arch segnet --batch_size 1 --img_rows 469 --img_cols 469 --dataset pascal
True: print(torch.backends.cudnn.is_acceptable(torch.cuda.FloatTensor(0)))
6021: print(torch.backends.cudnn.version())
Cuda available.: torch.cuda.is_available()

Traceback (most recent call last):
  File "train.py", line 108, in <module>
    train(args)
  File "train.py", line 41, in train
    test_image = Variable(test_image.unsqueeze(0).cuda(0))
AttributeError: 'numpy.ndarray' object has no attribute 'unsqueeze'

Any help would be much appreciated.

Modify Loss function

The 2D categorical cross-entropy as currently implemented assumes the 0th index is the default background (ignore label). Make the ignore label a configurable argument and an attribute of the dataset.
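
A minimal sketch of what the configurable version could look like (F.cross_entropy already supports ignore_index; the function name mirrors the repo's loss, but the signature here is illustrative, not the repo's current one):

    import torch.nn.functional as F

    def cross_entropy2d(input, target, weight=None, ignore_index=0):
        # input: (n, c, h, w) logits; target: (n, h, w) class indices
        n, c, h, w = input.size()
        input = input.permute(0, 2, 3, 1).contiguous().view(-1, c)
        target = target.view(-1)
        return F.cross_entropy(input, target, weight=weight,
                               ignore_index=ignore_index)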

Feature scaling in Linknet

Hi,
Thanks for your work, first of all. I was just wondering whether you got any speed gain from the feature_scale you used in linknet, which reduces the filter sizes specified in the actual linknet paper. If not, could you explain why you used feature_scale?

The linknet performance?

Can you reproduce the mIoU of 76% from the linknet paper?
I trained it for more than 500 epochs, but only got less than 60% mIoU.

Camvid Segnet unlabelled problem

I use this code to train segnet on camvid, but the results can't reach the performance in the paper.
I found that you didn't use median-frequency balancing for training.
Moreover, the unlabelled class is '11', not '<0', so you didn't ignore unlabelled pixels during training and validation.
Your segnet will output the unlabelled class, which is weird. You should change n_class=12 to n_class=11.

low miou when using pspnet and voc dataset

Hi, thanks for your excellent work!

But I got a low mIoU when I used pspnet with the pretrained model on the pascal voc 2012 val dataset.

Here are my results:
Overall Acc: 0.651600544826
Mean IoU : 0.0329240353333
Mean Acc : 0.0462086706718
FreqW Acc : 0.495069499643

When I validate on the same data with the original caffe version, I get an mIoU of about 91%.

Could you help me solve this problem? Thanks a lot!

test.py error

An error occurs with the code below in test.py:

pred = np.squeeze(outputs.data.max(1)[1].cpu().numpy(), axis=1)
decoded = loader.decode_segmap(pred[0])

I use python3.5, pytorch 0.2.0 and numpy 1.13.1.
It works with the changes below:
pred = np.squeeze(outputs.data.max(1)[1].cpu().numpy(), axis=0)
decoded = loader.decode_segmap(pred)

Reporting benchmark scores?

Hey,

Nice effort! Are you maintaining benchmark scores for all the models? I am interested to know what the numbers look like for the models you have implemented.

Thanks.
