
qubvel / segmentation_models.pytorch

Segmentation models with pretrained backbones. PyTorch.

License: MIT License

Python 99.84% Dockerfile 0.02% Makefile 0.14%
segmentation image-processing pspnet linknet unet unet-pytorch pytorch fpn models imagenet

segmentation_models.pytorch's Introduction

Python library with Neural Networks for Image Segmentation based on PyTorch.


The main features of this library are:

  • High level API (just two lines to create a neural network)
  • 9 model architectures for binary and multi-class segmentation (including the legendary Unet)
  • 124 available encoders (and 500+ encoders from timm)
  • All encoders have pre-trained weights for faster and better convergence
  • Popular metrics and losses for training routines

Visit the Read The Docs Project Page or read the following README to learn more about the Segmentation Models PyTorch (SMP for short) library

📋 Table of contents

  1. Quick start
  2. Examples
  3. Models
    1. Architectures
    2. Encoders
    3. Timm Encoders
  4. Models API
    1. Input channels
    2. Auxiliary classification output
    3. Depth
  5. Installation
  6. Competitions won with the library
  7. Contributing
  8. Citing
  9. License

⏳ Quick start

1. Create your first Segmentation model with SMP

A segmentation model is just a PyTorch nn.Module, which can be created as easily as:

import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet34",        # choose encoder, e.g. mobilenet_v2 or efficientnet-b7
    encoder_weights="imagenet",     # use `imagenet` pre-trained weights for encoder initialization
    in_channels=1,                  # model input channels (1 for gray-scale images, 3 for RGB, etc.)
    classes=3,                      # model output channels (number of classes in your dataset)
)
  • see table with available model architectures
  • see table with available encoders and their corresponding weights
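A quick smoke test of the freshly created model (a minimal sketch; the 256x256 size is an arbitrary choice, and the model returns raw logits by default):

import torch

x = torch.randn(1, 1, 256, 256)  # in_channels=1 as configured above
mask = model(x)
print(mask.shape)  # torch.Size([1, 3, 256, 256]) -- classes=3 output channels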

2. Configure data preprocessing

All encoders have pretrained weights. Preparing your data the same way as during weight pre-training may give you better results (a higher metric score and faster convergence). This is not necessary if you train the whole model, not only the decoder.

from segmentation_models_pytorch.encoders import get_preprocessing_fn

preprocess_input = get_preprocessing_fn('resnet18', pretrained='imagenet')
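For illustration, a minimal sketch of applying it, assuming an HWC RGB image as a numpy array (the returned callable normalizes with the encoder's mean/std):

import numpy as np

image = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)  # stand-in for a real image
image = preprocess_input(image)  # normalized float array, same HWC layout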

Congratulations! You are done! Now you can train your model with your favorite framework!

💡 Examples

  • Training a model for binary pet segmentation with PyTorch Lightning: notebook and Open In Colab
  • Training a model for car segmentation on the CamVid dataset here.
  • Training an SMP model with Catalyst (a high-level framework for PyTorch), TTAch (a TTA library for PyTorch) and Albumentations (a fast image augmentation library) - here Open In Colab
  • Training an SMP model with the PyTorch Lightning framework - here (binary clothes segmentation by @ternaus).

📦 Models

Architectures

Encoders

The following is a list of the encoders supported in SMP. Select the appropriate family of encoders, click to expand the table, and pick a specific encoder with its pre-trained weights (the encoder_name and encoder_weights parameters).

ResNet
Encoder Weights Params, M
resnet18 imagenet / ssl / swsl 11M
resnet34 imagenet 21M
resnet50 imagenet / ssl / swsl 23M
resnet101 imagenet 42M
resnet152 imagenet 58M
ResNeXt
Encoder Weights Params, M
resnext50_32x4d imagenet / ssl / swsl 22M
resnext101_32x4d ssl / swsl 42M
resnext101_32x8d imagenet / instagram / ssl / swsl 86M
resnext101_32x16d instagram / ssl / swsl 191M
resnext101_32x32d instagram 466M
resnext101_32x48d instagram 826M
ResNeSt
Encoder Weights Params, M
timm-resnest14d imagenet 8M
timm-resnest26d imagenet 15M
timm-resnest50d imagenet 25M
timm-resnest101e imagenet 46M
timm-resnest200e imagenet 68M
timm-resnest269e imagenet 108M
timm-resnest50d_4s2x40d imagenet 28M
timm-resnest50d_1s4x24d imagenet 23M
Res2Ne(X)t
Encoder Weights Params, M
timm-res2net50_26w_4s imagenet 23M
timm-res2net101_26w_4s imagenet 43M
timm-res2net50_26w_6s imagenet 35M
timm-res2net50_26w_8s imagenet 46M
timm-res2net50_48w_2s imagenet 23M
timm-res2net50_14w_8s imagenet 23M
timm-res2next50 imagenet 22M
RegNet(x/y)
Encoder Weights Params, M
timm-regnetx_002 imagenet 2M
timm-regnetx_004 imagenet 4M
timm-regnetx_006 imagenet 5M
timm-regnetx_008 imagenet 6M
timm-regnetx_016 imagenet 8M
timm-regnetx_032 imagenet 14M
timm-regnetx_040 imagenet 20M
timm-regnetx_064 imagenet 24M
timm-regnetx_080 imagenet 37M
timm-regnetx_120 imagenet 43M
timm-regnetx_160 imagenet 52M
timm-regnetx_320 imagenet 105M
timm-regnety_002 imagenet 2M
timm-regnety_004 imagenet 3M
timm-regnety_006 imagenet 5M
timm-regnety_008 imagenet 5M
timm-regnety_016 imagenet 10M
timm-regnety_032 imagenet 17M
timm-regnety_040 imagenet 19M
timm-regnety_064 imagenet 29M
timm-regnety_080 imagenet 37M
timm-regnety_120 imagenet 49M
timm-regnety_160 imagenet 80M
timm-regnety_320 imagenet 141M
GERNet
Encoder Weights Params, M
timm-gernet_s imagenet 6M
timm-gernet_m imagenet 18M
timm-gernet_l imagenet 28M
SE-Net
Encoder Weights Params, M
senet154 imagenet 113M
se_resnet50 imagenet 26M
se_resnet101 imagenet 47M
se_resnet152 imagenet 64M
se_resnext50_32x4d imagenet 25M
se_resnext101_32x4d imagenet 46M
SK-ResNe(X)t
Encoder Weights Params, M
timm-skresnet18 imagenet 11M
timm-skresnet34 imagenet 21M
timm-skresnext50_32x4d imagenet 25M
DenseNet
Encoder Weights Params, M
densenet121 imagenet 6M
densenet169 imagenet 12M
densenet201 imagenet 18M
densenet161 imagenet 26M
Inception
Encoder Weights Params, M
inceptionresnetv2 imagenet / imagenet+background 54M
inceptionv4 imagenet / imagenet+background 41M
xception imagenet 22M
EfficientNet
Encoder Weights Params, M
efficientnet-b0 imagenet 4M
efficientnet-b1 imagenet 6M
efficientnet-b2 imagenet 7M
efficientnet-b3 imagenet 10M
efficientnet-b4 imagenet 17M
efficientnet-b5 imagenet 28M
efficientnet-b6 imagenet 40M
efficientnet-b7 imagenet 63M
timm-efficientnet-b0 imagenet / advprop / noisy-student 4M
timm-efficientnet-b1 imagenet / advprop / noisy-student 6M
timm-efficientnet-b2 imagenet / advprop / noisy-student 7M
timm-efficientnet-b3 imagenet / advprop / noisy-student 10M
timm-efficientnet-b4 imagenet / advprop / noisy-student 17M
timm-efficientnet-b5 imagenet / advprop / noisy-student 28M
timm-efficientnet-b6 imagenet / advprop / noisy-student 40M
timm-efficientnet-b7 imagenet / advprop / noisy-student 63M
timm-efficientnet-b8 imagenet / advprop 84M
timm-efficientnet-l2 noisy-student 474M
timm-efficientnet-lite0 imagenet 4M
timm-efficientnet-lite1 imagenet 5M
timm-efficientnet-lite2 imagenet 6M
timm-efficientnet-lite3 imagenet 8M
timm-efficientnet-lite4 imagenet 13M
MobileNet
Encoder Weights Params, M
mobilenet_v2 imagenet 2M
timm-mobilenetv3_large_075 imagenet 1.78M
timm-mobilenetv3_large_100 imagenet 2.97M
timm-mobilenetv3_large_minimal_100 imagenet 1.41M
timm-mobilenetv3_small_075 imagenet 0.57M
timm-mobilenetv3_small_100 imagenet 0.93M
timm-mobilenetv3_small_minimal_100 imagenet 0.43M
DPN
Encoder Weights Params, M
dpn68 imagenet 11M
dpn68b imagenet+5k 11M
dpn92 imagenet+5k 34M
dpn98 imagenet 58M
dpn107 imagenet+5k 84M
dpn131 imagenet 76M
VGG
Encoder Weights Params, M
vgg11 imagenet 9M
vgg11_bn imagenet 9M
vgg13 imagenet 9M
vgg13_bn imagenet 9M
vgg16 imagenet 14M
vgg16_bn imagenet 14M
vgg19 imagenet 20M
vgg19_bn imagenet 20M
Mix Vision Transformer

Backbone from SegFormer pretrained on ImageNet! Can be used with other decoders from the package; you can combine Mix Vision Transformer with Unet, FPN, and others!

Limitations:

  • the encoder is not supported by Linknet and Unet++
  • the encoder is supported by FPN only for encoder depth = 5
Encoder Weights Params, M
mit_b0 imagenet 3M
mit_b1 imagenet 13M
mit_b2 imagenet 24M
mit_b3 imagenet 44M
mit_b4 imagenet 60M
mit_b5 imagenet 81M
MobileOne

Apple's "sub-one-ms" Backbone pretrained on Imagenet! Can be used with all decoders.

Note: In the official github repo the s0 variant has additional num_conv_branches, leading to more params than s1.

Encoder Weights Params, M
mobileone_s0 imagenet 4.6M
mobileone_s1 imagenet 4.0M
mobileone_s2 imagenet 6.5M
mobileone_s3 imagenet 8.8M
mobileone_s4 imagenet 13.6M

* ssl, swsl - semi-supervised and weakly-supervised learning on ImageNet (repo).

Timm Encoders

docs

PyTorch Image Models (a.k.a. timm) has a lot of pretrained models and an interface which allows using these models as encoders in SMP; however, not all models are supported:

  • not all transformer models have the features_only functionality required for an encoder
  • some models have inappropriate strides

Total number of supported encoders: 549
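Timm encoders are selected with a tu- prefix on the timm model name (a minimal sketch; tu-resnest14d is just an example name from the timm registry):

import segmentation_models_pytorch as smp

model = smp.Unet("tu-resnest14d", encoder_weights="imagenet", in_channels=3, classes=1)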

πŸ” Models API

  • model.encoder - pretrained backbone to extract features of different spatial resolutions
  • model.decoder - depends on the model's architecture (Unet/Linknet/PSPNet/FPN)
  • model.segmentation_head - last block to produce the required number of mask channels (also includes optional upsampling and activation)
  • model.classification_head - optional block which creates a classification head on top of the encoder
  • model.forward(x) - sequentially passes x through the model's encoder, decoder and segmentation head (and classification head if specified)
Input channels

The in_channels parameter allows you to create models which process tensors with an arbitrary number of channels. If you use pretrained weights from ImageNet, the weights of the first convolution will be reused. For the 1-channel case it would be a sum of the weights of the first convolution layer; otherwise channels would be populated with weights like new_weight[:, i] = pretrained_weight[:, i % 3] and then scaled with new_weight * 3 / new_in_channels.

model = smp.FPN('resnet34', in_channels=1)
mask = model(torch.ones([1, 1, 64, 64]))
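To make the rule above concrete, here is a standalone sketch of the described weight population (illustrative only, not the library's actual code):

import torch

pretrained_weight = torch.randn(64, 3, 7, 7)  # first conv of an ImageNet backbone
new_in_channels = 8

new_weight = torch.empty(64, new_in_channels, 7, 7)
for i in range(new_in_channels):
    new_weight[:, i] = pretrained_weight[:, i % 3]  # cycle the RGB kernels
new_weight = new_weight * 3.0 / new_in_channels     # rescale to preserve activation magnitude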
Auxiliary classification output

All models support the aux_params parameter, which defaults to None. If aux_params = None, the classification auxiliary output is not created; otherwise the model produces not only a mask but also a label output with shape NC. The classification head consists of GlobalPooling->Dropout(optional)->Linear->Activation(optional) layers, which can be configured by aux_params as follows:

aux_params=dict(
    pooling='avg',             # one of 'avg', 'max'
    dropout=0.5,               # dropout ratio, default is None
    activation='sigmoid',      # activation function, default is None
    classes=4,                 # define number of output labels
)
model = smp.Unet('resnet34', classes=4, aux_params=aux_params)
mask, label = model(x)
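With this configuration, the forward pass returns two tensors (a quick sketch; the 256x256 input size is arbitrary):

import torch

x = torch.randn(2, 3, 256, 256)
mask, label = model(x)
print(mask.shape)   # torch.Size([2, 4, 256, 256])  N x classes x H x W
print(label.shape)  # torch.Size([2, 4])             N x aux classes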
Depth

The encoder_depth parameter specifies the number of downsampling operations in the encoder, so you can make your model lighter by specifying a smaller depth.

model = smp.Unet('resnet34', encoder_depth=4)
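Note that, depending on the version, the decoder may also require a matching number of channel specifications, one per decoder block (a hedged sketch; the channel values are arbitrary):

model = smp.Unet(
    'resnet34',
    encoder_depth=4,
    decoder_channels=(256, 128, 64, 32),  # one entry per decoder block
)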

🛠 Installation

PyPI version:

$ pip install segmentation-models-pytorch

Latest version from source:

$ pip install git+https://github.com/qubvel/segmentation_models.pytorch

πŸ† Competitions won with the library

The Segmentation Models package is widely used in image segmentation competitions. Here you can find competitions, the names of the winners, and links to their solutions.

🤝 Contributing

Install SMP

make install_dev  # create .venv, install SMP in dev mode

Run tests and code checks

make all          # run flake8, black, tests

Update table with encoders

make table        # generate table with encoders and print to stdout

πŸ“ Citing

@misc{Iakubovskii:2019,
  Author = {Pavel Iakubovskii},
  Title = {Segmentation Models Pytorch},
  Year = {2019},
  Publisher = {GitHub},
  Journal = {GitHub repository},
  Howpublished = {\url{https://github.com/qubvel/segmentation_models.pytorch}}
}

🛡️ License

The project is distributed under the MIT License.


segmentation_models.pytorch's Issues

Image size issues

I am currently using a UNET from this package and have some weird errors with respect to image size that I don't fully understand.

If I use an image size of 320x480 everything works fine, but when I switch to e.g. 350x525 I get the following error. Certain image sizes seem to work and certain don't. For example 640x960 works, but 160x240 does not. Any ideas?

~/anaconda3/envs/xxx/lib/python3.7/site-packages/segmentation_models_pytorch/base/encoder_decoder.py in forward(self, x)
     23         """Sequentially pass `x` trough model`s `encoder` and `decoder` (return logits!)"""
     24         x = self.encoder(x)
---> 25         x = self.decoder(x)
     26         return x
     27 

~/anaconda3/envs/xxx/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

~/anaconda3/envs/xxx/lib/python3.7/site-packages/segmentation_models_pytorch/unet/decoder.py in forward(self, x)
     93             encoder_head = self.center(encoder_head)
     94 
---> 95         x = self.layer1([encoder_head, skips[0]])
     96         x = self.layer2([x, skips[1]])
     97         x = self.layer3([x, skips[2]])

~/anaconda3/envs/xxx/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

~/anaconda3/envs/xxx/lib/python3.7/site-packages/segmentation_models_pytorch/unet/decoder.py in forward(self, x)
     26         x = F.interpolate(x, scale_factor=2, mode='nearest')
     27         if skip is not None:
---> 28             x = torch.cat([x, skip], dim=1)
     29             x = self.attention1(x)
     30 

~/anaconda3/envs/xxx/lib/python3.7/site-packages/apex/amp/wrap.py in wrapper(seq, *args, **kwargs)
     83             cast_seq = utils.casted_args(maybe_float,
     84                                          seq, {})
---> 85             return orig_fn(cast_seq, *args, **kwargs)
     86         else:
     87             # TODO: other mixed-type cases aren't due to amp.

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 33 and 34 in dimension 3 at /opt/conda/conda-bld/pytorch_1556653114079/work/aten/src/THC/generic/THCTensorMath.cu:71
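A note on the likely cause (an inference from the trace above, not a maintainer's confirmed answer): with the default encoder depth of 5, both spatial dimensions must be divisible by 2^5 = 32 so that the decoder's upsampled feature maps line up with the encoder skips. 320x480 and 640x960 satisfy this; 350x525 and 160x240 do not (525 and 240 are not multiples of 32). A minimal padding workaround:

import torch
import torch.nn.functional as F

def pad_to_multiple(x, multiple=32):
    """Zero-pad an NCHW tensor so H and W are divisible by `multiple`."""
    h, w = x.shape[-2:]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    # F.pad takes (left, right, top, bottom) for the last two dims
    return F.pad(x, (0, pad_w, 0, pad_h))

x = pad_to_multiple(torch.randn(1, 3, 350, 525))
print(x.shape)  # torch.Size([1, 3, 352, 544])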

multiclass jaccard loss

Hi,

I am training unet for multi-class segmentation problem. There are 3 classes and I would like to update weights based on class 1 and 2 (leave out the background class 0).

I call the loss function as -

MulticlassJaccardLoss(weight=[2,10], classes=[1,2], from_logits=False)

My MulticlassJaccardLoss class-

from typing import List

import torch
from torch import Tensor
from torch.nn.modules.loss import _Loss
# soft_jaccard_score is assumed importable from the author's own utilities (not shown in the issue)

class MulticlassJaccardLoss(_Loss):
    """Implementation of Jaccard loss for multiclass (semantic) image segmentation task
    """
    __name__ = 'mc_jaccard_loss'
    def __init__(self, classes: List[int] = None, from_logits=True, weight=[2,6], reduction='elementwise_mean'):
        super(MulticlassJaccardLoss, self).__init__(reduction=reduction)
        self.classes = classes
        self.from_logits = from_logits
        self.weight = weight

    def forward(self, y_pred: Tensor, y_true: Tensor) -> Tensor:
        """
        :param y_pred: NxCxHxW
        :param y_true: NxHxW
        :return: scalar
        """
        if self.from_logits:
            y_pred = y_pred.softmax(dim=1)

        n_classes = y_pred.size(1)
        smooth = 1e-3
        
        if self.classes is None:
            classes = range(n_classes)
        else:
            classes = self.classes
            n_classes = len(classes)

        loss = torch.zeros(n_classes, dtype=torch.float, device=y_pred.device)
        print(loss.shape)
        

        if self.weight is None:
            weights = [1] * n_classes
        else:
            weights = self.weight

        for class_index, weight in zip(classes, weights):

            jaccard_target = (y_true == class_index).float()
            jaccard_output = y_pred[:, class_index, ...]

            num_preds = jaccard_target.long().sum()

            if num_preds == 0:
                loss[class_index-1] = 0 #custom
            else:
                iou = soft_jaccard_score(jaccard_output, jaccard_target, from_logits=False, smooth=smooth)
                loss[class_index-1] = (1.0 - iou) * weight #custom

        if self.reduction == 'elementwise_mean':
            return loss.mean()

        if self.reduction == 'sum':
            return loss.sum()

        return loss

When I train the model, the model gets trained for few iterations and I get the following error,

element 0 of tensors does not require grad and does not have a grad_fn
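A hedged reading of the failure, based only on the code above: loss is created with torch.zeros and has no grad_fn; a graph-connected value is only written into it in the else branch. If on some batch every selected class hits the num_preds == 0 branch, loss remains a plain constant tensor and backward() raises exactly this error. A minimal sketch of a fix is to keep the zero attached to the graph:

loss[class_index - 1] = 0.0 * jaccard_output.sum()  # zero-valued, but carries a grad_fn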

None activation for UNet output.

It looks like either sigmoid or softmax is applied to the output layer of the UNet architecture.

It would be nice to have an option to apply nothing.

The workaround is:

 def activation(x): return x

model = smp.Unet('resnet34', encoder_weights='imagenet', activation=activation)

But it would be nice to do it without this hack.
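For reference, current versions accept activation=None (raw logits) directly, so the hack above is no longer needed; a minimal sketch:

model = smp.Unet('resnet34', encoder_weights='imagenet', activation=None)  # returns logits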

example doesn't work

I wasn't able to get the example with the cars segmentation to run.
Changing the code in unet/decoder.py for the class DecoderBlock worked for me:

def forward(self, x):
    x, skip = x
    if skip is not None:
        x = F.interpolate(x, size=(skip.shape[-2], skip.shape[-1]), mode='nearest')
        x = torch.cat([x, skip], dim=1)
    else:
        x = F.interpolate(x, scale_factor=2, mode='nearest')
    x = self.block(x)
    return x

cannot import name 'cfg' from 'torchvision.models.vgg'

Hi, there is an error.

~/anaconda3/envs/dl/lib/python3.7/site-packages/segmentation_models_pytorch/encoders/vgg.py in <module>
      2 from torchvision.models.vgg import VGG
      3 from torchvision.models.vgg import make_layers
----> 4 from torchvision.models.vgg import cfg
ImportError: cannot import name 'cfg' from 'torchvision.models.vgg' (/Users/anaconda3/envs/dl/lib/python3.7/site-packages/torchvision/models/vgg.py)

my pytorch version is 1.2.0 and torchvision version is 0.4.0

How to use the weights after training?

It seems the model should not be initialized the same way as during training when testing. I tried commenting out the train_log in the training for-loop, expecting testing to be done without training, but the visualization results show 0/1 inference. I guess the initialization of the ENCODER weights may overwrite the trained weights somehow. So how do I use the trained weights in the right way?

sincerely!
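The usual PyTorch pattern applies here (a generic sketch, not SMP-specific; the file name and architecture arguments are illustrative):

import torch
import segmentation_models_pytorch as smp

# after training
torch.save(model.state_dict(), 'best_model.pth')

# at test time: rebuild the same architecture, then overwrite the freshly
# initialized weights with the trained ones
model = smp.Unet('resnet34', encoder_weights=None)
model.load_state_dict(torch.load('best_model.pth'))
model.eval()  # disable dropout, use running BatchNorm statistics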

TypeError: __init__() got an unexpected keyword argument 'groups'

I downloaded the new version of the library from source:
pip install git+https://github.com/qubvel/segmentation_models.pytorch
And now I have this problem:
TypeError: __init__() got an unexpected keyword argument 'groups'

Full error :

  File "segmentation_model.py", line 235, in <module>
    defect_crop=args.defect_crop)
  File "segmentation_model.py", line 166, in train
    model = get_model(model_name=model_name, encoder_name=encoder).to(device)
  File "segmentation_model.py", line 98, in get_model
    model = FPN(encoder_name=encoder_name, classes=4, activation='sigmoid', encoder_weights=encoder_weights)
  File "/home/andrii/.conda/envs/dev/lib/python3.7/site-packages/segmentation_models_pytorch/fpn/model.py", line 39, in __init__
    encoder_weights=encoder_weights
  File "/home/andrii/.conda/envs/dev/lib/python3.7/site-packages/segmentation_models_pytorch/encoders/__init__.py", line 24, in get_encoder
    encoder = Encoder(**encoders[name]['params'])
  File "/home/andrii/.conda/envs/dev/lib/python3.7/site-packages/segmentation_models_pytorch/encoders/resnet.py", line 10, in __init__
    super().__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'groups'

Any difference between sigmoid and softmax activation when using f_score and IOU metrics?

I am using UNet with a res50 encoder, working on a medical dataset with 7 classes. I noticed that my IOU score and f_score stay pretty low during the whole training stage. Maybe it's because there is class imbalance in my dataset and sometimes the background (which does not belong to any class) dominates an input image. Thanks for your work; I have 2 questions here. Should I consider pixels with no labels as a class? Because in pixel CE loss the target array should be 1 channel and all BG pixels should be labeled with 0 (a class number). Another question is what loss function I should use, considering my dataset is unique and the IOU and f_score stay low, and how does the selection of activation functions influence the calculation of IOU and f_score?
Thanks

Question about activation in example

In your example you use sigmoid activation at the end of the Unet, but at the same time all your losses and metrics are computed with one more sigmoid activation applied. Is it appropriate to use sigmoid activation twice?

[Request] EfficientNet as an encoder

First, I would like to thank you for making a great project.
I found that you added EfficientNet to your Keras project and I was wondering if you could add it to this PyTorch project too?
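For readers reaching this issue from the README above: the EfficientNet table shows this request was implemented; a minimal sketch:

import segmentation_models_pytorch as smp

model = smp.Unet('efficientnet-b0', encoder_weights='imagenet')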

Which dataset do you use in example.ipynb?

Hi, friend, I'm new to semantic segmentation, so I have to understand and test every step in your example code cars segmentation (camvid).ipynb.
Can you tell me the dataset's name, so I can download it and run this example conveniently?
Thanks for your help.

confused by loss function forward params

Hi qubvel, I am confused by the loss function forward params.

As shown in utils/functions.py, iou and f_score calculate the IOU loss and DICE loss with pr and gt as params. Does that mean pr (torch.Tensor) is a tensor with shape [batch, channel, width, height]? But then why is the annotation "A list of predicted elements"? I'm confused about what pr is: a tensor with shape [batch, channel, width, height], or a list with batch * channel * width * height elements. Which one?

Bug with `import segmentation_models_pytorch as smp`

 import segmentation_models_pytorch as smp                                                                                                                                                                  
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-b9e13fa886e0> in <module>
----> 1 import segmentation_models_pytorch as smp

~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/__init__.py in <module>
----> 1 from .unet import Unet
      2 from .linknet import Linknet
      3 from .fpn import FPN
      4 from .pspnet import PSPNet
      5 

~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/unet/__init__.py in <module>
----> 1 from .model import Unet

~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/unet/model.py in <module>
      1 from .decoder import UnetDecoder
      2 from ..base import EncoderDecoder
----> 3 from ..encoders import get_encoder
      4 
      5 

~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/encoders/__init__.py in <module>
      3 from .resnet import resnet_encoders
      4 from .dpn import dpn_encoders
----> 5 from .vgg import vgg_encoders
      6 from .senet import senet_encoders
      7 from .densenet import densenet_encoders

~/anaconda3/lib/python3.6/site-packages/segmentation_models_pytorch/encoders/vgg.py in <module>
      2 from torchvision.models.vgg import VGG
      3 from torchvision.models.vgg import make_layers
----> 4 from torchvision.models.vgg import cfg
      5 from pretrainedmodels.models.torchvision_models import pretrained_settings
      6 

ImportError: cannot import name 'cfg'

How to

Sorry, wrongly clicked; my issue will be stated elsewhere.

Use concatenation for feature pyramid aggregation?

Hi! Thanks to the repo owner for their contribution! This repository is useful and benefits lots of people!

I would like to discuss the implementation of FPN in this repo with the people watching it.
According to this document, I think page 25 suggests that we should use concatenation instead of summation, if I did not misunderstand the page.


    def forward(self, x):
        c5, c4, c3, c2, _ = x

        p5 = self.conv1(c5)
        p4 = self.p4([p5, c4])
        p3 = self.p3([p4, c3])
        p2 = self.p2([p3, c2])

        s5 = self.s5(p5)
        s4 = self.s4(p4)
        s3 = self.s3(p3)
        s2 = self.s2(p2)
       
        # use concatenation instead of summation?
        # x = s5 + s4 + s3 + s2
        x = torch.cat([s5, s4, s3, s2], dim=1)

        x = self.dropout(x)
        x = self.final_conv(x)

        x = F.interpolate(x, scale_factor=4, mode='bilinear', align_corners=True)
        return x

IOU metric sometimes bigger than 1

I have slightly modified the script from the car segmentation ipynb file for my own binary segmentation mask.

If I use the same training loop for resnet18 I get a very high IOU, even higher than 1, which is impossible.

Using evaluation gives:

test_epoch = ValidEpoch(
    model=best_model,
    loss=loss,
    metrics=metrics,
    device=DEVICE,
)
 40/40 [00:06<00:00,  7.23it/s, bce_dice_loss - -3.208e+03, iou - 1.163, f-score - 0.9803] 

But when I manually score on validation, the result is much, much lower:

scores = []
for i in range(40):
    n = i  # np.random.choice(len(valid_dataset))
    image_vis = imread(valid_dataset.images_fps[n])
    image, gt_mask = valid_dataset[n]
    image, gt_mask = image, gt_mask.transpose(1, 2, 0)
    gt_mask = gt_mask.squeeze()

    x_tensor = torch.from_numpy(image).to(DEVICE).unsqueeze(0)
    pr_mask = best_model.predict(x_tensor)
    pr_mask = (pr_mask.squeeze().cpu().numpy() > 0.5).astype(np.uint8)
    score = iou(torch.from_numpy(gt_mask).float().to(DEVICE), torch.from_numpy(pr_mask).float().to(DEVICE)).data.cpu().numpy().max()
    # print(score)
    scores.append(score)
# mean IOU comes out near 0.07

Then I tried to change the preprocessing to rescaling by dividing by 255 - the model barely learns (IOU near 0.0007) - with almost the same result if I measure it manually.

Using Keras I achieved an IOU of almost 0.5 with almost the same pipeline.
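A hedged observation rather than a confirmed answer: the negative bce_dice_loss in the log above (-3.208e+03) usually indicates ground-truth mask values outside [0, 1], e.g. masks stored as 0/255, which can also inflate the soft IOU past 1. Rescaling the masks once, in the dataset, is the usual fix:

mask = (mask / 255.0).astype('float32')  # only if masks are stored as 0/255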

confusion with the train and valid operations

Hello, thanks for sharing the code; it's rather convenient for dealing with segmentation tasks.
When I use it to train my model, I'm confused by the train and valid operations.
As the following shows,

train_logs = train_epoch.run(train_loader)
valid_logs = valid_epoch.run(valid_loader)

since train_epoch and valid_epoch are created from smp.utils.train.TrainEpoch and smp.utils.train.ValidEpoch separately; however, the valid_epoch instance can use the weights obtained by the train_epoch instance for validation. Maybe I'm missing some key point.

So how do they share the model weights with each other?

Warm regards.

confused by BCEDiceLoss

You are using dice + bce, but your dice is calculated as 1 - F.f_score. Should it be 1 - F.dice_coef?

error when using fscore

File "/home/lib/python3.7/site-packages/segmentation_models_pytorch/utils/functions.py", line 69, in f_score
tp = torch.sum(gt * pr) RuntimeError: expected backend CUDA and dtype Double but got backend CUDA and dtype Float

multiclass

If I have four classes,

class DiceLoss(nn.Module):
    __name__ = 'dice_loss'

    def __init__(self, eps=1e-7, activation='softmax2d'):
        super().__init__()
        self.activation = activation
        self.eps = eps

    def forward(self, y_pr, y_gt):
        return 1 - F.f_score(y_pr, y_gt, beta=1., eps=self.eps, threshold=None, activation=self.activation)

should I set activation='softmax2d' myself?

Feature request: preprocess_input as a dictionary

Now we have a way to get a preprocess_input function

In [2]: preprocess_input = get_preprocessing_fn('resnet18', pretrained='imagenet')                                                                                                                                 

In [3]: preprocess_input                                                                                                                                                                                           
Out[3]: functools.partial(<function preprocess_input at 0x7efc1b644400>, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], input_space='RGB', input_range=[0, 1])

I would like to be able, for a given encoder, to get a dictionary with the parameters: mean, std, input_space, input_range.
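Since get_preprocessing_fn returns a functools.partial (as shown in Out[3] above), these parameters are in fact already accessible as a dict through the standard keywords attribute; a minimal sketch:

params = preprocess_input.keywords
# {'mean': [0.485, 0.456, 0.406], 'std': [0.229, 0.224, 0.225],
#  'input_space': 'RGB', 'input_range': [0, 1]}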

[Ask for advice] Modify models by adding skip connections

Hi,

I need to modify models by adding skip connections between encoder layers and decoder layers like this

x = input  
x = self.layer0[-1](x)  
x = x + input  
x1 = self.layer1(x)  
x1 = x1 + x  
x2 = self.layer2(x1)  
x2 = x2 + x1  
x3 = self.layer3(x2)  
x3 = x3 + x2  
x4 = self.layer4(x3)  
x4 = x4 + x3  

I've tried implementing this and encountered the problem that the dimensions before and after passing through a layer are unequal, so the values can't be added.

Is it possible to implement this? and How to implement it?

Thank you.
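One generic way to make the shapes compatible (a sketch of the standard projection-shortcut idea from ResNet, not something specific to this library): project the skipped tensor with a 1x1 convolution whose stride and output channels match the wrapped layer:

import torch
import torch.nn as nn

class ResidualWrap(nn.Module):
    """Wrap a layer with an additive skip, projecting the input when shapes differ."""
    def __init__(self, layer, in_ch, out_ch, stride=1):
        super().__init__()
        self.layer = layer
        if in_ch != out_ch or stride != 1:
            # 1x1 conv matches both channel count and spatial stride of `layer`
            self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False)
        else:
            self.proj = nn.Identity()

    def forward(self, x):
        return self.layer(x) + self.proj(x)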

How to handle multispectral images (NOT RGB)?

I have a multispectral image dataset (each image has eight channels) and want to feed it into the pretrained unet. In this case, how do I modify the network so that the initialization of the first layer from the pretrained weights can be ignored? Otherwise there will be an error initializing the first layer due to the different input sizes of the two models. Thanks.
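For what it's worth, the Input channels section of the README above suggests this case is handled by the in_channels argument, which re-populates the first convolution from the pretrained RGB kernels instead of discarding them; a minimal sketch:

import segmentation_models_pytorch as smp

model = smp.Unet('resnet34', encoder_weights='imagenet', in_channels=8)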

How to transfer a standard pretrained resnet model to 4-channel input in your code?

In remote sensing, an image usually has more than three channels; for example NIR, R, G and B. I wanted to leverage the pretrained weights of a standard ResNet50 and transfer them to a 4-channel input version by copying the RGB weights and treating the NIR weight as equivalent to the red channel. I have solved this in my own code, but I have no idea how to do it in yours. Thank you!
Here is my solution in my code:
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(pretrained=True)
weight = model.conv1.weight.clone()
model.conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    model.conv1.weight[:, :3] = weight
    model.conv1.weight[:, 3] = model.conv1.weight[:, 0]

x = torch.randn(10, 4, 224, 224)
output = model(x)

pr_mask is not correct although the accuracy on CamVid test set is very high

I used the se_resnext50_32x4d encoder and Unet decoder to train the segmentation model, following the CamVid tutorial on the webpage.

However, the pr_mask is not correct although the accuracy on the CamVid test set is very high (iou - 0.7417).

valid: 100%|████████████████████| 233/233 [00:10<00:00, 21.86it/s, bce_dice_loss - 0.3587, iou - 0.7417, f-score - 0.8195]


The trained model and codes can be found here

https://drive.google.com/drive/folders/0B6X3_r_lRbVUODRiM2RhMWMtNDk4NC00NmM1LWEyODEtMTdjNDI5MWJiMmQ4

image shape

Hello, my image has 4 channels. What should I do to use this model?

How can I unfreeze the layers of a vgg16/vgg11 encoder used with a unet decoder?

How can I unfreeze the layers of vgg16? I saw your solution to this problem in the segmentation models keras repository, but not here for pytorch; layer.trainable doesn't work here. Any example, please? And how many layers of vgg16 can be unfrozen while training with a unet decoder for a segmentation task? Thanks a ton in advance.
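In PyTorch the counterpart of Keras' layer.trainable is the requires_grad flag on parameters. A generic sketch (model.encoder is the documented attribute for the backbone; how much to unfreeze is a judgment call):

# freeze the entire encoder first
for p in model.encoder.parameters():
    p.requires_grad = False

# then unfreeze, e.g., the deepest parameters of the encoder
for p in list(model.encoder.parameters())[-10:]:
    p.requires_grad = True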

invalid hash value

ENCODER = 'se_resnext50_32x4d'
ENCODER_WEIGHTS = 'imagenet'
DEVICE = 'cuda'
RuntimeError: invalid hash value (expected "a260b3a4", got "dc315dde03a64a11145b0aa4c61a29403a7b709376bcba910c851f0115d81a04")

No module named 'segmentation_models_pytorch.common.blocks'

Hi,

I'm working on an internet-restricted system. I've installed segmentation_models.pytorch from source using pip install ..
Now when I try to import it, I get the following error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-b9e13fa886e0> in <module>
----> 1 import segmentation_models_pytorch as smp

/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/__init__.py in <module>
----> 1 from .unet import Unet
      2 from .linknet import Linknet
      3 from .fpn import FPN
      4 from .pspnet import PSPNet
      5 

/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/unet/__init__.py in <module>
----> 1 from .model import Unet

/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/unet/model.py in <module>
----> 1 from .decoder import UnetDecoder
      2 from ..base import EncoderDecoder
      3 from ..encoders import get_encoder
      4 
      5 

/opt/conda/lib/python3.6/site-packages/segmentation_models_pytorch/unet/decoder.py in <module>
      3 import torch.nn.functional as F
      4 
----> 5 from ..common.blocks import Conv2dReLU
      6 from ..base.model import Model
      7 

ModuleNotFoundError: No module named 'segmentation_models_pytorch.common.blocks'


Any ideas how this error can be solved?

How to train PSPNET with single-channel input?

First of all, amazing work on both of your segmentation libraries; they saved me a lot of time.
I want to train PSPNET using single-channel grayscale images but am not able to figure out how to do it. In your keras documentation you have already covered this case. It would be really helpful if you could suggest the same here.
Thanks
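As with the other decoders, the in_channels argument described in the Input channels section above should cover this; a minimal sketch:

import segmentation_models_pytorch as smp

model = smp.PSPNet('resnet34', encoder_weights='imagenet', in_channels=1)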
