ternaus / ternausnet

UNet model with VGG11 encoder pre-trained on Kaggle Carvana dataset

Home Page: https://arxiv.org/abs/1801.05746

License: MIT License

Language: Python 100.00%
Topics: pytorch, image-segmentation

ternausnet's Introduction

TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation

By Vladimir Iglovikov and Alexey Shvets

Introduction

TernausNet is a modification of the celebrated UNet architecture that is widely used for binary image segmentation. For more details, please refer to our arXiv paper.

UNet11

[loss curve figure]

A pre-trained encoder speeds up convergence even on datasets with different semantic features. The curve above shows the validation Jaccard index (IoU) as a function of epochs for an aerial imagery dataset.

This architecture was part of the winning solution (1st out of 735 teams) in the Carvana Image Masking Challenge.

Installation

pip install ternausnet
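
A minimal usage sketch is shown below. The import path ternausnet.models and the UNet11 constructor arguments are assumptions based on the packaged repository layout (older versions expose the same classes in unet_models.py instead):

import torch
from ternausnet.models import UNet11  # assumed import path

model = UNet11(pretrained=True)  # VGG11 encoder initialized with ImageNet weights
model.eval()

with torch.no_grad():
    x = torch.rand(1, 3, 256, 256)  # dummy RGB batch; H and W divisible by 32
    logits = model(x)               # raw logits; apply torch.sigmoid for probabilities

print(logits.shape)  # expected: torch.Size([1, 1, 256, 256])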

Citing TernausNet

Please cite TernausNet in your publications if it helps your research:

@ARTICLE{arXiv:1801.05746,
         author = {V. Iglovikov and A. Shvets},
          title = {TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation},
        journal = {ArXiv e-prints},
         eprint = {1801.05746},
           year = 2018
        }

Example of the train and test pipeline

https://github.com/ternaus/robot-surgery-segmentation

ternausnet's People

Contributors

gothos-folly, pokidyshev, shvetsiya, ternaus


ternausnet's Issues

'from unet_models import unet11' reports error

Hello, authors,
Thanks for sharing your code and paper. I read 'Example.ipynb' and then tried to run the related code. When I type 'from unet_models import unet11', I get an error:
'File "unet_models.py", line 16
def init(self, in_: int, out: int):
SyntaxError: invalid syntax'

How can I fix it?
My PyTorch version is 0.1.12.

About the upsampling

Thank you for your effort.
Have you compared the results of bilinear interpolation and deconvolution when used in the decoder? Which one would be better?
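
For reference, the two decoder options being compared look roughly like the sketch below. The channel sizes are placeholders and this is not the repository's exact code:

import torch.nn as nn

# Option 1: deconvolution (transposed convolution) for learnable 2x upsampling
deconv_up = nn.Sequential(
    nn.Conv2d(256, 128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
    nn.ReLU(inplace=True),
)

# Option 2: fixed bilinear upsampling followed by a convolution
bilinear_up = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(256, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
)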

conv5 pooling

Hi!
I was wondering what the motivation was for performing pooling after the already downsampled layer here:

center = self.center(self.pool(conv5))

Did you test whether it improves the score compared to using one more downsampling conv layer, or compared to simply removing this bottom layer and keeping conv5 as the bottleneck?

Change VGG11_2D to VGG11_3D

I followed this code, and everything works for the 2D model.
Now I want to use it for a 3D model. I changed 2D to 3D, for example:
self.pool = nn.MaxPool2d(2,2) => self.pool = nn.MaxPool3d(2,2)
.......
However, at line 49 (see the screenshot), models.vgg11(pretrained=pretrained).features is a 2D model. I want to use VGG11 for 3D. How can I do that? Thank you.
[screenshot]

P.S.: I visualized VGG11; it is a 2D model.
[screenshot]
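
The pretrained weights here are inherently 2D, so they cannot be loaded into 3D layers directly. One workaround sometimes used (not part of this repository, and only a sketch) is to build the 3D blocks yourself and "inflate" each pretrained 2D kernel along the depth axis:

import torch
import torch.nn as nn
from torchvision import models

# Sketch: inflate the first pretrained 2D VGG11 conv kernel into a 3D kernel
# by repeating it along depth and rescaling. This illustrates the idea only;
# it is not functionality provided by ternausnet.
vgg2d = models.vgg11(pretrained=True).features
conv2d = vgg2d[0]                      # first 2D conv: 3 -> 64 channels, 3x3 kernel

depth = 3
conv3d = nn.Conv3d(3, 64, kernel_size=(depth, 3, 3), padding=(1, 1, 1))
with torch.no_grad():
    inflated = conv2d.weight.unsqueeze(2).repeat(1, 1, depth, 1, 1) / depth
    conv3d.weight.copy_(inflated)      # shape (64, 3, depth, 3, 3)
    conv3d.bias.copy_(conv2d.bias)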

training example

Hello,
Thank you for this upload and congrats on winning the Kaggle challenge. Is it possible for you to provide an example of how to train the network? I can't seem to put everything just right for it to run.

sigmoid output?

Your paper indicates a final sigmoid output layer, but the models in model.py do not have such a layer, and the outputs are not in the range [0, 1]. Do I just add a torch.sigmoid in the forward call? On a somewhat unrelated matter, your loss function doesn't seem to be available. I'm new to machine learning/PyTorch/computers and any help would be appreciated. Thank you.
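
One way to get outputs in [0, 1] without modifying the repository code is to apply torch.sigmoid on top of the raw logits, for example with a small wrapper module (a sketch, not part of the repository):

import torch
import torch.nn as nn

class SigmoidWrapper(nn.Module):
    # Wraps any single-logit segmentation model so its output is a probability map.
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        return torch.sigmoid(self.model(x))  # probabilities in [0, 1]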

About num_classes

Hello, @ternaus!
Thanks for your code, it's really cool, but meaning of num_classes, for instance, in Unet16, isn't clear.

If I have car and background then I consider that image has 2 num_classes: 0 - background and 1 - car. But according to your code, in my example, you suppose that image has only one class. Why?

Thanks in advance.
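
For binary segmentation the background usually does not get its own output channel: a single logit per pixel with a sigmoid already encodes car vs. background, which is why num_classes=1 covers this case. A small illustrative sketch (the tensors are dummies, not the repository's code):

import torch

logits = torch.randn(2, 1, 64, 64)                 # num_classes=1: one logit per pixel
prob_car = torch.sigmoid(logits)                   # P(car); P(background) = 1 - P(car)

multiclass_logits = torch.randn(2, 3, 64, 64)      # num_classes=3: one channel per class
pred = multiclass_logits.softmax(dim=1).argmax(1)  # per-pixel class index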

Clarification about loss?

As I can see in the paper, you use the joint loss function L = H - log(J). I wonder why the log is used? Is it because H and J have different possible ranges?
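
For context, here is a sketch of that joint loss, where H is binary cross-entropy and J is a soft Jaccard index computed from probabilities; taking -log(J) turns the bounded index in (0, 1] into an unbounded penalty on a scale comparable to cross-entropy. The function name and the jaccard_weight parameter are illustrative, not the repository's exact implementation:

import torch
import torch.nn.functional as F

def bce_log_jaccard_loss(logits, targets, jaccard_weight=1.0, eps=1e-15):
    # targets: float mask of the same shape as logits, values in {0, 1}
    h = F.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    intersection = (probs * targets).sum()
    union = probs.sum() + targets.sum() - intersection
    j = (intersection + eps) / (union + eps)   # soft Jaccard index
    return h - jaccard_weight * torch.log(j)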

How to make final prediction and tuning the model?

Thank you very much for developing this model!
I am quite new to image segmentation, so I am still learning. The question I put here might be very silly, and it is definitely not an issue with your code.

I am using your pretrained VGG11 model for the Kaggle Airbus competition. The output is binary. The first problem is that during training the loss continued to decrease, but the Jaccard score did not change at all.

Epoch 1, lr 0.01:   0%|          | 0/57424 [00:00<?, ?it/s]
0.01
Epoch 1, lr 0.01: 100%|█████████▉| 57422/57424 [1:49:41<00:00,  8.33it/s, loss=0.00074]
Epoch 2, lr 0.01:   0%|          | 0/57424 [00:00<?, ?it/s]
Valid loss: 0.00063, jaccard: 0.37004
0.01
Epoch 2, lr 0.01: 100%|█████████▉| 57422/57424 [1:50:11<00:00,  8.14it/s, loss=0.00350]
Epoch 3, lr 0.01:   0%|          | 0/57424 [00:00<?, ?it/s]
Valid loss: 0.63987, jaccard: 0.37004
0.01
Epoch 3, lr 0.01: 100%|█████████▉| 57422/57424 [1:49:52<00:00,  8.04it/s, loss=0.00102]
Epoch 4, lr 0.01:   0%|          | 0/57424 [00:00<?, ?it/s]
Valid loss: 0.00081, jaccard: 0.37004
0.01
Epoch 4, lr 0.01: 100%|█████████▉| 57422/57424 [1:49:59<00:00,  8.02it/s, loss=0.00036]
Epoch 5, lr 0.01:   0%|          | 0/57424 [00:00<?, ?it/s]
Valid loss: 0.00043, jaccard: 0.37004
0.01
Epoch 5, lr 0.01: 100%|█████████▉| 57422/57424 [1:49:53<00:00,  8.01it/s, loss=0.00035]
Epoch 6, lr 0.001:   0%|          | 0/57424 [00:00<?, ?it/s]
Valid loss: 0.00039, jaccard: 0.37004
0.001
Epoch 6, lr 0.001: 100%|█████████▉| 57422/57424 [1:49:33<00:00,  7.97it/s, loss=0.00038]
Epoch 7, lr 0.001:   0%|          | 0/57424 [00:00<?, ?it/s]
Valid loss: 0.00039, jaccard: 0.37004
0.001
Epoch 7, lr 0.001: 100%|█████████▉| 57422/57424 [1:49:32<00:00,  8.26it/s, loss=0.00051]
Epoch 8, lr 0.001:   0%|          | 0/57424 [00:00<?, ?it/s]
Valid loss: 0.00039, jaccard: 0.37004
0.001
Epoch 8, lr 0.001: 100%|█████████▉| 57422/57424 [1:49:33<00:00,  8.15it/s, loss=0.00085]
Epoch 9, lr 0.001:   0%|          | 0/57424 [00:00<?, ?it/s]
Valid loss: 0.00039, jaccard: 0.37004
0.001
Epoch 9, lr 0.001:  92%|█████████▏| 52952/57424 [1:40:36<08:59,  8.29it/s, loss=0.00052]

My next question is how to make the final prediction. I checked your paper: you state that after applying a sigmoid function to the output, you pick a 0.3 threshold. So, for my own problem, I just do the same, correct? I also tried this with my output using different values; below is an example output I got with a 0.509 threshold. It is clearly detecting something, but the predicted ship area is not very continuous, unlike the one in your paper. Do you know why, or how to deal with it? How can I better select a threshold?
[example prediction image]

Any suggestion for my next step?

Thank you!
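
One common way to pick the threshold, sketched below, is to sweep candidate values on a held-out validation set and keep the one that maximizes the mean Jaccard index. The helper names and the candidate grid are illustrative, not part of this repository:

import numpy as np

def jaccard(pred_mask, true_mask, eps=1e-15):
    # Intersection over union for binary masks.
    intersection = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return (intersection + eps) / (union + eps)

def best_threshold(probs, true_masks, candidates=np.arange(0.1, 0.9, 0.05)):
    # probs: list of per-image sigmoid probability maps (validation set)
    # true_masks: corresponding ground-truth binary masks
    scores = [np.mean([jaccard(p > t, m) for p, m in zip(probs, true_masks)])
              for t in candidates]
    return candidates[int(np.argmax(scores))]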

train test

Hello, can you provide your train and test files?
