
bisenet's Introduction

BiSeNet

BiSeNet implementation based on PyTorch 0.4.1 and Python 3.6.

Dataset

Download the CamVid dataset from Google Drive or Baidu Yun (code: 6xw4).

Pretrained model

Download best_dice_loss_miou_0.655.pth from Google Drive or Baidu Yun (code: 6y3e) and put it in ./checkpoints.

Demo

python demo.py

Result

Original / GT / Predict (result images not shown here).

Train

python train.py

Use TensorBoard to monitor the loss and accuracy in real time.
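A typical invocation, assuming the logs go to the default ./runs directory (check the SummaryWriter path in train.py; the directory name here is an assumption):

tensorboard --logdir runs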

TensorBoard plots (not shown here): loss on train, pixel precision on val, mIoU on val.

Test

python test.py

Result

class  Bicyclist  Building  Car   Pole  Fence  Pedestrian  Road  Sidewalk  SignSymbol  Sky   Tree  mIoU
iou    0.61       0.80      0.86  0.35  0.37   0.59        0.88  0.81      0.28        0.91  0.73  0.655

This time I trained the model with dice loss and got a better result than with cross-entropy loss. I did not use any special training strategies; you can get a much better result than this repo by using task-specific strategies.
This repo is mainly for demonstrating the effectiveness of the model.
I also tried some simplified versions of BiSeNet, but they do not seem to perform very well on the CamVid dataset.

Speed

Method     640×320  1280×720  1920×1080
Paper      129.4    47.9      23
This Repo  126.8    53.7      23.6

This shows the speed comparison between the paper and my implementation (a rough measurement sketch follows the notes below).

  1. The numbers in the first row are input image resolutions.
  2. The numbers in the second and third rows are FPS.
  3. The results are based on ResNet-18.
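For reference, a rough sketch of how FPS numbers like those above can be measured; this is not the script used for the table, and it assumes a CUDA GPU, an already built `model`, and a 1280×720 input:

import time
import torch

# Hypothetical timing sketch: synchronize around a batch of forward passes
# and report frames per second for a single 1280x720 image.
model.eval().cuda()
x = torch.randn(1, 3, 720, 1280).cuda()
with torch.no_grad():
    for _ in range(10):              # warm-up iterations
        model(x)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(100):
        model(x)
    torch.cuda.synchronize()
print('FPS: %.1f' % (100 / (time.time() - start)))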

Future work

  • Finish real-time segmentation with a camera or a pre-loaded video

Reference

bisenet's People

Contributors

ooooverflow, seanxyz


bisenet's Issues

The performance on CityScape dataset

1. Have you trained your model on the Cityscapes dataset? When I change the dataset from CamVid to Cityscapes and run train.py, I get an error like this:
epoch 0, lr 0.001000:   0%|          | 0/744 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 206, in <module>
    main(params)
  File "train.py", line 187, in main
    train(args, model, optimizer, dataloader_train, dataloader_val, csv_path)
  File "train.py", line 86, in train
    for i, (data, label) in enumerate(dataloader_train):
  File "/home/sunjingchen/miniconda3/envs/pytorch041/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 336, in __next__
    return self._process_next_batch(batch)
  File "/home/sunjingchen/miniconda3/envs/pytorch041/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
  File "/home/sunjingchen/miniconda3/envs/pytorch041/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/sunjingchen/miniconda3/envs/pytorch041/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/sunjingchen/BiSeNet-City/dataset/CamVid.py", line 81, in __getitem__
    label = one_hot_it(label, self.label_info).astype(np.uint8)
  File "/home/sunjingchen/BiSeNet-City/utils.py", line 44, in one_hot_it
    equality = np.equal(label, color)
ValueError: operands could not be broadcast together with shapes (640,640,4) (3,)

Could you tell me the reason for this error? I think it may be because the label format of the CamVid dataset is different from that of Cityscapes.
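The shape (640, 640, 4) in the error suggests the Cityscapes color labels are being read with an alpha channel, while the class colors in class_dict.csv have only 3 values. A minimal sketch of one possible fix, assuming the dataset class loads labels with PIL (the variable label_path is illustrative):

import numpy as np
from PIL import Image

# Drop any alpha channel so the label is (H, W, 3) and can be compared
# against the 3-element class colors in one_hot_it().
label = Image.open(label_path).convert('RGB')
label = np.array(label)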

2. Have you tested the speed of BiSeNet on CamVid? Is it close to 105 FPS?

Thank you very much!

error about pretrained model

When I use the model epoch_295.pth from Google Drive, put it in ./checkpoints, and run eval.py, I get an error like this:

load model from /root/data/Seg_Pytorch/BiSeNet-master/checkpoints/epoch_295.pth ...
Traceback (most recent call last):
  File "eval.py", line 97, in <module>
    main(params)
  File "eval.py", line 82, in main
    model.module.load_state_dict(torch.load(args.checkpoint_path))
  File "/root/miniconda3/envs/pytorch10/lib/python3.6/site-packages/torch/nn/modules/module.py", line 721, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BiSeNet:
    Missing key(s) in state_dict: "supervision1.weight", "supervision1.bias", "supervision2.weight", "supervision2.bias".
    Unexpected key(s) in state_dict: "saptial_path.convblock1.bn.num_batches_tracked", "saptial_path.convblock2.bn.num_batches_tracked", "saptial_path.convblock3.bn.num_batches_tracked", "context_path.features.bn1.num_batches_tracked", "context_path.features.layer1.0.bn1.num_batches_tracked", "context_path.features.layer1.0.bn2.num_batches_tracked", "context_path.features.layer1.0.bn3.num_batches_tracked", "context_path.features.layer1.0.downsample.1.num_batches_tracked",

You said the network structure has been modified, so the pretrained model does not match the current model. Do you have any other pretrained model to recommend for this network?
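One possible partial workaround, if you only want to load the weights that do match, is to pass strict=False to load_state_dict; note that the missing supervision1/supervision2 layers then keep their random initialization, so results are not guaranteed to match the released numbers. A hedged sketch, assuming the model is wrapped in DataParallel as in eval.py:

import torch

# Load only the matching keys; missing keys (supervision1/2) stay randomly
# initialized and unexpected keys (num_batches_tracked) are ignored.
state_dict = torch.load(args.checkpoint_path)
model.module.load_state_dict(state_dict, strict=False)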

On COCO dataset

Do you have any pretrained weights for the COCO dataset? If not, do you have a train.py or script to load the COCO dataset? Appreciate any help.

Poor performance

Hey, I cloned the repo, downloaded CamVid, changed the paths, and trained for 300 epochs, but got poor performance: 0.182, not the 94.1 or 93.2 reported in README.md. Any suggestions?

Here is the val curve (image not shown).

The validation step during training is very slow

During training I found that the validation step is very slow, even slower than one training epoch. Is there any way to optimize it?
Because of the error "TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.", I changed lines 32 and 38 of train.py to predict = predict.data.cpu().numpy() and label = label.data.cpu().numpy(). Does this have any impact?

Right Dice Loss?

class DiceLoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.epsilon = 1e-5

    def forward(self, output, target):
        # print(output.shape)
        # print(target.shape)

        assert output.size() == target.size(), "'input' and 'target' must have the same shape"
        # softmax over the class dimension
        output = F.softmax(output, dim=1)
        # flatten the tensors
        output = flatten(output)  # [num_classes, B*H*W]
        target = flatten(target)  # [num_classes, B*H*W]
        # intersect = (output * target).sum(-1).sum() + self.epsilon
        # denominator = ((output + target).sum(-1)).sum() + self.epsilon

        intersect = (output * target).sum(-1)
        denominator = (output + target).sum(-1)
        # dice -- in (0, 0.5)
        dice = intersect / denominator
        dice = torch.mean(dice)
        # 1 - dice is in (0.5, 1)???
        return 1 - dice
        # return 1 - 2. * intersect / denominator

Shouldn't the intersection be doubled (2 × intersection over the denominator)?
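For reference, the conventional Dice coefficient does double the intersection, so the loss spans (0, 1) rather than (0.5, 1). A minimal sketch of that variant under the same flattened [num_classes, B*H*W] layout (illustrative, not the repo's code):

# Dice = 2 * |A ∩ B| / (|A| + |B|), computed per class and then averaged.
intersect = (output * target).sum(-1)
denominator = (output + target).sum(-1)
dice = (2. * intersect + self.epsilon) / (denominator + self.epsilon)
return 1 - torch.mean(dice)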

error while creating model

I am trying to run this code, but I get the following error:

 cx = torch.cat((cx1, cx2), dim=1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 128 and 64 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:71

Any idea why this happens?

Error: FileNotFoundError: [Errno 2] No such file or directory: '/path/to/ckpt/'

BiSeNet$ python demo.py

Traceback (most recent call last):
  File "demo.py", line 81, in <module>
    main(params)
  File "demo.py", line 60, in main
    model.module.load_state_dict(torch.load(args.checkpoint_path))
  File "/home/keshi/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 381, in load
    f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/path/to/ckpt/'

(base) keshi@keshi-Blade:~/BiSeNet$ python train.py
Traceback (most recent call last):
  File "train.py", line 209, in <module>
    main(params)
  File "train.py", line 148, in main
    loss=args.loss, mode='train')
  File "/home/keshi/BiSeNet/dataset/CamVid.py", line 43, in __init__
    self.label_info = get_label_info(csv_path)
  File "/home/keshi/BiSeNet/utils.py", line 31, in get_label_info
    ann = pd.read_csv(csv_path)
  File "/home/keshi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 685, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/keshi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 457, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/home/keshi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "/home/keshi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1135, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/keshi/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1917, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 689, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/path/to/CamVid/class_dict.csv' does not exist: b'/path/to/CamVid/class_dict.csv'

Could you tell me why this error happens? How can I solve this problem?

train wrong

When I train on my own data, the following happens:

os@os-l3:/disk3t-2/zym/BiSeNet-PyTorch$ python train.py
epoch 0, lr 0.001000:   0%|          | 0/4963 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 157, in <module>
    main(params)
  File "train.py", line 141, in main
    train(args, model, optimizer, dataloader_train, dataloader_val, csv_path)
  File "train.py", line 56, in train
    output = model(data)
  File "/home/os/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/os/.local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/os/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/disk3t-2/zym/BiSeNet-PyTorch/model/build_BiSeNet.py", line 97, in forward
    cx1 = self.attention_refinement_module1(cx1)
  File "/home/os/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/disk3t-2/zym/BiSeNet-PyTorch/model/build_BiSeNet.py", line 40, in forward
    assert self.in_channels == x.size(1), 'in_channels and out_channels should all be {}'.format(x.size(1))
AssertionError: in_channels and out_channels should all be 256
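The assertion says the incoming feature map has 256 channels, which matches resnet18 (context-path features of 256 and 512 channels) rather than resnet101 (1024 and 2048). A hypothetical sketch of a channel configuration consistent with that message; the AttentionRefinementModule constructor signature is an assumption, not the repo's exact code:

# Hypothetical: pick ARM channel sizes to match the chosen context path.
if context_path == 'resnet18':
    attention_refinement_module1 = AttentionRefinementModule(in_channels=256, out_channels=256)
    attention_refinement_module2 = AttentionRefinementModule(in_channels=512, out_channels=512)
else:  # resnet101
    attention_refinement_module1 = AttentionRefinementModule(in_channels=1024, out_channels=1024)
    attention_refinement_module2 = AttentionRefinementModule(in_channels=2048, out_channels=2048)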

Some suggestion about your code

Firstly, you should state the versions of PyTorch and Python you use, which would be appreciated.
Secondly, what is the speed of your model? Is it close to 105 FPS?

I think there is a trouble code

Please see this function from utils.py:
def one_hot_it_v11_dice(label, label_info):
    semantic_map = []
    void = np.zeros(label.shape[:2])
    for index, info in enumerate(label_info):
        color = label_info[info][:3]
        class_11 = label_info[info][3]
        if class_11 == 1:
            equality = np.equal(label, color)
            class_map = np.all(equality, axis=-1)
            semantic_map.append(class_map)
        else:
            equality = np.equal(label, color)
            class_map = np.all(equality, axis=-1)
            void[class_map] = 1
    semantic_map.append(void)
    semantic_map = np.stack(semantic_map, axis=-1).astype(np.float)
    return semantic_map
The variable "semantci_map" is a python list, but in the function ,the list only have twice append operations so that in compute loss pahse the error of output and target have different shape happend, because output shape is [batch, num class,w,h], but target shape is [batch ,2,w,h].
I can't guarantee I'm absolutely right.

This Code only supports pytorch==0.4.1

First I met the same error as #1 using pytorch==0.4.0, and I met a different error using pytorch==1.0.0.
When I changed the version to 0.4.1, it was solved.
But when I started training, it worked well with resnet101; when I changed build_contextpath to "resnet18", I met the same error as #3.

I think this version is not very stable.

When I first trained, the training metric (mIoU) was normal,
but the second time, the mIoU stayed fixed at around 0.12. I thought about it for a long time but couldn't solve the problem.
So I want to ask whether you have encountered this problem.

How to run demo on video?

Hey, since this is real-time semantic segmentation, I wonder how I could run the demo on a video. Is there any example code or tutorial?
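A rough sketch of a video loop with OpenCV; `preprocess` and `colorize` stand in for the resize/normalize and color-mapping steps used in demo.py and are hypothetical names, not functions from this repo:

import cv2
import torch

cap = cv2.VideoCapture('input.mp4')          # or 0 for a webcam
model.eval()
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    with torch.no_grad():
        x = preprocess(frame)                # resize/normalize like demo.py
        pred = model(x).argmax(dim=1)        # per-pixel class indices
    overlay = colorize(pred)                 # map class indices back to colors
    cv2.imshow('segmentation', overlay)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()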

Wrong label for Seq05VD_f02610_L.png

Hi, I converted the labels from RGB mode to 'P' mode based on class_dict.csv and found some pixels with a wrong label, i.e. their color is outside the colors defined in class_dict.csv. Here is how I convert RGB mode to 'P' mode and check the labels:

import numpy as np
from PIL import Image
from utils import get_label_info

labelinfo = get_label_info(csvfile)  # csvfile: path to class_dict.csv
palette = []
for key in labelinfo:
    for item in labelinfo[key]:
        palette.append(item)

ref_image = Image.new(mode='P', size=(1, 1))
ref_image.putpalette(palette)

img_rgb = Image.open('rgb_mode.png')
img_p = img_rgb.quantize(palette=ref_image)

# palette indices above 31 fall outside the classes defined in class_dict.csv
img_p_arr = np.array(img_p)
img_rgb_arr = np.array(img_rgb)
img_p_wrong = img_p_arr[img_p_arr > 31]
img_rgb_wrong = img_rgb_arr[img_p_arr > 31]

I'm not sure how to deal with this image; currently I simply remove it from my training/val/test sets. After converting RGB mode to 'P' mode, I don't need one-hot encoding for the labels anymore; it seems the one-hot encoding & decoding slow down training.

Also, the code where you calculate accuracy seems wrong (https://github.com/ooooverflow/BiSeNet/blob/master/utils.py#L103): pred and label have shape 1xCxHxW, where C is the number of channels (3 in this case), so a match of pred[:, :, h, w] == label[:, :, h, w] should count as one correctly predicted pixel, not three.
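For comparison, a minimal sketch of per-pixel accuracy computed on class-index maps (shape H×W) instead of per-channel RGB comparisons; this is illustrative, not the repo's utils.py:

import numpy as np

def pixel_accuracy(pred, label):
    # pred and label are arrays of class indices with identical shape (H, W)
    pred = np.asarray(pred)
    label = np.asarray(label)
    return (pred == label).sum() / label.size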

Start counting epochs from this number wrong

train.py does not implement the logic that actually uses epoch_start_i, so referencing this parameter causes an error.
So for now, if you want to resume training from a previous epoch, you can only do it through a pre-trained model.
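A hypothetical sketch of what resuming could look like; train.py does not currently implement this, and args.pretrained_model_path as well as the helper name are illustrative:

# Hypothetical resume logic: load the last checkpoint, then start the
# epoch loop at epoch_start_i instead of 0.
if args.pretrained_model_path is not None:
    model.module.load_state_dict(torch.load(args.pretrained_model_path))

for epoch in range(args.epoch_start_i, args.num_epochs):
    train_one_epoch(model, optimizer, dataloader_train)  # illustrative helper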

Train/val accuracy is not as high as mentioned, and the resnet101 accuracy curve is not stable

I trained with context_path resnet101 and resnet18.
First question: for both, the validation accuracy hardly reaches 0.9; it mostly stops at 0.88-0.89.
Second question: while training resnet101, the validation accuracy fluctuates a lot after epoch 100 and drops a lot after roughly epoch 180.
Could you please share your training parameters, such as lr, batch size, number of GPUs, crop_height, crop_width, and any detailed tricks?

loss function

The loss function in the original paper is composed of two parts, but you only use the output of the feature fusion module to calculate the loss. Also, the loss they use is cross entropy, while here you use binary cross entropy. Is there any reason for these changes? Thanks!

error

Traceback (most recent call last):
  File "train.py", line 209, in <module>
    main(params)
  File "train.py", line 148, in main
    loss=args.loss, mode='train')
  File "/home/lr/代码/BiSeNet-master/dataset/CamVid.py", line 43, in __init__
    self.label_info = get_label_info(csv_path)
  File "/home/lr/代码/BiSeNet-master/utils.py", line 31, in get_label_info
    ann = pd.read_csv(csv_path)
  File "/home/lr/anaconda3/envs/gpu/lib/python3.7/site-packages/pandas/io/parsers.py", line 685, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/lr/anaconda3/envs/gpu/lib/python3.7/site-packages/pandas/io/parsers.py", line 457, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/home/lr/anaconda3/envs/gpu/lib/python3.7/site-packages/pandas/io/parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "/home/lr/anaconda3/envs/gpu/lib/python3.7/site-packages/pandas/io/parsers.py", line 1135, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/lr/anaconda3/envs/gpu/lib/python3.7/site-packages/pandas/io/parsers.py", line 1906, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 380, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 687, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/path/to/CamVid/class_dict.csv' does not exist: b'/path/to/CamVid/class_dict.csv'
Hi, I have a question: is this error caused by a missing dataset?

On the problem of Spatial Path

In Section 3.1 (Spatial Path), there is the sentence: "Based on this observation, we propose a Spatial Path to preserve the spatial size of the original input image and encode affluent spatial information." But in the same section, this path outputs feature maps that are 1/8 of the original image. What does it mean to preserve the spatial size of the original input image? Why does the paper say this?
I'm looking forward to your reply; I'm very confused about this question. Thank you very much!

FileNotFoundError: [Errno 2] No such file or directory: '/PI_Blackfriars_Sys_1_4/Room_34_SetTempHeat.csv'


FileNotFoundError                         Traceback (most recent call last)
<ipython-input> in <module>
      5 #df
      6 for site_name in df['SiteName'].unique():
----> 7     df[df['SiteName'] == site_name].to_csv('{}.csv'.format(site_name))

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, tupleize_cols, date_format, doublequote, escapechar, decimal)
   3018                              doublequote=doublequote,
   3019                              escapechar=escapechar, decimal=decimal)
-> 3020         formatter.save()
   3021
   3022         if path_or_buf is None:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\formats\csvs.py in save(self)
    155             f, handles = _get_handle(self.path_or_buf, self.mode,
    156                                      encoding=self.encoding,
--> 157                                      compression=self.compression)
    158             close = True
    159

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\common.py in _get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text)
    422     elif encoding:
    423         # Python 3 and encoding
--> 424         f = open(path_or_buf, mode, encoding=encoding, newline="")
    425     elif is_text:
    426         # Python 3 and no explicit encoding

FileNotFoundError: [Errno 2] No such file or directory: '/PI_Blackfriars_Sys_1_4/Room_34_SetTempHeat.csv'

Can you please tell me why this is happening and how to solve this problem?

Please add mIoU calculation

We all know that mIoU is the main metric in semantic segmentation.
To compare with other models easily, please add it. Thank you.
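A minimal, self-contained sketch of mIoU via a confusion matrix (illustrative; these function names are not from this repo):

import numpy as np

def update_confusion(conf, pred, label, num_classes):
    # pred and label are flat arrays of class indices for one image
    mask = (label >= 0) & (label < num_classes)
    idx = num_classes * label[mask].astype(int) + pred[mask].astype(int)
    conf += np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)
    return conf

def mean_iou(conf):
    # rows are ground truth, columns are predictions
    tp = np.diag(conf)
    fp = conf.sum(axis=0) - tp
    fn = conf.sum(axis=1) - tp
    iou = tp / np.maximum(tp + fp + fn, 1)
    return iou.mean()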

AttributeError: 'BiSeNet' object has no attribute 'module'

load model from ./checkpoints/epoch_295.pth ...
Traceback (most recent call last):
  File "c:\Users\Administrator\Desktop\BiSeNet-master\BiSeNet-master\demo.py", line 80, in <module>
    main(params)
  File "c:\Users\Administrator\Desktop\BiSeNet-master\BiSeNet-master\demo.py", line 60, in main
    model.module.load_state_dict(torch.load(args.checkpoint_path))
  File "C:\Program Files\Python\lib\site-packages\torch\nn\modules\module.py", line 518, in __getattr__
    type(self).__name__, name))
AttributeError: 'BiSeNet' object has no attribute 'module'
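The error occurs because the model was not wrapped in nn.DataParallel on this machine, so there is no .module attribute. A hedged workaround sketch (not the repo's code):

import torch

# Load into .module only when the model is wrapped in DataParallel.
state_dict = torch.load(args.checkpoint_path, map_location='cpu')
if hasattr(model, 'module'):
    model.module.load_state_dict(state_dict)
else:
    model.load_state_dict(state_dict)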
Where do I put the dataset?
