fast-semantic-segmentation's Introduction

Real-Time Semantic Segmentation in TensorFlow

Perform pixel-wise semantic segmentation on high-resolution images in real time with the Image Cascade Network (ICNet), a highly optimized version of the state-of-the-art Pyramid Scene Parsing Network (PSPNet). This project implements ICNet and PSPNet50 in TensorFlow with training support for Cityscapes.

Download pre-trained ICNet and PSPNet50 models here

Deploy ICNet and perform inference at over 30 fps on an NVIDIA Titan Xp.

This implementation is based on the original ICNet paper by Hengshuang Zhao et al., ICNet for Real-Time Semantic Segmentation on High-Resolution Images. Some ideas were also taken from their previous PSPNet paper, Pyramid Scene Parsing Network. The network compression implemented here is based on the paper Pruning Filters for Efficient ConvNets.

Release information

October 14, 2018

An ICNet model trained in August 2018 has been released as a pre-trained model in the Model Zoo. All models were trained without coarse labels and are evaluated on the validation set.

September 22, 2018

The baseline PSPNet50 pre-trained model files have been released publicly in the Model Zoo. The accuracy of the model surpasses that referenced in the ICNet paper.

August 12, 2018

Initial release. The project includes scripts for training ICNet from ResNet50 weights, as well as for evaluating and compressing ICNet. It also includes scripts for training and evaluating PSPNet as a baseline.

Documentation

Model Depot Inference Tutorials

Overview

Figure: the ICNet model graph visualized in TensorBoard.

Training ICNet from Classification Weights

This project implements the ICNet training process, allowing you to train your own model directly from ResNet50 weights, as is done in the original work. Other available implementations simply convert the Caffe model to TensorFlow, only allowing fine-tuning from weights trained on Cityscapes.

By training ICNet from weights initialized on ImageNet, you have more flexibility in the transfer learning process. More about setting up this process can be found here. For training ICNet, follow the guide here.
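A rough, standalone sketch of this kind of initialization with TF-Slim (not the project's trainer.py; the checkpoint path and the variable selection below are assumptions) looks like this:

# Sketch only: initialize a segmentation graph from ImageNet ResNet-50
# classification weights with TF-Slim, skipping variables (e.g. the new
# segmentation heads) that do not exist in the checkpoint.
import tensorflow as tf
slim = tf.contrib.slim

# ... build the segmentation model and its losses here ...

variables_to_restore = slim.get_variables_to_restore(exclude=['global_step'])
init_fn = slim.assign_from_checkpoint_fn(
    'tmp/resnet_v1_50.ckpt',      # assumed path to the classification checkpoint
    variables_to_restore,
    ignore_missing_vars=True)     # variables absent from the checkpoint stay randomly initialized

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    init_fn(sess)                 # load the backbone weights
    # ... run the training loop ...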

ICNet Network Compression

In order to achieve real-time speeds, ICNet uses a form of network compression called filter pruning. This drastically reduces the complexity of the model by removing filters from convolutional layers in the network. This project also implements the ICNet compression process directly in TensorFlow.

The compression works; however, exactly which "compression scheme" to use is somewhat ambiguous in the original ICNet paper. This is still a work in progress.
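As a rough illustration of the idea behind that paper (generic NumPy, not this repository's filter_pruner.py), the filters of a convolution kernel can be ranked by their L1 norm and only the strongest fraction kept:

# Illustrative sketch of L1-norm filter ranking from
# "Pruning Filters for Efficient ConvNets"; not this repository's pruner.
import numpy as np

def prune_conv_filters(kernel, keep_fraction=0.5):
    """kernel: conv weights with shape [height, width, in_channels, out_channels]."""
    # L1 norm of each output filter.
    l1_norms = np.abs(kernel).reshape(-1, kernel.shape[-1]).sum(axis=0)
    num_keep = max(1, int(round(keep_fraction * kernel.shape[-1])))
    keep_idx = np.sort(np.argsort(l1_norms)[-num_keep:])
    # Downstream layers must drop the matching input channels so shapes stay consistent.
    return kernel[..., keep_idx], keep_idx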

PSPNet Baseline Implementation

To reproduce the baselines used in the original ICNet paper, you will also find implementations and pre-trained models for PSPNet50. Since ICNet can be thought of as a modified PSPNet, the baseline is useful for comparison purposes.

Information on training or using the baseline PSPNet50 model can be found here.

Maintainers

If you found the project, documentation, or the provided pre-trained models useful in your work, consider citing it with:

@misc{fastsemseg2018,
  author={Andrienko, Oles},
  title={Fast Semantic Segmentation},
  howpublished={\url{https://github.com/oandrienko/fast-semantic-segmentation}},
  year={2018}
}

Related Work

This project and some of its documentation were based on the TensorFlow Object Detection API, which was the initial inspiration for this project. The third_party directory of this project contains files from OpenAI's Gradient Checkpointing project by Tim Salimans and Yaroslav Bulatov. The helper modules found in third_party/model_deploy.py are from the TensorFlow Slim project. Finally, another open source ICNet implementation, which converts the original Caffe network weights to TensorFlow, was used as a reference. Find all these projects below:

Thanks

  • This project could not have happened without the advice (and GPU access) given by Professor Steven Waslander and Ali Harakeh from the Waterloo Autonomous Vehicles Lab (now the Toronto Robotics and Artificial Intelligence Lab).


fast-semantic-segmentation's Issues

Pre-trained model for evaluation?

Dear oandrienko,

Thanks for your code.

  1. Could you upload your trained model for evaluation?
  2. In inference.py, line 100, output_channels is not defined in the code. If output_channels is set to 1, the result shown is a grayscale image. If output_channels != 1, line 102, im = Image.fromarray(predictions), raises TypeError("Cannot handle this data type").

Thanks
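One common way to turn a single-channel class-index map into a viewable color image with Pillow is to save it as a paletted image. A minimal sketch (the predictions array and per-class colors below are placeholders, not variables from inference.py):

# Sketch: save an (H, W) uint8 class-index map as a color PNG with a palette.
import numpy as np
from PIL import Image

predictions = np.zeros((1024, 2048), dtype=np.uint8)           # per-pixel class ids
label_colors = [(128, 64, 128), (244, 35, 232), (70, 70, 70)]  # one RGB triple per class

im = Image.fromarray(predictions, mode='P')                    # pixel values index the palette
flat_palette = [channel for rgb in label_colors for channel in rgb]
im.putpalette(flat_palette + [0] * (768 - len(flat_palette)))  # pad palette to 256 entries
im.save('prediction_colored.png')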

ImportError

Thanks for your sharing.

ImportError: cannot import name hyperparams_pb2

It could not find such a file inside the protos folder, so what should I do?
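Like the TensorFlow Object Detection API that this project is modeled on, the *_pb2 modules are generated from the .proto files rather than checked into the repository. Assuming protoc is installed, compiling them from the repository root along these lines usually resolves this kind of import error:

protoc protos/*.proto --python_out=.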

Train on 2x8GB GPU Like 1080

Hey guys,
I don't have access to a Titan Xp, but I do have access to 1080 GPUs on the same machine. Is there any way I could train the model?

Segmentation fault (core dumped) while training

I was training both PSPNet and ICNet using the method provided in the README, but a Segmentation fault (core dumped) occurred for both networks during training, with no other logs. Could anybody please tell me how to train the network?

Exported frozen graph of PSPNet50

Hi, thank you for your work. I'm trying to load your pre-trained weights into another graph as initialization. I found that in the frozen graph there is no gamma or beta variable under BatchNorm, only one constant named Conv2D_bn_offset. Is this because the convolution and the batch normalization are merged in the frozen graph? Thank you in advance.
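That matches how graph freezing typically folds batch norm into the preceding convolution: gamma, beta, and the moving statistics are baked into a rescaled kernel plus one bias-like constant. A generic sketch of the arithmetic (not this repository's exporter):

# Generic conv + batch-norm folding arithmetic; illustrates why only a single
# offset constant remains in a frozen graph.
import numpy as np

def fold_batch_norm(kernel, gamma, beta, moving_mean, moving_var, eps=1e-3):
    """kernel: [h, w, in_ch, out_ch]; gamma/beta/mean/var: per-output-channel vectors."""
    scale = gamma / np.sqrt(moving_var + eps)
    folded_kernel = kernel * scale          # broadcasts over the out_ch axis
    bn_offset = beta - moving_mean * scale  # the single remaining constant
    return folded_kernel, bn_offset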

Training logs and multi-gpu

Hi! I am looking at your code and wanted to train ICNet as you describe it in your nice documentation. I have two questions regarding the PSPNet fine-tuning part where you use train_mem_saving.py:

  • I was expecting some feedback along the way while it is training, but besides two logs that a checkpoint was saved, nothing is printed. Is that normal?
  • I also tried to use it with two GPUs. I changed the num_clones flag, but I get a KeyError that comes from create_clones() in trainer.py. Is there anything else I should do to train on several GPUs?

Thanks a lot :)

model compression

Nice work!
The Pruning Filters for Efficient ConvNets method that you implemented seems very difficult; it includes a lot of functions, which is a bit discouraging.

name = graph_utils.node_name_from_input(

I import the graph from the checkpoint's .meta file instead of using a .pb file, so how should I change this line? I understand you are trying to get the input of the output node, but when I use a checkpoint file the tensor does not seem to have an input attribute.

some questions about training

Thanks for your project. I have some questions.

  1. Where is the 953 file? This file is mentioned in icnet.md. I always get errors in the first stage of training without it.
  2. I haven't read the code in detail. If I just want to use 5 classes of Cityscapes for training, do I only need to change the classes variable in the config to 5?
  3. I tried training from scratch using hellochick/ICNet-tensorflow (https://github.com/hellochick/ICNet-tensorflow) and the result was bad. I found your answers there. So if I train 5-class segmentation from 0818_pspnet_1.0_713_resnet_v1, will I get a good result?

Thank you, and I look forward to your reply.

How to increase the frame rate

When I run the project, the fps is 0.14. I don't know how to solve this problem. Thanks!
model: 0818_pspnet_1.0_713_resnet_v1
tensorflow-gpu:1.14.0

Consult for help

Hello,

In the Dataset formatting section, there is this line:

must have $CITYSCAPES_ROOT defined

How do I define this root? Every time I try running it I get the error ERROR: Did not find any files. Please consult the README.
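For reference, the dataset scripts expect that environment variable to point at the root of the Cityscapes download before they are run, e.g. (the path below is a placeholder):

export CITYSCAPES_ROOT=/path/to/cityscapes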

Could not find pipeline.config in Pre-trained ICNet and PSPNet Models archives

Hi,
I am trying to use the pre-trained models provided at the link below for validation:
https://github.com/oandrienko/fast-semantic-segmentation/blob/master/docs/model_zoo.md#validating-pre-trained-models

The validation scripts eval.py and inference.py expect the parameter --config_path=0818_icnet_0.5_1025_resnet_v1/pipeline.config,

but the downloaded 0818_icnet_0.5_1025_resnet_v1 tar.gz archive does not contain "pipeline.config".

Please could you check this?

Thanks & Regards
Lakshminarayana reddy

License info

Can you provide more info on the license (for example, an MIT/Apache-style license)?

MORE INFO ON LICENSE

Can you provide more info on the license, for example an MIT license, stating openness for reuse, even commercially, such as using this trained model for inference in software without restriction or limitation on the rights to use, copy, modify, merge, publish, distribute, etc. The current citation-based license is not clear.

AssertionError: `output_dir` missing.

christophe@ubuntubox:~/Desktop/fast-semantic-segmentation$ python create_cityscapes_tfrecord.py -....

Traceback (most recent call last):
  File "create_cityscapes_tfrecord.py", line 140, in <module>
    tf.app.run()
  File "/home/christophe/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "create_cityscapes_tfrecord.py", line 108, in main
    assert FLAGS.output_dir, '`output_dir` missing.'
AssertionError: `output_dir` missing.

Where am I going wrong with this? AssertionError: `output_dir` missing.
Many thanks. 

No module named 'graph_utils'

oandrienko:

Thanks for your nice work on this.

I am following the instructions in "Training ICNet with Tensorflow". In Stage 2 - Compression and Retraining, I got an error as shown below while trying to run the script python3 compress.py...

File "/home/work/×××/fast-semantic-segmentation/libs/filter_pruner.py", line 24, in <module>
    from graph_utils import GraphTraversalState
ModuleNotFoundError: No module named 'graph_utils'

I tried to comment out line 24 and re-run the script, and it gave me the following errors...

Traceback (most recent call last):
  File "compress.py", line 103, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "compress.py", line 96, in main
    compressor.compress(FLAGS.input_checkpoint)
  File "/home/work/***/fast-semantic-segmentation/libs/filter_pruner.py", line 378, in compress
    self.neighbors = self._create_adjacency_list(self.output_node)
  File "/home/work/***/fast-semantic-segmentation/libs/filter_pruner.py", line 110, in _create_adjacency_list
    output_node = self.nodes_map[output_node_name]
KeyError: 'Predictions/postrain/Conv2D'

I am using Python 3.6.7 and TensorFlow 1.5.0 with CUDA 9.1.

Any suggestions on how to fix these issues? Thanks in advance.

For c++ pb

Is the input node name "inputs"?
Is the output node name "Predictions/Conv/Conv2D"?

And what size and type are the input and output tensors?
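One way to confirm the exact node names before wiring up a C++ client is to load the frozen graph and list its nodes. A small TF 1.x sketch (the .pb filename below is the one shipped in the pre-trained archives; everything else is generic):

# Sketch (TF 1.x): list candidate input/output nodes of a frozen graph.
import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

for node in graph_def.node:
    if node.op == 'Placeholder':              # candidate input nodes
        print('input:', node.name)
print('last node:', graph_def.node[-1].name)  # often, but not always, the output node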

Different Input dimension

How do I change the input dimension? I'm trying to run it on a different dataset with input size 720x960.

Help: Convert ICNET_0.5 to onnx file

Hello,

I want to convert this TF model (ICNET_0.5) to ONNX, and I followed this example: ConvertingSSDMobilenetToONNX.

I understood that if I just want to run inference I should use the frozen graph (frozen_inference_graph.pb), so I renamed it to saved_model.pb (it seems that tf2onnx does not recognize other names) and ran the following, which gives this error:

C:\Users\esarojp\Desktop\newmodel\0818_icnet_0.5_1025_resnet_v1.tar> python -m tf2onnx.convert --opset 10 --fold_const --saved-model .\0818_icnet_0.5_1025_resnet_v1\saved_model\ --output MODEL.onnx

 - WARNING - From C:\Users\esarojp\AppData\Local\Continuum\anaconda3\lib\site-packages\tf2onnx\verbose_logging.py:72: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

Traceback (most recent call last):
  File "C:\Users\esarojp\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\esarojp\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\esarojp\AppData\Local\Continuum\anaconda3\lib\site-packages\tf2onnx\convert.py", line 161, in <module>
    main()
  File "C:\Users\esarojp\AppData\Local\Continuum\anaconda3\lib\site-packages\tf2onnx\convert.py", line 123, in main
    args.saved_model, args.inputs, args.outputs, args.signature_def)
  File "C:\Users\esarojp\AppData\Local\Continuum\anaconda3\lib\site-packages\tf2onnx\loader.py", line 103, in from_saved_model
    meta_graph_def = tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], model_path)
  File "C:\Users\esarojp\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "C:\Users\esarojp\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\saved_model\loader_impl.py", line 269, in load
    return loader.load(sess, tags, import_scope, **saver_kwargs)
  File "C:\Users\esarojp\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\saved_model\loader_impl.py", line 422, in load
    **saver_kwargs)
  File "C:\Users\esarojp\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\saved_model\loader_impl.py", line 349, in load_graph
    meta_graph_def = self.get_meta_graph_def_from_tags(tags)
  File "C:\Users\esarojp\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\saved_model\loader_impl.py", line 327, in get_meta_graph_def_from_tags
    "\navailable_tags: " + str(available_tags))
RuntimeError: MetaGraphDef associated with tags 'serve' could not be found in SavedModel. To inspect available tag-sets in the SavedModel, please use the SavedModel CLI: `saved_model_cli`
available_tags: [set()]

and when I run:

C:\Users\esarojp\Desktop\newmodel\0818_icnet_0.5_1025_resnet_v1.tar> saved_model_cli show --dir .\0818_icnet_0.5_1025_resnet_v1\saved_model\ --tag_set serve  --signature_def serving_default
Traceback (most recent call last):
  File "C:\Users\esarojp\AppData\Local\Continuum\anaconda3\Scripts\saved_model_cli-script.py", line 10, in <module>
    sys.exit(main())
  File "C:\Users\esarojp\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\tools\saved_model_cli.py", line 909, in main
    args.func(args)
  File "C:\Users\esarojp\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\tools\saved_model_cli.py", line 621, in show
    _show_inputs_outputs(args.dir, args.tag_set, args.signature_def)
  File "C:\Users\esarojp\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\tools\saved_model_cli.py", line 133, in _show_inputs_outputs
    tag_set)
  File "C:\Users\esarojp\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\tools\saved_model_utils.py", line 120, in get_meta_graph_def
    ' could not be found in SavedModel')
RuntimeError: MetaGraphDef associated with tag-set serve could not be found in SavedModel

Any idea of what is wrong?
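One thing that may be worth trying (an assumption, not a verified fix): since saved_model_cli reports no 'serve' tag-set in the archive, tf2onnx can be pointed at the frozen graph directly with --graphdef, in which case the input and output tensor names must be given explicitly. The names below are the ones discussed in the other issues here and should be checked against the graph first:

python -m tf2onnx.convert --opset 10 --fold_const --graphdef frozen_inference_graph.pb --inputs inputs:0 --outputs Predictions/Conv/Conv2D:0 --output MODEL.onnx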

Training time

Hello,

I wonder what the training time is, both for the PSPNet baseline and for ICNet. I am running PSPNet training at the moment, and it looks like it will take about 3 days. Does that sound right?

Lower input size model

Is there a lower input size model available?
Will it help with faster inference if the image size is reduced?
How do I go about training on lower-resolution images, and what changes are required in the model?

Cannot import name 'input_reader_pb2' from 'protos'

I am running model evaluation with this command

!python eval.py --config_path ERFNet.config \
    --train_dir /content/ERFNet \
    --eval_dir /content/ERFNet_eval \
    --verbose True  # will log mIoU accuracy

and I receive this import error:
"Cannot import name 'input_reader_pb2' from 'protos'"

Does anyone know how to fix this?

Thanks in advance

Error during evaluation - used PASCAL VOC dataset

Thanks for your work.
I tried this source code on the PASCAL VOC dataset (converted to .tfrecord format using this code: https://github.com/tensorflow/models/tree/master/research/deeplab).

The training was successful and the loss at the end of 40000 steps was around 1.32, but when I tried to evaluate, it throws errors; kindly see the screenshot.

I searched the internet and it looks like there may be an issue with the version of tensorflow-gpu I am using. Help from you would be appreciated, or if you can tell me which version of TensorFlow to use, that is also fine. I tried TensorFlow 1.8 as well, but that also did not work.

ENVIRONMENT INFO
Ubuntu 18
Cuda 10.0
tensorflow-gpu=1.15.5


I cannot compress the model

Thanks for your great work.

I tried training on my own dataset, following
https://github.com/oandrienko/fast-semantic-segmentation/blob/master/docs/icnet.md

Stage 1 works fine, but I cannot compress the model at stage 2.

python compress.py --prune_config configs/compression/icnet_resnet_v1_pruner_v2.prune_config --input_checkpoint stage2/model.ckpt --output_dir stage2_compress --compression_factor 0.5

And I got this error.

Traceback (most recent call last):
  File "compress.py", line 103, in <module>
    tf.app.run()
  File "/home/wataru/git_work/fast-semantic-segmentation/venv/lib/python3.7/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/wataru/git_work/fast-semantic-segmentation/venv/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/wataru/git_work/fast-semantic-segmentation/venv/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "compress.py", line 96, in main
    compressor.compress(FLAGS.input_checkpoint)
  File "/home/wataru/git_work/fast-semantic-segmentation/libs/filter_pruner.py", line 385, in compress
    self._create_pruner_specs_recursively(self.input_node)
  File "/home/wataru/git_work/fast-semantic-segmentation/libs/filter_pruner.py", line 374, in _create_pruner_specs_recursively
    self._create_pruner_specs_recursively(next_node)
  File "/home/wataru/git_work/fast-semantic-segmentation/libs/filter_pruner.py", line 351, in _create_pruner_specs_recursively
    curr_node_name)
  File "/home/wataru/git_work/fast-semantic-segmentation/libs/filter_pruner.py", line 323, in _get_following_bn_and_conv_names
    raise ValueError('Incompatable model file.')
ValueError: Incompatable model file.

I tried to find out which node is bad by inserting print(next_node.op) into filter_pruner.py, and the output is 'FusedBatchNormV3'.

Do you have any idea how to work around this?

Error while loading checkpoint when training

I have to train ICNet on the CamVid dataset.
I initialize the network with the classification weights of ResNet, using single-stage training.

I set the .config file and the train_mem_saving.py input arguments as specified in https://github.com/oandrienko/fast-semantic-segmentation/blob/master/docs/icnet.md.

When initializing the model from the checkpoint "tmp/resnet_v1_50.ckpt", I get this error:

"Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Key AuxOutput/biases not found in checkpoint
[[node save_1/RestoreV2 (defined at /content/libs/trainer.py:286) ]]".

It is strange, since it shouldn't try to load the variable 'AuxOutput/biases', which does not belong to the ResNet checkpoint.
I correctly set 'fine_tune_checkpoint_type' to 'classification' in the .config file.

I am running the project on Google Colab, which only supports TensorFlow versions greater than 1.13.1.
Has anyone successfully run the project with these versions of TF?

Thank you in advance

Key CascadeFeatureFusion/Conv/BatchNorm/beta not found in checkpoint

Thank you for spending all the time for us!

I have applied your suggestions, checked out master, and renamed the nodes, since I had already trained PSPNet.
At stage 1 of the training, when I run the export.py script, I have the following issue:

NotFoundError (see above for traceback): Key CascadeFeatureFusion/Conv/BatchNorm/beta not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT64], device="/job:localhost/replica:0/task:0/device:CPU:0"](arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

How to execute the sample code?

As in the sample link, https://modeldepot.io/oandrienko/icnet-for-fast-segmentation,
I found the issue below when I execute run_inference_for_single_image(): the tensor name cannot be found in the graph. I tried many methods but it doesn't work.
Does anyone know how to make the conversion work?

KeyError: "The name 'inputs:0' refers to a Tensor which does not exist. The operation, 'inputs', does not exist in the graph."

Environment:
Cuda: 9.0
Cudnn: 7.0.5
Tensorflow: 1.8

Stage 2 - Compression and Retraining

Hi, I followed the documentation step by step, from training PSPNet to re-training ICNet. Everything works fine until the last step: when I re-train ICNet after compressing it, it shows the problem below.

INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Assign requires shapes of both tensors to match. lhs shape= [1,1,256,3] rhs shape= [1,1,512,3]
[[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@CascadeFeatureFusion_0/AuxOutput/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](CascadeFeatureFusion_0/AuxOutput/weights, save/RestoreV2:1)]]

It seems like after ICNet is compressed with filter=0.5, some layers in the model no longer match. Or maybe this is an issue with TensorFlow Slim.

Caused by op u'save/Assign_1', defined at:
  File "train_mem_saving.py", line 192, in <module>
    tf.app.run()
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "train_mem_saving.py", line 188, in main
    gradient_checkpoints=checkpoint_nodes)
  File "/home/idata/LDM/test/fast-semantic-segmentation/libs/trainer.py", line 217, in train_segmentation_model
    ignore_missing_vars=True)
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 689, in assign_from_checkpoint_fn
    write_version=saver_pb2.SaverDef.V1)
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1338, in __init__
    self.build()
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1347, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1384, in _build
    build_save=build_save, build_restore=build_restore)
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 835, in _build_internal
    restore_sequentially, reshape)
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 494, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 185, in restore
    self.op.get_shape().is_fully_defined())
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 283, in assign
    validate_shape=validate_shape)
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 60, in assign
    use_locking=use_locking, name=name)
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/idata/anaconda3/envs/fastSS_1/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

FAILED TO GET CONVOLUTION ALGORITHM. RTX 2060, CUDA 10, cuDNN 7.4.2.24, tensorflow gpu (tb-nightly-gpu)

my env:
windows 10
RTX 2060
py - 3.6.6
tensorflow - tb-nightly-gpu
cuDNN- 7.4.2.24 FOR cuda 10
CUDA 10.0

dataset - cifar10

ERROR:

ANY suggestions?

runfile('D:/PY_TF/MNIST_AE_CNN/OPEN_CV_ICUCNN-20190119T181516Z-001/OPEN_CV_ICUCNN/OPEN_CV_CNN001.py', wdir='D:/PY_TF/MNIST_AE_CNN/OPEN_CV_ICUCNN-20190119T181516Z-001/OPEN_CV_ICUCNN')
Using TensorFlow backend.
WARNING: Logging before flag parsing goes to stderr.
W0128 19:48:13.545505  5732 deprecation.py:506] From C:\Anaconda3\envs\test1_import\lib\site-packages\keras\backend\tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Epoch 1/1
Traceback (most recent call last):

  File "<ipython-input-1-556c55792f64>", line 1, in <module>
    runfile('D:/PY_TF/MNIST_AE_CNN/OPEN_CV_ICUCNN-20190119T181516Z-001/OPEN_CV_ICUCNN/OPEN_CV_CNN001.py', wdir='D:/PY_TF/MNIST_AE_CNN/OPEN_CV_ICUCNN-20190119T181516Z-001/OPEN_CV_ICUCNN')

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 704, in runfile
    execfile(filename, namespace)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "D:/PY_TF/MNIST_AE_CNN/OPEN_CV_ICUCNN-20190119T181516Z-001/OPEN_CV_ICUCNN/OPEN_CV_CNN001.py", line 64, in <module>
    history = model1.fit(train_imgs, train_ans_one_hot, batch_size = batch_size, epochs= epochs, verbose =1)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\keras\engine\training.py", line 1039, in fit
    validation_steps=validation_steps)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\keras\engine\training_arrays.py", line 199, in fit_loop
    outs = f(ins_batch)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\tensorflow\python\client\session.py", line 1440, in __call__
    run_metadata_ptr)

  File "C:\Anaconda3\envs\test1_import\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 544, in __exit__
    c_api.TF_GetCode(self.status.status))

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[{{node conv2d_1/convolution}}]]
	 [[metrics/acc/Mean/_113]]

ICNet stage 1 checkpoint

Great work!
A small question: could you also provide the stage 1 ICNet checkpoint in the Model Zoo?
Did you manage to reproduce the performance of ICNet reported in their paper?
Thanks!

TensorFlow 1.10 compatible issue

Hi, I think the code has some compatibility issues with TensorFlow 1.10, 1.11, and 1.12:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[{{node Conv/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Conv/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, Conv/weights/read)]]
	 [[{{node predictions_1/_635}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1450_predictions_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "inference.py", line 142, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "inference.py", line 139, in main
    label_map, output_directory)
  File "inference.py", line 95, in run_inference_graph
    feed_dict={placeholder_tensor: image_raw})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node Conv/Conv2D (defined at /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py:1057)  = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Conv/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, Conv/weights/read)]]
	 [[{{node predictions_1/_635}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1450_predictions_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'Conv/Conv2D', defined at:
  File "inference.py", line 142, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "inference.py", line 139, in main
    label_map, output_directory)
  File "inference.py", line 82, in run_inference_graph
    label_color_map=label_color_map)
  File "/media/jintian/netac/ai/home/fast-semantic-segmentation/libs/exporter.py", line 64, in deploy_segmentation_inference_graph
    outputs = _get_outputs_from_inputs(model, input_tensor)
  File "/media/jintian/netac/ai/home/fast-semantic-segmentation/libs/exporter.py", line 38, in _get_outputs_from_inputs
    outputs_dict = model.predict(preprocessed_inputs)
  File "/media/jintian/netac/ai/home/fast-semantic-segmentation/architectures/icnet_architecture.py", line 117, in predict
    full_res = self._third_feature_branch(preprocessed_inputs)
  File "/media/jintian/netac/ai/home/fast-semantic-segmentation/architectures/icnet_architecture.py", line 203, in _third_feature_branch
    stride=2, normalizer_fn=slim.batch_norm)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1154, in convolution2d
    conv_dims=2)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1057, in convolution
    outputs = layer.apply(inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 817, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/layers/base.py", line 374, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 757, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/convolutional.py", line 194, in call
    outputs = self._convolution_op(inputs, self.kernel)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 868, in __call__
    return self.conv_op(inp, filter)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 520, in __call__
    return self.call(inp, filter)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 204, in __call__
    name=self.name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 957, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node Conv/Conv2D (defined at /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py:1057)  = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 2, 2], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Conv/Conv2D-0-TransposeNHWCToNCHW-LayoutOptimizer, Conv/weights/read)]]
	 [[{{node predictions_1/_635}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1450_predictions_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

It would be better if the code could be upgraded to TensorFlow 1.11.
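A mitigation that often clears this particular cuDNN initialization failure on newer GPUs (general TF 1.x advice, not verified against this repository) is to enable GPU memory growth on the session that runs the graph:

# Common TF 1.x workaround for "Failed to get convolution algorithm":
# let GPU memory allocation grow instead of reserving it all up front.
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
    pass  # build/run the inference graph inside this session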

File not found

ImportError: cannot import name 'dilated_resnet_v1'

There is no such file inside the third_party folder.

ImportError: cannot import name 'hyperparams_pb2'

When I finish compiling the protos, I get these errors:
Traceback (most recent call last):
  File "train_mem_saving.py", line 39, in <module>
    from builders import model_builder
  File "/home/sherry/cuimiao/Fabric_defect_detection/fast-semantic-segmentation/builders/model_builder.py", line 8, in <module>
    from builders import hyperparams_builder
  File "/home/sherry/cuimiao/Fabric_defect_detection/fast-semantic-segmentation/builders/hyperparams_builder.py", line 9, in <module>
    from protos import hyperparams_pb2
ImportError: cannot import name 'hyperparams_pb2'

Quantization aware training of ICNet

Hi,
I am trying to train ICNet with and without quantization. I followed your two-stage training process and got 69.3% and 64.7% mIoU for stage 1 and stage 2, respectively.

I would like to re-train stage 2 with quantization in mind. I see that you already have the create training graph inserted in the create_training_model_losses() method.

There are 2 concerns here:

  1. num_clones = 1
    I see that training runs and checkpoints are created. The script ends successfully, but when I inspect the checkpoints there are no quantization ops. I was expecting names like weights_quant and activation_quant, as in the example.

  2. num_clones = 2
    I see this error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot update edge, incompatible shapes: [32,9,9,256] and [2,33,33,256].

Have you seen this error? Have you tried using create training graph and has it worked without any issues?

Keen to hear back from you!
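For context, tf.contrib.quantize's rewrite is applied to a graph after the forward pass and losses are built and before the optimizer is added. A minimal single-clone sketch of that ordering (a stand-in model, not this repository's trainer):

# Sketch (TF 1.x): quantization-aware training rewrite on a stand-in model.
import tensorflow as tf

g = tf.Graph()
with g.as_default():
    images = tf.placeholder(tf.float32, [None, 256, 256, 3])
    labels = tf.placeholder(tf.float32, [None, 256, 256, 1])
    logits = tf.layers.conv2d(images, 1, 3, padding='same')  # stand-in for the real model
    loss = tf.losses.mean_squared_error(labels, logits)

    # Inserts fake-quant ops into the graph in place; must run after the
    # losses are built and before the optimizer ops are created.
    tf.contrib.quantize.create_training_graph(input_graph=g, quant_delay=0)

    train_op = tf.train.MomentumOptimizer(0.001, 0.9).minimize(loss)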

Using multi-modal data for training

Hi Oandrienko,

I have tried using the repository to train segmentation models on various datasets such as Cityscapes, CamVid, and custom datasets, with various numbers of classes. The model works fine.

However, if I try to use multiple datasets (annotated for the same classes) together, the accuracy drops considerably.
Is there any specific requirement that the data be captured from the same camera?

Prajakta
