
sercant / mobile-segmentation


Real-time semantic image segmentation on mobile devices

License: Apache License 2.0

Languages: Python 98.19%, Shell 1.81%

Topics: mobilenetv2, semantic-segmentation, image-processing, deeplab-v3-plus, mscoco-dataset, shufflenet-v2, android, real-time, tensorflow-lite, semantic-image-segmentation


mobile-segmentation's Issues

Error with checkpoint on evaluate.py and visualize.py

Hey, @sercant!

When running evaluate.py or visualize.py with your weights (shufflenetv2_dpc_cityscapes_71_3), the following error occurs:

For visualize.py:
NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key aspp0/BatchNorm/beta not found in checkpoint
[[node save/RestoreV2 (defined at visualize.py:276) ]]

For evaluate.py:
NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key aspp0/BatchNorm/beta not found in checkpoint
[[node save/RestoreV2 (defined at evaluate.py:162) ]]
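
A quick way to diagnose this kind of NotFoundError is to list the variables the checkpoint actually contains and compare them with the graph being built; a minimal TF 1.x sketch, with a hypothetical checkpoint prefix:

import tensorflow as tf

# List every variable stored in the checkpoint (the prefix is hypothetical;
# point it at the downloaded shufflenetv2_dpc_cityscapes_71_3 weights).
for name, shape in tf.train.list_variables('shufflenetv2_dpc_cityscapes_71_3/model.ckpt'):
    print(name, shape)

# If 'aspp0/BatchNorm/beta' is missing from the output, the graph flags
# (e.g. a DPC head vs. a plain ASPP head) do not match how the weights
# were trained.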

Errors with visualization

First of all, thank you for the good work.

When visualize.py is started, the following error occurs:

File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape mismatch in tuple component 1. Expected [769,769,3], got [1024,2048,3]
[[{{node batch/padding_fifo_queue_enqueue}}]]

I tried to resize, but it did not help. When I change the size to [1024, 2048], the following error appears;
however, the first image is processed successfully.

Traceback (most recent call last):
File "visualize.py", line 335, in
tf.app.run()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "visualize.py", line 320, in main
train_id_to_eval_id=train_id_to_eval_id)
File "visualize.py", line 183, in _process_batch
colormap_type=FLAGS.colormap_type)
File "/code/ShuffleNetV2/utils/save_annotation.py", line 33, in save_annotation
label, colormap_type)
File "/code/ShuffleNetV2/utils/get_dataset_colormap.py", line 389, in label_to_color_image
raise ValueError('label value too large.')
ValueError: label value too large.
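
For what it's worth, the first error suggests the input pipeline was built for 769x769 crops while Cityscapes frames are the full 1024x2048, and the second usually means the predicted label ids fall outside the colormap after the train-id to eval-id remapping. A small hedged check, with an illustrative stand-in for the prediction array:

import numpy as np

# Illustrative stand-in for the per-pixel prediction that visualize.py
# passes to save_annotation.
semantic_prediction = np.zeros((1024, 2048), dtype=np.int64)

# Ids at or above the colormap length (DeepLab-style colormaps typically
# have 256 entries) trigger 'label value too large.' in label_to_color_image.
print('max predicted label id:', int(np.max(semantic_prediction)))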

All xxx.tfrecord files are empty!

Hello, author!
Thank you very much for your open source code.
I currently want to test my own images with the pre-trained model you provided.
To do this, I first set up TensorFlow 1.15 and converted the Cityscapes dataset (png to tfrecord) following your suggestion.
However, during the conversion I ran your build_cityscapes_data.py script, and the resulting tfrecord files were all empty, as follows:
[screenshot: the generated .tfrecord files are all empty]
I'd like you to help me with this problem. Thank you!
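
A quick way to confirm that a tfrecord shard really is empty is to count its records; a minimal sketch for TF 1.15 (the shard name is hypothetical):

import tensorflow as tf  # TF 1.15

path = 'train-00000-of-00010.tfrecord'  # hypothetical shard name
count = sum(1 for _ in tf.io.tf_record_iterator(path))
print(path, 'contains', count, 'records')

# Zero records usually means the converter's input globs matched no files,
# so double-check the Cityscapes directory layout and the paths passed to
# build_cityscapes_data.py.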

Three questions about the DPC module

Hi author, your proposed network is very interesting and effective!
First question: the DPC encoder head has 5 branches, and each branch outputs 256 channels,
so concatenation should give 1280 channels; but in Table 1
the DPC module's output channels are 512. Why?
Second: the rates in DPC are interesting. Why are they 1x6, 18x15, 6x3, and 6x21? I see that DeepLab v3+'s rates are 6, 12, and 18.
Last: the design of each branch in DPC troubles me; I can't understand the reasoning behind it.

I'm sorry for asking so many questions, but I have one more. Why do you preprocess the input images by normalizing each pixel to [-1, 1], rather than [0, 1] or no normalization at all?
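
For context, the [-1, 1] mapping being asked about is the standard MobileNet/DeepLab-style preprocessing; assuming this repo follows that convention (unverified), it amounts to:

import tensorflow as tf

image = tf.zeros([513, 513, 3], dtype=tf.uint8)  # placeholder input image
image = tf.cast(image, tf.float32)
image = 2.0 * (image / 255.0) - 1.0              # pixels now in [-1, 1]

Zero-centered inputs are a common choice because they match the symmetric input range the pretrained backbone expects.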

Training my own data

Hi Author!

I'm trying to train using my own data.
I created the TFRecord data and tried to execute train.py,
but I don't know what to set for "tf_initial_checkpoint".
Could you please help me out?
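
In DeepLab-style training scripts, "tf_initial_checkpoint" normally points at the prefix of a pretrained checkpoint to fine-tune from, e.g. --tf_initial_checkpoint=checkpoints/model.ckpt (that prefix is hypothetical; use the filenames that ship with the released weights, minus the .index/.data suffixes).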

The model architecture may differ from the paper

Hi, @sercant !

Thank you for sharing this fantastic project!

I found that in core/shufflenet_v2.py, line 69, "layer_stride = 1" causes the parameter "rate" to be 1 instead of 2 in ShuffleNet v2 stage 4.

Could you check this small issue?

Thanks
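
For context, the usual DeepLab-style output-stride bookkeeping this refers to: once the accumulated stride reaches the target output_stride, later stages run with stride 1 and multiply their atrous rate instead. A minimal sketch of that convention (the per-stage strides are illustrative, not this repo's exact configuration):

# Accumulate stride until output_stride is reached, then dilate instead.
current_stride = 1
rate = 1
output_stride = 16

for layer_stride in [2, 2, 2, 2, 2]:
    if current_stride == output_stride:
        effective_stride = 1       # stop downsampling...
        rate *= layer_stride       # ...and grow the atrous rate (the step
                                   # the issue suggests is skipped in stage 4)
    else:
        effective_stride = layer_stride
        current_stride *= layer_stride

print('final rate:', rate)  # 2 once one stage is converted to atrous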

Run code on the mobile

Hi! I'm really interested in your paper. Do you plan to implement the code in PyTorch? How can I run the code on a mobile platform? Thanks!

The tflite file generated by export_tflite.py does not work correctly

Dear Sir,
Thank you for sharing your code; it helps me a lot.
The training and testing parts of the model work very well,
but when I start to generate the pb and tflite files, problems come up.
Right now I am using TensorFlow 2.0 beta, and I only modified the log directory in order to execute the conversion script. Looking deeper into the code, the input node is "input_0" and the output node is "Cast", which converts the data from int64 to int32. I also noticed that outputs_to_scales_to_logits is generated by model.predict_labels, which includes an "ArgMax" operator that is supported by the latest version of TFLite. The conversion did not report any errors, so I tested the generated tflite file with the script below:

import numpy as np
import tensorflow as tf

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="segmentation.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test model on random input data.
input_shape = input_details[0]['shape']
print(input_shape)
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
print(input_data)
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(np.unique(output_data))

I just used a random picture for the test, expecting to see the different labels (14 classes in total in my case) reported by the np.unique call, but unfortunately only one value [4] or two values [3, 4] show up. Can you give me some ideas on how to solve this problem? Thank you very much!

I also tested the tflite file with a real test picture, and it did not produce the expected results either.
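
One thing worth ruling out: random noise in [0, 1) is far from the training distribution, so collapsing to one or two classes is plausible even for a working model. A hedged follow-up test with a real image, normalized to [-1, 1] as the model likely expects (the file name is hypothetical; skip the normalization if the exporter bakes preprocessing into the graph):

import numpy as np
import tensorflow as tf
from PIL import Image

interpreter = tf.lite.Interpreter(model_path="segmentation.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Resize a real image to the model's input size and map pixels to [-1, 1],
# assuming the training-time preprocessing convention.
_, height, width, _ = input_details[0]['shape']
img = Image.open('test.png').convert('RGB').resize((width, height))
x = np.asarray(img, dtype=np.float32)
x = 2.0 * (x / 255.0) - 1.0
interpreter.set_tensor(input_details[0]['index'], x[np.newaxis, ...])

interpreter.invoke()
print(np.unique(interpreter.get_tensor(output_details[0]['index'])))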

Something about the DPC module

@sercant
In the DPC module, the input is ShuffleNet v2 stage 4 (so the input has 464 channels), and each of the five branches' convs has 256 output channels (as in the paper "Searching for Efficient Multi-Scale Architectures for Dense Image Prediction").
However, for depthwise_conv2d the number of output channels is in_channels * channel_multiplier. So after the 5 branches are joined with tf.concat, wouldn't the result have a very large number of channels?
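
The confusion here is usually between a pure depthwise conv, which keeps (or multiplies) the input channel count, and a depthwise-separable conv, whose 1x1 pointwise projection lets each DPC branch output exactly 256 channels. A small illustrative sketch with Keras layers (the shapes are made up):

import tensorflow as tf

x = tf.zeros([1, 33, 33, 464])  # stage-4-like feature map

# Pure depthwise conv: output channels = in_channels * depth_multiplier.
dw = tf.keras.layers.DepthwiseConv2D(3, padding='same')(x)
print(dw.shape)   # (1, 33, 33, 464)

# Separable conv: depthwise followed by a 1x1 pointwise projection.
sep = tf.keras.layers.SeparableConv2D(256, 3, padding='same')(x)
print(sep.shape)  # (1, 33, 33, 256); concatenating 5 such branches gives 1280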

Some things about MS COCO 2017

Hi author, I see that you used the COCO 2017 dataset for network pre-training.
I have implemented a semantic segmentation network and downloaded the COCO 2017 dataset. I want to know how to generate the label images (grayscale images) from the annotation JSON files.
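
Not specific to this repo, but the usual route is pycocotools: load the annotation JSON, rasterize each annotation with annToMask, and write the category ids into a single-channel image. A minimal sketch (paths are hypothetical):

import numpy as np
from PIL import Image
from pycocotools.coco import COCO

coco = COCO('annotations/instances_train2017.json')  # hypothetical path
img_id = coco.getImgIds()[0]
info = coco.loadImgs(img_id)[0]

# One grayscale label map per image: background stays 0, each annotated
# pixel gets its COCO category id (the ids fit comfortably in uint8).
label = np.zeros((info['height'], info['width']), dtype=np.uint8)
for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id)):
    label[coco.annToMask(ann) == 1] = ann['category_id']

Image.fromarray(label).save(info['file_name'].replace('.jpg', '_labels.png'))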

COCO2017

Hi author.
I really want to get images like those in the gtFine folder of the Cityscapes dataset, such as
'cityscapes/gtFine/train/aachen/aachen_000000_000019_gtFine_labelIds.png', corresponding to images such as 'cityscapes/leftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png'.
Then I can make txt files for the input data.

Pre-trained weights

Folks, awesome job on achieving real-time segmentation with ShuffleNet v2.

Do you mind sharing the pretrained weights?
