huuuuusy / mask-rcnn-shiny

Python 89.74% Jupyter Notebook 10.26%

mask-rcnn-shiny's Introduction

Hi there, I am Shiyu Hu (胡世宇)!

I am currently a Research Fellow at Nanyang Technological University (NTU), working with Prof. Kanghao Cheong. Before that, I received my Ph.D. from the Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所) and the University of Chinese Academy of Sciences (中国科学院大学) in Jan. 2024, supervised by Prof. Kaiqi Huang (黄凯奇) (IAPR Fellow) and co-supervised by Prof. Xin Zhao (赵鑫). I received my master's degree from the Department of Computer Science at the University of Hong Kong (HKU) under the supervision of Prof. Choli Wang (王卓立).

I am also honored to collaborate with a group of outstanding researchers. Together we have established the Visual Intelligence Interest Group (VIIG) to promote research in related directions.

📣 If you are interested in my research or would like to collaborate, please contact me! Both online and offline collaboration are welcome ([email protected]). My CV has more information about my research interests.

My Skill Set

Computer Vision

Research

Web Design

mask-rcnn-shiny's Issues

Video modifications

Hello, thank you for this useful project. I noticed that the video processing rotates vertical videos 90 degrees and removes audio. Is there any way to avoid these alterations?

Running model on CPU

Hello @huuuuusy, thanks for this great implementation.
I want to run this code on my laptop, which does not have a GPU. When I set


GPU_COUNT = 0
IMAGES_PER_GPU = 0

it gives the following error:
str(inputs) + '. All inputs to the layer ' ValueError: Layer roi_align_classifier was called with an input that isn't a symbolic tensor. Received type: <class 'list'>. Full input: [[], <tf.Tensor 'input_image_meta:0' shape=(?, 93) dtype=float32>, <tf.Tensor 'fpn_p2/BiasAdd:0' shape=(?, ?, ?, 256) dtype=float32>, <tf.Tensor 'fpn_p3/BiasAdd:0' shape=(?, ?, ?, 256) dtype=float32>, <tf.Tensor 'fpn_p4/BiasAdd:0' shape=(?, ?, ?, 256) dtype=float32>, <tf.Tensor 'fpn_p5/BiasAdd:0' shape=(?, ?, ?, 256) dtype=float32>]. All inputs to the layer should be tensors.

Please tell me how I can run this on the CPU. Also, is there any way to speed up this code for inference?
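A possible explanation, offered as an assumption rather than a confirmed diagnosis: in Matterport-style Mask R-CNN configs the batch size is computed as GPU_COUNT * IMAGES_PER_GPU, so setting either value to 0 gives a batch size of 0. That would be consistent with the empty list `[]` that appears as the first input in the error above. The sketch below keeps both values at 1 and hides CUDA devices through the CUDA_VISIBLE_DEVICES environment variable, so that TensorFlow falls back to the CPU. CpuInferenceSettings and effective_batch_size are hypothetical names used only for illustration; in the notebook the two values would go on the existing config class.

```python
import os

# Hide all CUDA devices so TensorFlow falls back to the CPU.
# This has to run before TensorFlow is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


class CpuInferenceSettings:
    """Hypothetical holder for the two values discussed in this issue."""

    # Keep both at 1: the batch size is their product and must not be 0.
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1


def effective_batch_size(settings):
    """Batch size as a Matterport-style config would derive it (assumption)."""
    return settings.GPU_COUNT * settings.IMAGES_PER_GPU


print(effective_batch_size(CpuInferenceSettings))
```

With GPU_COUNT = 0 and IMAGES_PER_GPU = 0, the same product is 0, which is the situation described in this issue.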

Do you just replace non-person pixels with their grayscale values?

AttributeError: module 'tensorflow' has no attribute 'log'

Hi, I was running Demo-Image.ipynb and hit this error when running cell [7] (COCO dataset object names):

AttributeError                            Traceback (most recent call last)
<ipython-input-7-cbc8b676b8a3> in <module>
      1 # COCO dataset object names
----> 2 model = modellib.MaskRCNN(
      3     mode="inference", model_dir=MODEL_DIR, config=config
      4 )
      5 model.load_weights(COCO_MODEL_PATH, by_name=True)

~\Desktop\AR_test\models\pretrained-models\R-CNN-mask\Mask-RCNN-Shiny-master\mrcnn\model.py in __init__(self, mode, config, model_dir)
   1821         self.model_dir = model_dir
   1822         self.set_log_dir()
-> 1823         self.keras_model = self.build(mode=mode, config=config)
   1824 
   1825     def build(self, mode, config):

~\Desktop\AR_test\models\pretrained-models\R-CNN-mask\Mask-RCNN-Shiny-master\mrcnn\model.py in build(self, mode, config)
   2014             # Proposal classifier and BBox regressor heads
   2015             mrcnn_class_logits, mrcnn_class, mrcnn_bbox =\
-> 2016                 fpn_classifier_graph(rpn_rois, mrcnn_feature_maps, input_image_meta,
   2017                                      config.POOL_SIZE, config.NUM_CLASSES,
   2018                                      train_bn=config.TRAIN_BN)

~\Desktop\AR_test\models\pretrained-models\R-CNN-mask\Mask-RCNN-Shiny-master\mrcnn\model.py in fpn_classifier_graph(rois, feature_maps, image_meta, pool_size, num_classes, train_bn)
    916     # ROI Pooling
    917     # Shape: [batch, num_boxes, pool_height, pool_width, channels]
--> 918     x = PyramidROIAlign([pool_size, pool_size],
    919                         name="roi_align_classifier")([rois, image_meta] + feature_maps)
    920     # Two 1024 FC layers (implemented with Conv2D for consistency)

c:\users\tukuru0005\pythonstuff\envs\imageseg\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in __call__(self, *args, **kwargs)
    920                     not base_layer_utils.is_in_eager_or_tf_function()):
    921                   with auto_control_deps.AutomaticControlDependencies() as acd:
--> 922                     outputs = call_fn(cast_inputs, *args, **kwargs)
    923                     # Wrap Tensors in `outputs` in `tf.identity` to avoid
    924                     # circular dependencies.

c:\users\tukuru0005\pythonstuff\envs\imageseg\lib\site-packages\tensorflow\python\autograph\impl\api.py in wrapper(*args, **kwargs)
    263       except Exception as e:  # pylint:disable=broad-except
    264         if hasattr(e, 'ag_error_metadata'):
--> 265           raise e.ag_error_metadata.to_exception(e)
    266         else:
    267           raise

AttributeError: in user code:

    C:\Users\tukuru0005\Desktop\AR_test\models\pretrained-models\R-CNN-mask\Mask-RCNN-Shiny-master\mrcnn\model.py:387 call  *
        roi_level = log2_graph(tf.sqrt(h * w) / (224.0 / tf.sqrt(image_area)))
    C:\Users\tukuru0005\Desktop\AR_test\models\pretrained-models\R-CNN-mask\Mask-RCNN-Shiny-master\mrcnn\model.py:338 log2_graph  *
        return tf.log(x) / tf.log(2.0)

    AttributeError: module 'tensorflow' has no attribute 'log'

I tried both my PC (with different TensorFlow versions) and Google Colab, but Colab throws a similar error:

AttributeError                            Traceback (most recent call last)
<ipython-input-21-cbc8b676b8a3> in <module>()
      1 # COCO dataset object names
      2 model = modellib.MaskRCNN(
----> 3     mode="inference", model_dir=MODEL_DIR, config=config
      4 )
      5 model.load_weights(COCO_MODEL_PATH, by_name=True)

6 frames
/content/Mask-RCNN-Shiny/mrcnn/model.py in __init__(self, mode, config, model_dir)
   1821         self.model_dir = model_dir
   1822         self.set_log_dir()
-> 1823         self.keras_model = self.build(mode=mode, config=config)
   1824 
   1825     def build(self, mode, config):

/content/Mask-RCNN-Shiny/mrcnn/model.py in build(self, mode, config)
   2016                 fpn_classifier_graph(rpn_rois, mrcnn_feature_maps, input_image_meta,
   2017                                      config.POOL_SIZE, config.NUM_CLASSES,
-> 2018                                      train_bn=config.TRAIN_BN)
   2019 
   2020             # Detections

/content/Mask-RCNN-Shiny/mrcnn/model.py in fpn_classifier_graph(rois, feature_maps, image_meta, pool_size, num_classes, train_bn)
    917     # Shape: [batch, num_boxes, pool_height, pool_width, channels]
    918     x = PyramidROIAlign([pool_size, pool_size],
--> 919                         name="roi_align_classifier")([rois, image_meta] + feature_maps)
    920     # Two 1024 FC layers (implemented with Conv2D for consistency)
    921     x = KL.TimeDistributed(KL.Conv2D(1024, (pool_size, pool_size), padding="valid"),

/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in symbolic_fn_wrapper(*args, **kwargs)
     73         if _SYMBOLIC_SCOPE.value:
     74             with get_graph().as_default():
---> 75                 return func(*args, **kwargs)
     76         else:
     77             return func(*args, **kwargs)

/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py in __call__(self, inputs, **kwargs)
    487             # Actually call the layer,
    488             # collecting output(s), mask(s), and shape(s).
--> 489             output = self.call(inputs, **kwargs)
    490             output_mask = self.compute_mask(inputs, previous_mask)
    491 

/content/Mask-RCNN-Shiny/mrcnn/model.py in call(self, inputs)
    385         # e.g. a 224x224 ROI (in pixels) maps to P4
    386         image_area = tf.cast(image_shape[0] * image_shape[1], tf.float32)
--> 387         roi_level = log2_graph(tf.sqrt(h * w) / (224.0 / tf.sqrt(image_area)))
    388         roi_level = tf.minimum(5, tf.maximum(
    389             2, 4 + tf.cast(tf.round(roi_level), tf.int32)))

/content/Mask-RCNN-Shiny/mrcnn/model.py in log2_graph(x)
    336 def log2_graph(x):
    337     """Implementatin of Log2. TF doesn't have a native implemenation."""
--> 338     return tf.log(x) / tf.log(2.0)
    339 
    340 

AttributeError: module 'tensorflow' has no attribute 'log'

According to this reply, we should replace tf.log() with tf.math.log(). I haven't tried it yet, but it may be good to update the repository as well.
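As a minimal sketch of that change (not the repository's actual patch), log2_graph can look up the natural-log op under tf.math first and fall back to the old top-level name. resolve_log and the fake_tf2 stand-in are hypothetical helpers introduced here so the example runs without TensorFlow installed; with a real installation you would pass the imported tensorflow module instead.

```python
import math
from types import SimpleNamespace


def resolve_log(tf_module):
    """Return the natural-log op from a TensorFlow-like module.

    TensorFlow 2.x exposes it as tf.math.log; the top-level tf.log
    used in the traceback above is only tried as a fallback.
    """
    math_ns = getattr(tf_module, "math", None)
    if math_ns is not None and hasattr(math_ns, "log"):
        return math_ns.log
    return tf_module.log


def log2_graph(x, tf_module):
    """Base-2 logarithm computed as ln(x) / ln(2)."""
    log = resolve_log(tf_module)
    return log(x) / log(2.0)


# Stand-in for a TF 2.x module (assumption: only math.log is needed here).
fake_tf2 = SimpleNamespace(math=SimpleNamespace(log=math.log))
print(log2_graph(8.0, fake_tf2))
```

In the repository itself, the simpler edit is to change the body of log2_graph in mrcnn/model.py to use tf.math.log(x) / tf.math.log(2.0).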

Works well with slow-moving single focal point style content.

Cool. Seems to work well with slow-moving single focal point style content.

You might also want to check out background blur projects that use Google DeepLab (TensorFlow):
https://github.com/zubairahmed-ai/Live-Background-Blur

Processing video is slow

Hi @huuuuusy, your project is great. Processing images works perfectly, but processing videos is too slow even though I have a GPU. Could you let me know how to speed it up? Thank you.

Error in the last cell of Demo-Image.ipynb

UnboundLocalError Traceback (most recent call last)
in ()
1 results = model.detect([image], verbose=0)
2 r = results[0]
----> 3 frame = display_instances(image, r['rois'], r['masks'], r['class_ids'], class_names, r['scores'])
4 cv2_imshow(frame)
5

in display_instances(image, boxes, masks, ids, names, scores)
35 # apply mask for the image
36 # by mistake you put apply_mask inside for loop or you can write continue in if also
---> 37 image = apply_mask(image, mask)
38
39 return image

UnboundLocalError: local variable 'mask' referenced before assignment
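The traceback suggests that mask is only assigned when a person is detected, so an image with no person reaches apply_mask with mask undefined. One way to guard against that, sketched here under assumptions rather than taken from the repository, is to pick the largest person box first and return None when there is none. largest_person_index is a hypothetical helper, and the (y1, x1, y2, x2) box layout is assumed from the r['rois'] arrays used in the notebook.

```python
def largest_person_index(boxes, class_ids, class_names):
    """Return the index of the largest 'person' box, or None if there is none.

    boxes: sequence of (y1, x1, y2, x2) tuples (assumed layout).
    class_ids: class id per box, used to index into class_names.
    """
    best_index = None
    max_area = 0
    for i, (y1, x1, y2, x2) in enumerate(boxes):
        if class_names[class_ids[i]] != "person":
            continue
        area = (y2 - y1) * (x2 - x1)
        # Zero-area (padding) boxes never beat max_area and are skipped.
        if area > max_area:
            max_area = area
            best_index = i
    return best_index


names = ["BG", "person", "car"]
print(largest_person_index([(0, 0, 10, 10), (0, 0, 20, 20)], [1, 1], names))
print(largest_person_index([(0, 0, 50, 50)], [2], names))
```

Inside display_instances, the caller would then return the image unchanged when the result is None and call apply_mask(image, masks[:, :, index]) otherwise.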

Does this just take a normal color image as input and segment out the person?

Does this need training from scratch?
