
zhanggang001 / refinemask

RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained Features (CVPR 2021)

License: Apache License 2.0

Python 99.83% Dockerfile 0.03% Shell 0.14%
instance-segmentation refinement boundary object-detection

refinemask's People

Contributors

zhanggang001

refinemask's Issues

Test Representation

I am applying RefineMask to building extraction, but there is a problem with the visualized test results. Many predicted boxes overlap, so I debugged the code and found that an NMS step for the bounding boxes seems to be missing. In addition, some detections come without masks, and I have not yet found the reason.

About the implementation details of using refinemask head in Cascade R-CNN or HTC

Thanks for your insightful work. I am trying to add the RefineMask head to Cascade R-CNN and HTC, and I have two questions:

  1. During inference in Cascade R-CNN or HTC, the mask prediction is the average of the masks from each stage (3 stages in Cascade). If I add the RefineMask head only in the last stage, the mask predictions of the 3 cascade stages will have sizes [28, 28, 112], which cannot be averaged directly. How should I obtain mask results of the same size in all stages: upsample the 28 to 112, downsample the 112 to 28, or just use the mask result of the last cascade stage?

  2. Both HTC and your RefineMask head contain a semantic head. The most important difference is the input feature level: HTC uses the penultimate-level feature as input, while RefineMask uses the first one. Is it possible to unify these two semantic heads?

Happy Spring Festival!

Train refinemask with ResNet-18 backbone

Hello,

I would like to train RefineMask with a ResNet-18 backbone, but I get the error below.
Please help me fix this issue. Thank you in advance!

File "tools/train.py", line 165, in <module>
main()
File "tools/train.py", line 161, in main
meta=meta)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/apis/train.py", line 150, in train_detector
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
**kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 246, in train_step
losses = self(**data)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
return old_func(*args, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 180, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/two_stage.py", line 142, in forward_train
x = self.extract_feat(img)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/two_stage.py", line 84, in extract_feat
x = self.neck(x)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
return old_func(*args, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/necks/fpn.py", line 172, in forward
for i, lateral_conv in enumerate(self.lateral_convs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/necks/fpn.py", line 172, in <listcomp>
for i, lateral_conv in enumerate(self.lateral_convs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/cnn/bricks/conv_module.py", line 185, in forward
x = self.conv(x)
File "/home/ec2-user/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 419, in forward
return self._conv_forward(input, self.weight)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 416, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [256, 256, 1, 1], expected input[1, 64, 200, 272] to have 256 channels, but got 64 channels instead
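
The error message points at a channel mismatch: the first FPN lateral convolution expects 256 input channels but receives 64. That is consistent with a ResNet-18 backbone, whose four stages output 64, 128, 256 and 512 channels, feeding an FPN that is still configured for ResNet-50/101 (in_channels=[256, 512, 1024, 2048]). A minimal sketch of the override follows, assuming the rest of the config stays unchanged. The helper resnet_stage_channels is hypothetical and only shows where the numbers come from.

```python
# Hypothetical helper: output channels of the four ResNet stages.
# BasicBlock networks (18, 34) have expansion 1; Bottleneck networks
# (50, 101, 152) have expansion 4.
def resnet_stage_channels(depth):
    expansion = {18: 1, 34: 1, 50: 4, 101: 4, 152: 4}[depth]
    return [width * expansion for width in (64, 128, 256, 512)]


# Partial override in MMDetection's dict style (a sketch, not the authors'
# config); these keys would be merged into the existing model dict.
model = dict(
    pretrained='torchvision://resnet18',
    backbone=dict(type='ResNet', depth=18),
    neck=dict(type='FPN', in_channels=resnet_stage_channels(18)),
)

print(model['neck']['in_channels'])  # [64, 128, 256, 512]
```

The FPN out_channels value can stay at 256, so the RPN and RoI heads downstream should not need changes for this particular error.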

how to run inference without providing json files

Hello,
Thank you for answering all of my previous questions.
I have successfully run training and testing with JSON files. How can I run inference without providing JSON files, just by loading the model and its weights and getting predictions in any mode? It would also be helpful to learn more about RefineMask.
Thank you in advance.
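
MMDetection 2.x, on which this repository is based, exposes init_detector and inference_detector in mmdet.apis for running a trained model on individual images without an annotation file. A minimal sketch, assuming those two functions behave in this fork as they do upstream. The helper names segment_image and count_confident are hypothetical, and the checkpoint and image paths in the usage comment are placeholders.

```python
def segment_image(config_file, checkpoint_file, image_path, device='cuda:0'):
    """Run a trained detector on one image; no annotation JSON is needed."""
    # Imported lazily so this module can be loaded without mmdet installed.
    from mmdet.apis import init_detector, inference_detector

    model = init_detector(config_file, checkpoint_file, device=device)
    # For mask models upstream, the result is a (bbox_results, segm_results)
    # pair, each a list with one entry per class.
    return inference_detector(model, image_path)


def count_confident(bbox_results, score_thr=0.3):
    """Count boxes whose last column (the score) reaches score_thr."""
    return sum(1 for per_class in bbox_results
               for box in per_class if box[-1] >= score_thr)


# Usage (placeholder paths):
# bbox_results, segm_results = segment_image(
#     'configs/refinemask/coco/r50-refinemask-1x.py',
#     'work_dirs/r50-refinemask-1x/latest.pth',
#     'demo.jpg')
# print(count_confident(bbox_results))
```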

loss decreases slowly

Hi, @zhanggang001 ,

Thanks for sharing your impressive work.

I used your code to train on my own dataset with 4 GPUs. Since your code only supports one image per GPU, I guess that is why the loss decreases so slowly. Could you tell me when you might release code that supports multiple images per GPU?

Many thanks.

gpu

Excuse me,

Did the author run the code on a single GPU? If so, how many days did training take?

mmdetection vs. detectron setting

Hi, thanks for the nice work.
After seeing this issue, SysCV/transfiner#11 (comment), I would like to ask two questions:

  1. Were any of the models in Table 8 (SOTA comparison) trained with detectron2?
  2. MMDetection normally trains with EpochBasedRunner, while detectron2 trains by iteration (and many other hyperparameter settings differ). Do you think comparing the default settings of the two libraries is a fair comparison?

About get flops

I would like to know how to get the FLOPs of the model.
When I use get_flops.py in tools, I get this error:

  File "/mmdet/models/detectors/two_stage.py", line 88, in forward_dummy
    roi_outs = self.roi_head.forward_dummy(x, proposals)
TypeError: forward_dummy() missing 1 required positional argument: 'roi_labels'
This is forward_dummy in two_stage.py:
    def forward_dummy(self, img):
        """Used for computing network flops.

        See `mmdetection/tools/analysis_tools/get_flops.py`
        """
        outs = ()
        # backbone
        x = self.extract_feat(img)
        # rpn
        if self.with_rpn:
            rpn_outs = self.rpn_head(x)
            outs = outs + (rpn_outs, )
        proposals = torch.randn(1000, 4).to(img.device)
        # roi_head
        # roi_labels = torch.zeros(1000)
        roi_outs = self.roi_head.forward_dummy(x, proposals)  # roi_labels
        outs = outs + (roi_outs, )
        return outs

RefineMask adds roi_labels to _mask_forward in refine_roi_head.py during training.
I tried defining roi_labels as all zeros:

        roi_labels = torch.zeros(1000)
        roi_outs = self.roi_head.forward_dummy(x, proposals, roi_labels)

but I get this error:

  File "/mmdet/mmdet/models/roi_heads/mask_heads/refine_mask_head.py", line 561, in forward
    instance_logits = self.stage_instance_logits[idx](instance_feats)[torch.arange(len(rois)), roi_labels][:, None]
IndexError: tensors used as indices must be long, byte or bool tensors

When I change the type to bool, I get:

  File "/mmdet/mmdet/models/roi_heads/mask_heads/refine_mask_head.py", line 561, in forward
    instance_logits = self.stage_instance_logits[idx](instance_feats)[torch.arange(len(rois)), roi_labels][:, None]
IndexError: The shape of the mask [1000] at index 0 does not match the shape of the indexed tensor [100, 1, 14, 14] at index 1

How should this calculation be implemented? I hope you can help me!
I don't understand the expected shape and dtype of roi_labels.

# this is forward_dummy in mmdet/models/roi_heads/standard_roi_head.py
    def forward_dummy(self, x, proposals):
        """Dummy forward function."""
        # bbox head
        outs = ()
        rois = bbox2roi([proposals])
        if self.with_bbox:
            bbox_results = self._bbox_forward(x, rois)
            outs = outs + (bbox_results['cls_score'],
                           bbox_results['bbox_pred'])
        # mask head
        if self.with_mask:
            mask_rois = rois[:100]
            mask_results = self._mask_forward(x, mask_rois)
            outs = outs + (mask_results['mask_pred'], )
        return outs
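
Both error messages are consistent with roi_labels being used as an integer class index, one per mask RoI. The indexing expression [torch.arange(len(rois)), roi_labels] needs a long tensor with the same length as rois, and forward_dummy passes only the first 100 RoIs to the mask head. A minimal sketch of that indexing pattern in isolation, assuming class index 0 is acceptable for a FLOPs estimate. The tensor shapes are illustrative, not taken from the repository.

```python
import torch

num_mask_rois, num_classes, size = 100, 80, 14

# Stand-in for the per-class logits produced by one refinement stage.
stage_logits = torch.randn(num_mask_rois, num_classes, size, size)

# One integer label per mask RoI: dtype long, length equal to len(mask_rois).
roi_labels = torch.zeros(num_mask_rois, dtype=torch.long)

# Same indexing pattern as in the traceback: pick each RoI's own class map.
instance_logits = stage_logits[torch.arange(num_mask_rois), roi_labels][:, None]
print(tuple(instance_logits.shape))  # (100, 1, 14, 14)
```

In forward_dummy, this would correspond to building roi_labels with mask_rois.shape[0] entries on the same device as mask_rois and passing it along to _mask_forward. How _mask_forward expects to receive it is an assumption to check against refine_roi_head.py.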

TypeError: can't pickle _thread.RLock objects

When I use this command to test on LVIS,

./scripts/dist_test.sh ./configs/refinemask/lvis/r50-refinemask-1x.py ../VOS_Model/mmdet/r50-lvis-1x.pth 8 --eval segm

the following error occurred:

Traceback (most recent call last):
  File "/mnt/SSD/RefineMask/tools/test.py", line 152, in <module>
    main()
  File "/mnt/SSD/RefineMask/tools/test.py", line 137, in main
    outputs = multi_gpu_test(model, data_loader, args.tmpdir, args.gpu_collect)
  File "/mnt/SSD/RefineMask/mmdet/apis/test.py", line 92, in multi_gpu_test
    for i, data in enumerate(data_loader):
  File "/home/anaconda3/envs/VOS/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 291, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "/home/anaconda3/envs/VOS/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 737, in __init__
    w.start()
  File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/context.py", line 284, in _Popen
Traceback (most recent call last):
  File "/mnt/SSD/RefineMask/tools/test.py", line 152, in <module>
    return Popen(process_obj)
  File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/reduction.py", line 60, in dump
    main()
    ForkingPickler(file, protocol).dump(obj)
  File "/mnt/SSD/RefineMask/tools/test.py", line 137, in main
TypeError: can't pickle _thread.RLock objects

Have you ever encountered this problem?
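
The traceback passes through popen_spawn_posix.py, so the DataLoader workers are being started with the spawn method, which pickles the dataset and everything it references. A _thread.RLock often arrives through a logger, an open file handle or a similar object stored as an attribute. Below is a small sketch for locating the offending attribute; find_unpicklable and Demo are hypothetical names used only for illustration.

```python
import pickle
import threading


def find_unpicklable(obj):
    """Return names of attributes of obj that cannot be pickled."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:  # pickling errors vary by type and Python version
            bad.append(name)
    return bad


class Demo:
    def __init__(self):
        self.ann_file = 'annotations.json'
        self.lock = threading.RLock()  # stands in for a logger or file handle


print(find_unpicklable(Demo()))  # ['lock']
```

Calling find_unpicklable on the dataset object built in tools/test.py should narrow down the culprit. Setting workers_per_gpu=0 in the config avoids starting worker processes at all, at the cost of slower data loading.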

datasets

Why are your datasets in PNG format and different from mine?
The first entry of your COCO data is 4134,31817, but mine is 9,25.
Could you give me a download link?
Sorry for asking so many questions.
I look forward to your response.

AttributeError: 'list' object has no attribute 'shape' (occurs during evaluation after 12 epochs)

Hello,
I'm able to run RefineMask successfully with mmcv-full 1.3.12 and mmdet 2.6.0 on Windows. Training runs fine, but evaluation after 12 epochs fails with the error below. Could you help me understand what I am doing wrong? Thank you in advance.

File "tools/test.py", line 241, in <module>
main()
File "tools/test.py", line 206, in main
outputs = single_gpu_test(model, data_loader, args.show, args.show_dir,
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\apis\test.py", line 28, in single_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmcv\parallel\data_parallel.py", line 42, in forward
return super().forward(*inputs, **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\torch\nn\parallel\data_parallel.py", line 166, in forward
return self.module(*inputs[0], **kwargs[0])
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmcv\runner\fp16_utils.py", line 98, in new_func
return old_func(*args, **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\models\detectors\base.py", line 174, in forward
return self.forward_test(img, img_metas, **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\models\detectors\base.py", line 147, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\models\detectors\two_stage.py", line 182, in simple_test
return self.roi_head.simple_test(
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\models\roi_heads\standard_roi_head.py", line 264, in simple_test
segm_results = self.simple_test_mask(
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\models\roi_heads\refine_roi_head.py", line 92, in simple_test_mask
if det_bboxes.shape[0] == 0:
AttributeError: 'list' object has no attribute 'shape'

About simplehead

Nice work! I have a question about SimpleRefineMaskHead in refine_mask_head.py.
It is not mentioned in the paper or the README. How does its performance differ from the normal model?
What changes were made in the simple model?
Looking forward to your reply!

Cannot deploy model to other inference engine (e.g. ONNX, OpenVINO)

Hello,
I'm interested in the RefineMask method and would like to use it to cut out detected objects from images.
I could train with this repository, but unfortunately I cannot deploy the trained model to another inference engine. I suspect this is because the official implementation is based on a very old MMDetection version (v2.3.0).

I tried tools/pytorch2onnx.py, but it failed.
I also tried MMDeploy v0.14.0, but that failed too.
Finally, I ported RefineMask to MMDetection v2.28.2 (the latest 2.x version) and tried MMDeploy v0.14.0 again. I believe the port is correct, and it works in MMDetection v2.28.2, but converting the model still fails.

All of these can run inference in MMDetection, but conversion with MMDeploy fails at torch.jit.trace and torch.jit.script.
I used Python 3.7, PyTorch 1.13.1 and CUDA 11.7.

Is there any way to convert a RefineMask pretrained .pth model to .onnx or another format?
Or, if anyone knows of one, could you point me to an implementation for another training/inference platform?

Where is pretrained model for 3x?

Hi, according to your paper, all of the highest AP values come from 3x training, denoted by $\ddagger$.
Could you also share the pretrained weights of those models? We would be grateful to have them for testing.

Thanks :)

Training on PascalVOC

Hi! I am eager to train your model on the Pascal VOC 2012 dataset. Could you please let me know the steps?

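Segmentation results are not shown during testing

After training, when I run testing, the displayed result images are still the original images, without any segmentation results. What could be causing this?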

mmcv-full 1.3.12 is used but incompatible

Hello,

I'm trying to run the RefineMask repository and have a question. Does it only run with torch 1.5.0 and mmcv-full 1.5.0? I have newer versions of torch (1.8.1+cu102) and mmcv-full (1.3.12), and when I run it I get the error that mmcv 1.3.12 is used but incompatible. I tried to install mmcv-full 1.5.0, but that throws an error. Can you advise how I should proceed?
Thank you in advance.

RefineMask for NVIDIA Ampere architecture

Good afternoon:

Congratulations on your work.
I see that mmcv==1.0.5 is mandatory. However, newer NVIDIA graphics cards (from May 2020 onward) need at least CUDA 11.0 for GPU acceleration.

From the resolved issues, it does not seem that RefineMask can be moved off mmcv==1.0.5. That version supports CUDA 10.2 at most, but no CUDA 11.x version. Do you plan to upgrade the code in the near future so that it can run on new-generation graphics cards?

Thanks a lot.

Ignacio.

MMDET ValueError: need at least one array to stack

I got the error below at 2022-02-02 05:32:05,118 - mmdet - INFO - Epoch [1][900/13322] while training on my custom COCO-format dataset, which has polygon masks in its annotations. All images have segmentation data, so there are no empty gt_masks instances. What should I do to fix this error and continue training?

 File "/content/RefineMask/mmdet/models/roi_heads/mask_heads/refine_mask_head.py", line 284, in get_targets
  instance_masks = torch.from_numpy(np.stack(instance_masks)).to(device=pos_bboxes.device, dtype=torch.float32)
  File "<__array_function__ internals>", line 6, in stack
  File "/usr/local/lib/python3.7/dist-packages/numpy/core/shape_base.py", line 423, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack 
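
np.stack raises this error when it is given an empty list. In get_targets that would happen if an iteration produced no positive proposals to build mask targets from, which is possible even when every image has annotations. That reading is an assumption based on the traceback alone. A minimal sketch of a guard follows, with stack_masks as a hypothetical helper and the mask size as a placeholder.

```python
import numpy as np


def stack_masks(instance_masks, mask_size):
    """Stack per-instance masks, returning an empty (0, H, W) array if none."""
    if len(instance_masks) == 0:
        return np.zeros((0, mask_size, mask_size), dtype=np.float32)
    return np.stack(instance_masks).astype(np.float32)


print(stack_masks([], 14).shape)                       # (0, 14, 14)
print(stack_masks([np.ones((14, 14))] * 3, 14).shape)  # (3, 14, 14)
```

Any loss computed from these targets would also need to handle the zero-instance case, for example by returning a zero loss for that iteration.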

No segmentation result after single-class training, only bounding boxes

Hello, I built a new single-class dataset and trained on it based on r101-refinemask-2x. The modified config is shown below.

However, after 93 epochs I ran into trouble. First, the detection results have no segmentation masks. Second, loss_instance is very hard to converge. For example, {"mode": "train", "epoch": 94, "iter": 10, "lr": 3e-05, "time": 0.57583, "data_time": 0.2215, "memory": 5419, "loss_rpn_cls": 0.00174, "loss_rpn_bbox": 0.00334, "loss_cls": 0.01584, "acc": 99.6875, "loss_bbox": 0.03414, "loss_instance": 0.26567, "loss_semantic": 0.00546, "loss": 0.3262, "grad_norm": 2.08605}

I have two questions now:
1. Does RefineMask support instance segmentation of elongated objects, or is there something wrong with my configuration file?
2. Any recommended epochs of training?
Thanks for your time.

model = dict(
type='MaskRCNN',
pretrained='torchvision://resnet101',
backbone=dict(
type='ResNet',
depth=101,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[8],
ratios=[0.5, 1.0, 2.0],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
roi_head=dict(
type='RefineRoIHead',
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=1,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=2.0),
loss_bbox=dict(type='L1Loss', loss_weight=2.0)),
mask_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
mask_head=dict(
type='RefineMaskHead',
num_convs_instance=2,
num_convs_semantic=4,
conv_in_channels_instance=256,
conv_in_channels_semantic=256,
conv_kernel_size_instance=3,
conv_kernel_size_semantic=3,
conv_out_channels_instance=256,
conv_out_channels_semantic=256,
conv_cfg=None,
norm_cfg=None,
dilations=[1, 3, 5],
semantic_out_stride=4,
mask_use_sigmoid=True,
stage_num_classes=[1, 1, 1, 1],
stage_sup_size=[14, 28, 56, 112],
upsample_cfg=dict(type='bilinear', scale_factor=2),
loss_cfg=dict(
type='RefineCrossEntropyLoss',
stage_instance_loss_weight=[0.25, 0.5, 0.75, 1.0],
semantic_loss_weight=1.0,
boundary_width=2,
start_stage=1))))
train_cfg = dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
match_low_quality=True,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=-1,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_across_levels=False,
nms_pre=2000,
nms_post=1000,
max_num=1000,
nms_thr=0.7,
min_bbox_size=0),
rcnn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
match_low_quality=True,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False))
test_cfg = dict(
rpn=dict(
nms_across_levels=False,
nms_pre=1000,
nms_post=1000,
max_num=1000,
nms_thr=0.7,
min_bbox_size=0),
rcnn=dict(
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.5),
max_per_img=100,
mask_thr_binary=0.5))
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile', to_float32=True),
dict(
type='LoadAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='Resize', img_scale=(1024, 1024), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
]
test_pipeline = [
dict(type='LoadImageFromFile', to_float32=True),
dict(
type='MultiScaleFlipAug',
img_scale=(1024, 1024),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]
data_root = '../coco'
data = dict(
samples_per_gpu=1,
workers_per_gpu=2,
train=dict(
type='CocoDataset',
ann_file='annotations/instances_train2017.json',
img_prefix='train2017',
pipeline=[
dict(type='LoadImageFromFile', to_float32=True),
dict(
type='LoadAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='Resize', img_scale=(1024, 1024), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
],
data_root='../coco',
classes=('street', )),
val=dict(
type='CocoDataset',
ann_file='annotations/instances_train2017.json',
img_prefix='train2017',
pipeline=[
dict(type='LoadImageFromFile', to_float32=True),
dict(
type='MultiScaleFlipAug',
img_scale=(1024, 1024),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
data_root='../coco',
classes=('street', )),
test=dict(
type='CocoDataset',
ann_file='annotations/instances_train2017.json',
img_prefix='train2017',
pipeline=[
dict(type='LoadImageFromFile', to_float32=True),
dict(
type='MultiScaleFlipAug',
img_scale=(1024, 1024),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
data_root='../coco',
classes=('street', )))
evaluation = dict(metric=['bbox', 'segm'], classwise=True, interval=12)
optimizer = dict(type='SGD', lr=0.003, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
total_epochs = 1000
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001,
step=[16, 22])
checkpoint_config = dict(interval=1)
log_config = dict(interval=10, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
workflow = [('train', 1)]
gpu_ids = range(0, 1)
work_dir = 'work_dirs/r101-refinemask-2x/street'
load_from = None
resume_from = 'work_dirs/r101-refinemask-2x/street/latest.pth'
classes = ('street', )

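Changing the colors and font of the visualization results

Hello, I would like to give different instances of the same class different colors in the visualization results, and also change the colors of the detection boxes and text. Which file are these set in? Thanks.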

The testing results of the whole dataset is empty.

First of all, thank you very much for sharing your work. I tested on the COCO dataset, but the output says that the testing results of the whole dataset are empty. How can I solve this problem? Thanks.

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 5000/5000, 18.3 task/s, elapsed: 273s, ETA: 0s
writing results to work_dirs/r50-refinemask-1x/result.pkl

Evaluating bbox...
Loading and preparing results...
The testing results of the whole dataset is empty.
load gt json
load pred json
Traceback (most recent call last):
File "/data/user-njf86/ding/RefineMask-main/tools/test.py", line 153, in <module>
main()
File "/data/user-njf86/ding/RefineMask-main/tools/test.py", line 149, in main
dataset.evaluate(outputs, args.eval, classwise=True, **kwargs)
File "/data/user-njf86/ding/RefineMask-main/mmdet/datasets/coco.py", line 556, in evaluate
metric='segm',
File "/data/user-njf86/ding/RefineMask-main/mmdet/datasets/coco.py", line 655, in eval_cocofied_lvis_result
lvis_dt = LVISResults(lvis_gt, result_file)
File "/home/user-njf/anaconda3/envs/mmdet_test/lib/python3.6/site-packages/lvis/results.py", line 44, in __init__
if 'bbox' in result_anns[0]:
IndexError: list index out of range
Traceback (most recent call last):
File "/home/user-njf/anaconda3/envs/mmdet_test/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/user-njf/anaconda3/envs/mmdet_test/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/user-njf/anaconda3/envs/mmdet_test/lib/python3.6/site-packages/torch/distributed/launch.py", line 260, in <module>
main()
File "/home/user-njf/anaconda3/envs/mmdet_test/lib/python3.6/site-packages/torch/distributed/launch.py", line 256, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/user-njf/anaconda3/envs/mmdet_test/bin/python', '-u', '/data/user-njf86/ding/RefineMask-main/tools/test.py', '--local_rank=0', './configs/refinemask/coco/r50-refinemask-1x.py', 'work_dirs/r50-refinemask-1x/latest.pth', '--launcher', 'none', '--eval', 'bbox', 'segm', '--out', 'work_dirs/r50-refinemask-1x/result.pkl', '--tmpdir', 'work_dirs/r50-refinemask-1x/tmpdir']' returned non-zero exit status 1.

RuntimeError: CUDA error: no kernel image is available for execution on the device

Hi! While training on the COCO dataset with the specified config file, I get the following error log:
train_detector(
File "/home2/kevins99/RefineMask/mmdet/apis/train.py", line 145, in train_detector
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 122, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 31, in train
outputs = self.model.train_step(data_batch, self.optimizer,
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 31, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/home2/kevins99/RefineMask/mmdet/models/detectors/base.py", line 237, in train_step
losses = self(**data)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home2/kevins99/RefineMask/mmdet/core/fp16/decorators.py", line 51, in new_func
return old_func(*args, **kwargs)
File "/home2/kevins99/RefineMask/mmdet/models/detectors/base.py", line 171, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/home2/kevins99/RefineMask/mmdet/models/detectors/two_stage.py", line 150, in forward_train
rpn_losses, proposal_list = self.rpn_head.forward_train(
File "/home2/kevins99/RefineMask/mmdet/models/dense_heads/base_dense_head.py", line 58, in forward_train
proposal_list = self.get_bboxes(*outs, img_metas, cfg=proposal_cfg)
File "/home2/kevins99/RefineMask/mmdet/core/fp16/decorators.py", line 131, in new_func
return old_func(*args, **kwargs)
File "/home2/kevins99/RefineMask/mmdet/models/dense_heads/anchor_head.py", line 572, in get_bboxes
proposals = self._get_bboxes_single(cls_score_list, bbox_pred_list,
File "/home2/kevins99/RefineMask/mmdet/models/dense_heads/rpn_head.py", line 167, in get_bboxes_single
dets, keep = batched_nms(proposals, scores, ids, nms_cfg)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/ops/nms.py", line 243, in batched_nms
dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/utils/misc.py", line 262, in new_func
output = old_func(*args, **kwargs)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/ops/nms.py", line 113, in nms
inds = NMSop.apply(boxes, scores, iou_threshold, offset)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/ops/nms.py", line 17, in forward
inds = ext_module.nms(
RuntimeError: CUDA error: no kernel image is available for execution on the device

Following the mmcv documentation, I rebuilt mmcv with the same CUDA version.
Link: https://mmcv.readthedocs.io/en/latest/trouble_shooting.html#:~:text=%E2%80%9CRuntimeError%3A%20nms%20is%20not%20compiled,folder%20before%20re%2Dcompile%20mmcv.

My environment seems to be set up correctly. The environment details are as follows:

Python: 3.8.8 (default, Apr 13 2021, 19:58:26) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.2, V10.2.89
GPU 0: GeForce GTX 1080 Ti
GCC: gcc (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010
PyTorch: 1.8.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) oneAPI Math Kernel Library Version 2021.2-Product Build 20210312 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.9.0
OpenCV: 4.5.2
MMCV: 1.0.5
MMDetection: ('2.3.0',)
MMDetection Compiler: GCC 5.5
MMDetection CUDA Compiler: 10.2
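One way to narrow down a "no kernel image is available" error is to compare the GPU's compute capability with the architectures a binary was compiled for. The sketch below is only an illustration: the helper names are hypothetical, it handles two-digit `sm_XY` targets only, and it checks the PyTorch flags listed above rather than the mmcv extension itself. It assumes the GTX 1080 Ti has compute capability 6.1 and that a binary built for `sm_XY` runs on devices with the same major version and an equal or higher minor version.

```python
import re


def sm_targets(nvcc_flags):
    """Extract (major, minor) pairs for every code=sm_XY target in a flag string."""
    return {(int(n) // 10, int(n) % 10)
            for n in re.findall(r"code=sm_(\d+)", nvcc_flags)}


def has_binary_kernel(capability, nvcc_flags):
    """True if a compiled target has the device's major version and minor <= the device's."""
    major, minor = capability
    return any(t_major == major and t_minor <= minor
               for t_major, t_minor in sm_targets(nvcc_flags))


# NVCC architecture flags copied from the PyTorch build details above.
PYTORCH_FLAGS = (
    "-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;"
    "-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;"
    "-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;"
    "-gencode;arch=compute_37,code=compute_37"
)

GTX_1080_TI = (6, 1)  # assumed compute capability of the GPU in this report

print(has_binary_kernel(GTX_1080_TI, PYTORCH_FLAGS))
```

Since the PyTorch build itself covers `sm_61`, the missing kernel may be in the compiled mmcv ops instead (the traceback ends in `ext_module.nms`). If mmcv was compiled from source, the architectures it targeted, for example through the `TORCH_CUDA_ARCH_LIST` environment variable, would be worth checking; this is a guess, not a confirmed diagnosis.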

Finally, I tried PyTorch 1.5.0 with mmcv-full 1.0.5, which produced the following error log:
Traceback (most recent call last):
File "mmdet/utils/collect_env.py", line 67, in <module>
for name, val in collect_env().items():
File "mmdet/utils/collect_env.py", line 60, in collect_env
from mmcv.ops import get_compiler_version, get_compiling_cuda_version
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 1, in <module>
from .bbox import bbox_overlaps
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/ops/bbox.py", line 3, in <module>
ext_module = ext_loader.load_ext('_ext', ['bbox_overlaps'])
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 10, in load_ext
ext = importlib.import_module('mmcv.' + name)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: /home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/_ext.cpython-38-x86_64-linux-gnu.so: undefined symbol: _Z13__THCudaCheck9cudaErrorPKci

All mmcv modules were installed for the matching PyTorch and CUDA versions using the command:
pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
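An undefined-symbol error on import often means the prebuilt `_ext` library was compiled against a different PyTorch than the one installed, so it can help to generate the index URL from the versions actually in use rather than typing it by hand. The following is a minimal sketch that fills in the template above; the function names are hypothetical, and the directory naming (`cu102`, `torch1.8.0`) is an assumption about how the placeholders are meant to be filled.

```python
def mmcv_index_url(cuda_version, torch_version):
    """Fill the {cu_version}/{torch_version} template, e.g. '10.2' -> 'cu102'."""
    cu = "cu" + cuda_version.replace(".", "")
    return f"https://download.openmmlab.com/mmcv/dist/{cu}/torch{torch_version}/index.html"


def pip_command(mmcv_version, cuda_version, torch_version):
    """Build the full pip command for a given version triple."""
    url = mmcv_index_url(cuda_version, torch_version)
    return f"pip install mmcv-full=={mmcv_version} -f {url}"


# Versions taken from the environment report in this issue.
print(pip_command("1.0.5", "10.2", "1.8.0"))
```

Whether a prebuilt wheel exists for a particular combination is not something this sketch can tell; if pip finds no wheel at that index it may fall back to building from source.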

Help would be greatly appreciated.
Many thanks in advance!

TypeError: init_weights() missing 1 required positional argument: 'pretrained'

Traceback (most recent call last):
File "tools/train.py", line 188, in <module>
main()
File "tools/train.py", line 162, in main
model.init_weights()
File "/home/ubuntu/.conda/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/base_module.py", line 55, in init_weights
m.init_weights()
TypeError: init_weights() missing 1 required positional argument: 'pretrained'
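The traceback shows a parent module calling `m.init_weights()` with no arguments on a child whose `init_weights` requires `pretrained`. The sketch below reproduces that mismatch without any dependencies; all class names are hypothetical stand-ins, not classes from mmcv or this repository. It also shows one possible workaround, giving `pretrained` a default value. Installing the mmcv version the repository was developed against may be the cleaner fix, but the source does not confirm either approach.

```python
class Parent:
    """Stand-in for a container that initializes children without arguments."""

    def __init__(self, children):
        self.children = children

    def init_weights(self):
        for m in self.children:
            m.init_weights()  # no 'pretrained' is passed here


class OldStyleHead:
    """Child with the older signature: 'pretrained' is required."""

    def init_weights(self, pretrained):
        self.pretrained = pretrained


class PatchedHead:
    """Same child with a default, so the no-argument call succeeds."""

    def init_weights(self, pretrained=None):
        self.pretrained = pretrained


def try_init(head):
    """Return 'ok' or the TypeError message raised during initialization."""
    try:
        Parent([head]).init_weights()
        return "ok"
    except TypeError as exc:
        return str(exc)


print(try_init(OldStyleHead()))
print(try_init(PatchedHead()))
```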

The loss suddenly boom

Hi @zhanggang001, I'm still a beginner and have run into a problem that I don't know how to fix.
Without changing the RefineMask config (except DataRoot and num_classes), the loss suddenly explodes in epoch 1 or epoch 2:

2021-05-25 09:52:39,591 - mmdet - INFO - Epoch [1][550/1445] lr: 2.000e-02, eta: 3:27:36, time: 0.354, data_time: 0.002, memory: 5616, loss_rpn_cls: 0.1940, loss_rpn_bbox: 0.1124, loss_cls: 0.6903, acc: 89.3125, loss_bbox: 0.7711, loss_instance: 1.7145, loss_semantic: 1.1350, loss: 4.6172, grad_norm: 49.7330
2021-05-25 09:52:58,284 - mmdet - INFO - Epoch [1][600/1445] lr: 2.000e-02, eta: 3:27:43, time: 0.374, data_time: 0.003, memory: 5616, loss_rpn_cls: 0.1819, loss_rpn_bbox: 0.1232, loss_cls: 0.7582, acc: 88.1094, loss_bbox: 0.8542, loss_instance: 1.6824, loss_semantic: 0.7272, loss: 4.3272, grad_norm: 6.2210
2021-05-25 09:53:13,881 - mmdet - INFO - Epoch [1][650/1445] lr: 2.000e-02, eta: 3:25:04, time: 0.312, data_time: 0.003, memory: 5616, loss_rpn_cls: 10.1637, loss_rpn_bbox: 6.9990, loss_cls: 34.9405, acc: 91.4102, loss_bbox: 13.7452, loss_instance: 234.7180, loss_semantic: 7664.2850, loss: 7964.8515, grad_norm: 340497.7997
2021-05-25 09:53:27,481 - mmdet - INFO - Epoch [1][700/1445] lr: 2.000e-02, eta: 3:21:08, time: 0.272, data_time: 0.003, memory: 5616, loss_rpn_cls: 35289.9099, loss_rpn_bbox: 44193.1067, loss_cls: 875153.7309, acc: 94.6094, loss_bbox: 1115768.8280, loss_instance: 22061.8204, loss_semantic: 499100.5092, loss: 2591567.9840, grad_norm: 112257859.0582
2021-05-25 09:53:41,763 - mmdet - INFO - Epoch [1][750/1445] lr: 2.000e-02, eta: 3:18:13, time: 0.286, data_time: 0.003, memory: 5616, loss_rpn_cls: 20893164722860.7344, loss_rpn_bbox: 7885642268278.1553, loss_cls: 1349368483407166.5000, acc: 86.1094, loss_bbox: 627283726521955.8750, loss_instance: 4757506602120.5703, loss_semantic: 3351921805304.0806, loss: 2013540377899481.0000, grad_norm: 117931859697361120.0000
2021-05-25 09:53:56,211 - mmdet - INFO - Epoch [1][800/1445] lr: 2.000e-02, eta: 3:15:45, time: 0.289, data_time: 0.003, memory: 5616, loss_rpn_cls: 52192068187424022801154048.0000, loss_rpn_bbox: 61156605795124256318685184.0000, loss_cls: 12385031694969291828216463360.0000, acc: 27.7930, loss_bbox: 2810178410764419893789982720.0000, loss_instance: 320446566729177349750784.0000, loss_semantic: 7193256245805316776656896.0000, loss: 15316072270766823574903717888.0000, grad_norm: inf

Thanks for reading. I look forward to your answer :)
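To pin down where a run like this starts to diverge, the training log can be scanned for the first iteration whose total loss jumps sharply or becomes non-finite. This is a minimal sketch: the function name and the factor-of-10 threshold are arbitrary choices for illustration, and the sample lines are shortened copies of the log above that keep only the fields the parser reads.

```python
import math
import re

# Matches the epoch/iteration header and the total loss (not loss_cls etc.).
LINE_RE = re.compile(r"Epoch \[(\d+)\]\[(\d+)/\d+\].*?, loss: ([0-9.]+|inf|nan)")


def first_divergence(lines, factor=10.0):
    """Return (epoch, iteration) of the first loss that is non-finite or
    more than `factor` times the previous logged loss, else None."""
    prev = None
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        epoch, iteration = int(match.group(1)), int(match.group(2))
        loss = float(match.group(3))
        if prev is not None and (not math.isfinite(loss) or loss > factor * prev):
            return epoch, iteration
        prev = loss
    return None


# Shortened copies of the log lines above.
LOG = [
    "Epoch [1][550/1445] lr: 2.000e-02, loss: 4.6172, grad_norm: 49.7330",
    "Epoch [1][600/1445] lr: 2.000e-02, loss: 4.3272, grad_norm: 6.2210",
    "Epoch [1][650/1445] lr: 2.000e-02, loss: 7964.8515, grad_norm: 340497.7997",
]

print(first_divergence(LOG))
```

In this log the jump happens between iterations 600 and 650 while the learning rate is still 2.000e-02, which is consistent with the learning rate being too high for the batch size in use, although the log alone does not prove that.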

loss nan in Lvis and coco

I am training the model on 8 GPUs with the lr set to 0.02, following your config, but the loss becomes NaN.
When I set the lr to 0.0025, as you suggested in another issue, the loss is normal.
Could you give me some help? Thanks.

The adjustment of learning rate

Thanks for the nice work. I have a question. Your default setup uses 8 GPUs with a learning rate of 0.02. If I move the experiment from multiple GPUs to a single GPU, does the learning rate need to be adjusted? I have tried learning rates of 0.0025 and 0.005, but the results were not satisfactory. Do you have any suggestions for adjusting the learning rate when switching from multiple GPUs to a single GPU? Looking forward to your reply.
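A common heuristic for this situation is the linear scaling rule: scale the learning rate in proportion to the total batch size. The sketch below applies it to the numbers mentioned in these issues (0.02 for 8 GPUs with 2 samples per GPU). The function name is hypothetical, and the rule is a rule of thumb, not the authors' confirmed recommendation.

```python
def scaled_lr(base_lr, base_batch_size, num_gpus, samples_per_gpu):
    """Linear scaling rule: lr grows in proportion to the total batch size."""
    batch_size = num_gpus * samples_per_gpu
    return base_lr * batch_size / base_batch_size


# Reference setting from the issues: lr 0.02 at batch size 16 (8 GPUs x 2).
BASE_LR = 0.02
BASE_BATCH = 8 * 2

for gpus, per_gpu in [(8, 2), (1, 2), (1, 4)]:
    print(gpus, per_gpu, round(scaled_lr(BASE_LR, BASE_BATCH, gpus, per_gpu), 6))
```

With one GPU and 2 samples per GPU this gives 0.0025, and with 4 samples per GPU it gives 0.005, the two values tried above. A smaller batch can still hurt accuracy even with a scaled learning rate, so a longer schedule or a longer warmup may be worth trying as well.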

performance worse than Mask R-CNN

Hello,

First of all thank you for sharing your work!
I tried to train a model on my own dataset (wildlife animal data) with different hyperparameters, but the performance does not reach that of Mask R-CNN (0.66 AP with RefineMask versus 0.70 AP with Mask R-CNN).
I trained it on a single GPU.
In another issue you wrote that this isn't possible. Could the smaller batch size be the reason for the worse performance?

Thank you in advance!

About the samples_per_gpu and learning rate of Cityscapes dataset

I see that samples_per_gpu for Cityscapes has been changed to 2, but the learning rate is still set to 0.01. I understand the setting for the COCO dataset: the batch size is 16 (8 GPUs * 2 samples_per_gpu), which corresponds to a learning rate of 0.02. But when samples_per_gpu for Cityscapes is 2, shouldn't the learning rate change as well?

how to apply refine mask in cascade mask rcnn / HTC?

Hello!
Thanks for your beautiful work.
How can I apply RefineMask in Cascade Mask R-CNN / HTC?
Since HTC has three stages, should I apply RefineMask in the last stage only or in all three stages?
Thanks for your reply~
