zhanggang001 / refinemask
RefineMask: Towards High-Quality Instance Segmentation with Fine-Grained Features (CVPR 2021)
License: Apache License 2.0
In your paper you don't compare against CascadePSP. I was wondering if you could evaluate your method vs theirs? Thanks!
I am applying RefineMask to building extraction, but there is a problem when visualizing the test results. Many predicted boxes overlap, so I debugged the code and found that no NMS operation seems to be applied to the bounding boxes. In addition, some detections come without masks, and I still cannot find an explanation.
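If overlapping boxes survive into the final output, one possible workaround is to apply NMS again as a post-processing step. The sketch below is a plain-Python greedy NMS, not code from the RefineMask repository; `iou` and `greedy_nms` are illustrative names, and the 0.5 threshold is an assumption to tune. It is also worth checking that `test_cfg.rcnn.nms` is present in the config being used.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thr=0.5):
    """Return indices of the boxes kept by greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box (toy data).
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, iou_thr=0.5)
```

The returned indices can also be used to filter the matching masks so that boxes and masks stay aligned.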
Thanks for your insightful work. I tried adding the RefineMask head to Cascade R-CNN and HTC, and I have two questions:
In the inference process of Cascade R-CNN or HTC, the mask prediction is the average of the masks from each stage (3 stages in Cascade). So if I add the RefineMask head only in the last stage, the mask predictions of the 3 cascade stages will have sizes [28, 28, 112], which cannot be averaged directly. What should I do to obtain mask results of the same size in all stages: upsample the 28 to 112, downsample the 112 to 28, or just use the mask results of the last cascade stage?
Both HTC and your RefineMask head contain a semantic head. The most important difference is the input feature level: HTC uses the penultimate-level feature as input, whereas RefineMask uses the first one. Is it possible to unify these two semantic heads?
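To make the first option concrete (upsampling the smaller masks before averaging), here is a minimal plain-Python sketch. It is only an illustration of one of the choices listed in the question, not the author's recommendation. `upsample_nearest` and `average_masks` are hypothetical helpers, nearest-neighbour resizing stands in for the bilinear interpolation a real implementation would more likely use, and toy sizes 2, 2 and 8 stand in for 28, 28 and 112.

```python
def upsample_nearest(mask, factor):
    """Upsample a 2D list of probabilities by an integer factor (nearest neighbour)."""
    out = []
    for row in mask:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def average_masks(masks, target_size):
    """Resize each square mask to target_size x target_size, then average element-wise."""
    resized = []
    for m in masks:
        factor, rem = divmod(target_size, len(m))
        if rem:
            raise ValueError("target_size must be a multiple of each mask size")
        resized.append(upsample_nearest(m, factor))
    n = len(resized)
    return [[sum(m[r][c] for m in resized) / n for c in range(target_size)]
            for r in range(target_size)]


# Toy stand-ins for the three cascade stages (2x2, 2x2 and 8x8).
stage1 = [[0.0, 1.0], [1.0, 0.0]]
stage2 = [[1.0, 1.0], [1.0, 0.0]]
stage3 = [[0.5] * 8 for _ in range(8)]
merged = average_masks([stage1, stage2, stage3], target_size=8)
```

Averaging at the largest resolution keeps the fine detail of the refined mask, at the cost of blending it with coarser predictions; whether that helps accuracy would need to be measured.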
Happy Spring Festival!
Is mmcv>=2.0.0 and the newest mmdet3.0 compatible with this code?
Hello,
I would like to train refinemask with resnet18 and I'm trying it but I'm getting the below error.
Please help me to fix this issue. Thank you in advance!
File "tools/train.py", line 165, in
main()
File "tools/train.py", line 161, in main
meta=meta)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/apis/train.py", line 150, in train_detector
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
**kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 246, in train_step
losses = self(**data)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
return old_func(*args, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 180, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/two_stage.py", line 142, in forward_train
x = self.extract_feat(img)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/detectors/two_stage.py", line 84, in extract_feat
x = self.neck(x)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
return old_func(*args, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/necks/fpn.py", line 172, in forward
for i, lateral_conv in enumerate(self.lateral_convs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmdet/models/necks/fpn.py", line 172, in
for i, lateral_conv in enumerate(self.lateral_convs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/cnn/bricks/conv_module.py", line 185, in forward
x = self.conv(x)
File "/home/ec2-user/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 419, in forward
return self._conv_forward(input, self.weight)
File "/home/Anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 416, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [256, 256, 1, 1], expected input[1, 64, 200, 272] to have 256 channels, but got 64 channels instead
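The shapes in the error message (`input[1, 64, 200, 272]` against a `[256, 256, 1, 1]` weight) are consistent with the FPN neck still being configured for ResNet-50/101 channel widths while the backbone is ResNet-18, whose four stages output 64, 128, 256 and 512 channels. A minimal sketch of a matching config fragment follows; `backbone_and_neck` is a hypothetical helper, and the remaining keys follow the ResNet-101 config style quoted elsewhere in these issues, so treat them as assumptions.

```python
# Output channels of the four ResNet stages.
# BasicBlock variants (18/34) are 4x narrower than Bottleneck variants (50/101/152).
RESNET_STAGE_CHANNELS = {
    18: [64, 128, 256, 512],
    34: [64, 128, 256, 512],
    50: [256, 512, 1024, 2048],
    101: [256, 512, 1024, 2048],
    152: [256, 512, 1024, 2048],
}


def backbone_and_neck(depth):
    """Build matching `backbone` and `neck` config dicts for a given ResNet depth."""
    return dict(
        pretrained=f'torchvision://resnet{depth}',
        backbone=dict(
            type='ResNet',
            depth=depth,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            frozen_stages=1,
            norm_cfg=dict(type='BN', requires_grad=True),
            norm_eval=True,
            style='pytorch'),
        neck=dict(
            type='FPN',
            # Must match the backbone stage widths, otherwise the lateral
            # 1x1 convs receive the wrong number of input channels.
            in_channels=RESNET_STAGE_CHANNELS[depth],
            out_channels=256,
            num_outs=5),
    )


cfg = backbone_and_neck(18)
```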
Hello,
Thank you for answering all of my previous questions.
I have successfully run training and testing with JSON files. My question is: how can I run inference without providing JSON files, just by loading the model and its weights and getting predictions? It would also be helpful to get more insight into RefineMask.
Thank you in advance.
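For reference, MMDetection 2.x ships a high-level API for this kind of annotation-free inference. The sketch below assumes that the RefineMask fork exposes the same `mmdet.apis.init_detector` and `inference_detector` functions as upstream MMDetection 2.x; the wrapper name `run_inference` and all paths passed to it are hypothetical.

```python
def run_inference(config_path, checkpoint_path, image_path, device='cuda:0'):
    """Load a trained detector and run it on one image; no annotation JSON is needed."""
    # Imported lazily so this module can be imported without mmdet installed.
    from mmdet.apis import init_detector, inference_detector

    # Build the model from the config and load the trained weights.
    model = init_detector(config_path, checkpoint_path, device=device)
    # For mask models in MMDetection 2.x the result is typically a
    # (bbox_results, segm_results) tuple with one entry per class.
    return inference_detector(model, image_path)
```

A call would look like `run_inference('configs/refinemask/coco/r50-refinemask-1x.py', 'work_dirs/r50-refinemask-1x/latest.pth', 'demo.jpg')`, where `demo.jpg` is a placeholder image path.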
Hi, @zhanggang001 ,
Thanks for sharing your impressive work.
I used your code to train on my own dataset with 4 GPUs. Since your code only supports one image per GPU, I guess that is why the loss decreases so slowly. Could you tell me when you might release code that supports multiple images per GPU?
Many thanks.
Excuse me,
Did the author run the code on a single GPU? How many days did it take?
Hi, thanks for the nice work.
After seeing this issue, SysCV/transfiner#11 (comment), I have two questions.
I would like to know how to get the FLOPs of the model.
When using get_flops.py in tools to get the FLOPs, I get this error:
File "/mmdet/models/detectors/two_stage.py", line 88, in forward_dummy
roi_outs = self.roi_head.forward_dummy(x, proposals)
TypeError: forward_dummy() missing 1 required positional argument: 'roi_labels'
This is def forward_dummy in two_stage.py:
def forward_dummy(self, img):
    """Used for computing network flops.

    See `mmdetection/tools/analysis_tools/get_flops.py`
    """
    outs = ()
    # backbone
    x = self.extract_feat(img)
    # rpn
    if self.with_rpn:
        rpn_outs = self.rpn_head(x)
        outs = outs + (rpn_outs, )
    proposals = torch.randn(1000, 4).to(img.device)
    # roi_head
    # roi_labels = torch.zeros(1000)
    roi_outs = self.roi_head.forward_dummy(x, proposals)  # roi_labels
    outs = outs + (roi_outs, )
    return outs
RefineMask adds roi_labels in _mask_forward in refine_roi_head.py during training.
I tried defining roi_labels as all zeros:
roi_labels = torch.zeros(1000)
roi_outs = self.roi_head.forward_dummy(x, proposals, roi_labels)
but I got this error:
File "/mmdet/mmdet/models/roi_heads/mask_heads/refine_mask_head.py", line 561, in forward
instance_logits = self.stage_instance_logits[idx](instance_feats)[torch.arange(len(rois)), roi_labels][:, None]
IndexError: tensors used as indices must be long, byte or bool tensors
When I change the type to bool, I get:
File "/mmdet/mmdet/models/roi_heads/mask_heads/refine_mask_head.py", line 561, in forward
instance_logits = self.stage_instance_logits[idx](instance_feats)[torch.arange(len(rois)), roi_labels][:, None]
IndexError: The shape of the mask [1000] at index 0 does not match the shape of the indexed tensor [100, 1, 14, 14] at index 1
How should this calculation be implemented? I hope you can help me!
I don't understand the expected dimension and type of roi_labels.
# this is forward_dummy in mmdet/models/roi_heads/standard_roi_head.py
def forward_dummy(self, x, proposals):
    """Dummy forward function."""
    # bbox head
    outs = ()
    rois = bbox2roi([proposals])
    if self.with_bbox:
        bbox_results = self._bbox_forward(x, rois)
        outs = outs + (bbox_results['cls_score'],
                       bbox_results['bbox_pred'])
    # mask head
    if self.with_mask:
        mask_rois = rois[:100]
        mask_results = self._mask_forward(x, mask_rois)
        outs = outs + (mask_results['mask_pred'], )
    return outs
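The two IndexErrors above point at two separate requirements for `roi_labels`: it has to hold integer class indices (a long tensor in PyTorch), and it needs one entry per mask RoI. Because the stock `forward_dummy` slices `mask_rois = rois[:100]`, a plausible dummy value would be `torch.zeros(len(mask_rois), dtype=torch.long)` rather than a float tensor of length 1000; this assumes RefineMask's RoI head slices the RoIs the same way. The plain-Python sketch below mirrors the indexing expression `logits[torch.arange(len(rois)), roi_labels][:, None]` with nested lists; `select_class_logits` is a hypothetical name used only for illustration.

```python
def select_class_logits(logits, roi_labels):
    """Pick, for each RoI i, the channel roi_labels[i] from logits[i].

    `logits` is nested as [num_rois][num_classes], each entry being a mask map,
    and `roi_labels` must hold one integer class index per RoI.
    """
    if len(roi_labels) != len(logits):
        raise ValueError(f"need {len(logits)} labels, got {len(roi_labels)}")
    out = []
    for per_roi, label in zip(logits, roi_labels):
        # Floats and bools are rejected, matching the "must be long" requirement.
        if isinstance(label, bool) or not isinstance(label, int):
            raise TypeError("labels must be integer class indices")
        out.append([per_roi[label]])  # keep a singleton channel axis
    return out


# 3 RoIs, 2 classes, 1x1 mask maps; value i * 10 + c marks RoI i, class c.
logits = [[[[i * 10 + c]] for c in range(2)] for i in range(3)]
picked = select_class_logits(logits, [0, 1, 0])
```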
When I use this command to test on LVIS,
./scripts/dist_test.sh ./configs/refinemask/lvis/r50-refinemask-1x.py ../VOS_Model/mmdet/r50-lvis-1x.pth 8 --eval segm
the following error occurred:
Traceback (most recent call last):
File "/mnt/SSD/RefineMask/tools/test.py", line 152, in <module>
main()
File "/mnt/SSD/RefineMask/tools/test.py", line 137, in main
outputs = multi_gpu_test(model, data_loader, args.tmpdir, args.gpu_collect)
File "/mnt/SSD/RefineMask/mmdet/apis/test.py", line 92, in multi_gpu_test
for i, data in enumerate(data_loader):
File "/home/anaconda3/envs/VOS/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 291, in __iter__
return _MultiProcessingDataLoaderIter(self)
File "/home/anaconda3/envs/VOS/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 737, in __init__
w.start()
File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/process.py", line 105, in start
self._popen = self._Popen(self)
File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/context.py", line 284, in _Popen
Traceback (most recent call last):
File "/mnt/SSD/RefineMask/tools/test.py", line 152, in <module>
return Popen(process_obj)
File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/home/anaconda3/envs/VOS/lib/python3.6/multiprocessing/reduction.py", line 60, in dump
main()
ForkingPickler(file, protocol).dump(obj)
File "/mnt/SSD/RefineMask/tools/test.py", line 137, in main
TypeError: can't pickle _thread.RLock objects
Have you ever encountered this problem?
Why are your datasets in PNG format and different from mine?
The first data set of COCO is 4134,31817, but mine is 9,25.
Can you give me a download path?
Sorry for asking so many questions.
I look forward to your response.
Hello,
I'm able to run RefineMask successfully using mmcv-full 1.3.12 and mmdet 2.6.0 on Windows. Training runs fine, but I get the error below during evaluation after 12 epochs. If possible, could you help me with this and let me know what I'm doing wrong? Thank you in advance.
File "tools/test.py", line 241, in
main()
File "tools/test.py", line 206, in main
outputs = single_gpu_test(model, data_loader, args.show, args.show_dir,
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\apis\test.py", line 28, in single_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmcv\parallel\data_parallel.py", line 42, in forward
return super().forward(*inputs, **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\torch\nn\parallel\data_parallel.py", line 166, in forward
return self.module(*inputs[0], **kwargs[0])
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmcv\runner\fp16_utils.py", line 98, in new_func
return old_func(*args, **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\models\detectors\base.py", line 174, in forward
return self.forward_test(img, img_metas, **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\models\detectors\base.py", line 147, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\models\detectors\two_stage.py", line 182, in simple_test
return self.roi_head.simple_test(
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\models\roi_heads\standard_roi_head.py", line 264, in simple_test
segm_results = self.simple_test_mask(
File "C:\Users\anaconda3\envs\mmcv\lib\site-packages\mmdet\models\roi_heads\refine_roi_head.py", line 92, in simple_test_mask
if det_bboxes.shape[0] == 0:
AttributeError: 'list' object has no attribute 'shape'
I can't find this annotation file on the official LVIS website. Was this annotation generated by some method?
Nice work! I have a question about SimpleRefineMaskHead in refine_mask_head.py.
It is not mentioned in the paper or the README. How does its performance differ from the normal model?
What changes were made in the simple model?
Looking forward to your reply!
Hello,
I'm interested in the RefineMask method and would like to use it to cut detected objects out of pictures.
I could train with this repository, but unfortunately I cannot deploy the trained model to another inference engine. I suspect this is because the official implementation is based on a very old MMDetection codebase (v2.3.0).
I tried tools/pytorch2onnx.py, but it failed.
I also tried MMDeploy v0.14.0, but that failed too.
Finally, I tried porting RefineMask to MMDetection v2.28.2 (the latest 2.x version) and converting with MMDeploy v0.14.0. I believe the port is correct, and it works in MMDetection v2.28.2, but converting the model fails.
All of them can run inference in MMDetection, but conversion with MMDeploy fails at torch.jit.trace and torch.jit.script.
I tried with Python 3.7, PyTorch 1.13.1 and CUDA 11.7.
Is there any way to convert a RefineMask pretrained .pth model to an .onnx model or other formats?
Or, if anyone knows, could you point me to an implementation on another training/inference platform?
Hi according to your paper all the highest AP was collected from 3x training which are denoted as
Could we also get the pretrained weights of those models? I would be grateful to have one to test.
Thanks :)
Hi! I am eager to train your model on the Pascal VOC 2012 dataset. Could you please let me know the steps for doing so?
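After training, when I run testing, the visualized result images are still the original images, with no segmentation results at all. What might be causing this?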
Hello,
I'm trying to run the RefineMask repository and have a question. Does it only run with torch 1.5.0 and mmcv-full 1.5.0? I have higher versions of torch (1.8.1+cu102) and mmcv-full (1.3.12), and when I run it I get an error saying that mmcv 1.3.12 is used but incompatible. I tried to install mmcv-full 1.5.0, but it throws an error. Can you guide me on how to proceed?
Thank you in advance.
Good afternoon:
Congratulations on your work.
I checked that it is mandatory to use mmcv==1.0.5. However, new NVIDIA graphics cards (starting in May 2020) need at least CUDA 11.0 in order to use GPU acceleration.
I read several of the resolved issues, and it does not seem that RefineMask can be moved off mmcv==1.0.5. This version supports CUDA 10.2 at most, but no CUDA 11.x version. Do you plan to upgrade the code in the near future so that it can be used on new-generation graphics cards?
Thanks a lot.
Ignacio.
I got the error below at 2022-02-02 05:32:05,118 - mmdet - INFO - Epoch [1][900/13322] while training on my custom COCO-format dataset, which has polygon masks in its annotations. All images have segmentation data in their annotations, so there are no empty gt_masks instances. What should I do to solve this error and continue training?
File "/content/RefineMask/mmdet/models/roi_heads/mask_heads/refine_mask_head.py", line 284, in get_targets
instance_masks = torch.from_numpy(np.stack(instance_masks)).to(device=pos_bboxes.device, dtype=torch.float32)
File "<__array_function__ internals>", line 6, in stack
File "/usr/local/lib/python3.7/dist-packages/numpy/core/shape_base.py", line 423, in stack
raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack
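`np.stack` raises this error when `instance_masks` is an empty list, so one thing worth ruling out is that some image ends up without any usable ground-truth mask after loading. The following is a diagnostic sketch only, not a confirmed cause or a fix: `images_without_valid_masks` is a hypothetical helper, and the criteria (box at least `min_side` pixels wide and tall, a polygon with at least three points, not marked `iscrowd`) are assumptions about what counts as usable.

```python
def images_without_valid_masks(coco, min_side=1.0):
    """Return ids of images that have no usable polygon annotation.

    `coco` is a dict in COCO format (keys 'images' and 'annotations').
    """
    usable = set()
    for ann in coco.get('annotations', []):
        _, _, w, h = ann.get('bbox', (0, 0, 0, 0))
        seg = ann.get('segmentation', [])
        # A polygon needs at least 3 points, i.e. 6 coordinates.
        has_polygon = isinstance(seg, list) and any(len(p) >= 6 for p in seg)
        if w >= min_side and h >= min_side and has_polygon and not ann.get('iscrowd', 0):
            usable.add(ann['image_id'])
    return [img['id'] for img in coco.get('images', []) if img['id'] not in usable]


# Toy annotation file: image 2 has a zero-width box, image 3 has no annotation.
coco = {
    'images': [{'id': 1}, {'id': 2}, {'id': 3}],
    'annotations': [
        {'image_id': 1, 'bbox': [0, 0, 10, 10],
         'segmentation': [[0, 0, 10, 0, 10, 10]]},
        {'image_id': 2, 'bbox': [5, 5, 0, 4],
         'segmentation': [[5, 5, 5, 9, 5, 7]]},
    ],
}
bad = images_without_valid_masks(coco)
```

On a real dataset the dict would come from `json.load` on the annotation file; any ids returned are candidates to inspect or filter out before training.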
Hello, I built a new single-class dataset and trained on it starting from r101-refinemask-2x. The modified config is shown below.
However, after 93 epochs I ran into trouble. On the one hand, the detection results have no segmentation masks; on the other hand, loss_instance is really hard to converge. For example, {"mode": "train", "epoch": 94, "iter": 10, "lr": 3e-05, "time": 0.57583, "data_time": 0.2215, "memory": 5419, "loss_rpn_cls": 0.00174, "loss_rpn_bbox": 0.00334, "loss_cls": 0.01584, "acc": 99.6875, "loss_bbox": 0.03414, "loss_instance": 0.26567, "loss_semantic": 0.00546, "loss": 0.3262, "grad_norm": 2.08605}
I have two questions now:
1. Does RefineMask support instance segmentation of elongated objects, or is there something wrong with my configuration file?
2. Any recommended epochs of training?
Thanks for your time.
model = dict(
type='MaskRCNN',
pretrained='torchvision://resnet101',
backbone=dict(
type='ResNet',
depth=101,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[8],
ratios=[0.5, 1.0, 2.0],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
roi_head=dict(
type='RefineRoIHead',
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=1,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=2.0),
loss_bbox=dict(type='L1Loss', loss_weight=2.0)),
mask_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
mask_head=dict(
type='RefineMaskHead',
num_convs_instance=2,
num_convs_semantic=4,
conv_in_channels_instance=256,
conv_in_channels_semantic=256,
conv_kernel_size_instance=3,
conv_kernel_size_semantic=3,
conv_out_channels_instance=256,
conv_out_channels_semantic=256,
conv_cfg=None,
norm_cfg=None,
dilations=[1, 3, 5],
semantic_out_stride=4,
mask_use_sigmoid=True,
stage_num_classes=[1, 1, 1, 1],
stage_sup_size=[14, 28, 56, 112],
upsample_cfg=dict(type='bilinear', scale_factor=2),
loss_cfg=dict(
type='RefineCrossEntropyLoss',
stage_instance_loss_weight=[0.25, 0.5, 0.75, 1.0],
semantic_loss_weight=1.0,
boundary_width=2,
start_stage=1))))
train_cfg = dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
match_low_quality=True,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=-1,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_across_levels=False,
nms_pre=2000,
nms_post=1000,
max_num=1000,
nms_thr=0.7,
min_bbox_size=0),
rcnn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
match_low_quality=True,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False))
test_cfg = dict(
rpn=dict(
nms_across_levels=False,
nms_pre=1000,
nms_post=1000,
max_num=1000,
nms_thr=0.7,
min_bbox_size=0),
rcnn=dict(
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.5),
max_per_img=100,
mask_thr_binary=0.5))
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile', to_float32=True),
dict(
type='LoadAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='Resize', img_scale=(1024, 1024), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
]
test_pipeline = [
dict(type='LoadImageFromFile', to_float32=True),
dict(
type='MultiScaleFlipAug',
img_scale=(1024, 1024),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]
data_root = '../coco'
data = dict(
samples_per_gpu=1,
workers_per_gpu=2,
train=dict(
type='CocoDataset',
ann_file='annotations/instances_train2017.json',
img_prefix='train2017',
pipeline=[
dict(type='LoadImageFromFile', to_float32=True),
dict(
type='LoadAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='Resize', img_scale=(1024, 1024), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
],
data_root='../coco',
classes=('street', )),
val=dict(
type='CocoDataset',
ann_file='annotations/instances_train2017.json',
img_prefix='train2017',
pipeline=[
dict(type='LoadImageFromFile', to_float32=True),
dict(
type='MultiScaleFlipAug',
img_scale=(1024, 1024),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
data_root='../coco',
classes=('street', )),
test=dict(
type='CocoDataset',
ann_file='annotations/instances_train2017.json',
img_prefix='train2017',
pipeline=[
dict(type='LoadImageFromFile', to_float32=True),
dict(
type='MultiScaleFlipAug',
img_scale=(1024, 1024),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
data_root='../coco',
classes=('street', )))
evaluation = dict(metric=['bbox', 'segm'], classwise=True, interval=12)
optimizer = dict(type='SGD', lr=0.003, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
total_epochs = 1000
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001,
step=[16, 22])
checkpoint_config = dict(interval=1)
log_config = dict(interval=10, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
workflow = [('train', 1)]
gpu_ids = range(0, 1)
work_dir = 'work_dirs/r101-refinemask-2x/street'
load_from = None
resume_from = 'work_dirs/r101-refinemask-2x/street/latest.pth'
classes = ('street', )
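One detail of this config that may be relevant: `lr_config` keeps `step=[16, 22]` from the 2x schedule while `total_epochs` is 1000. Under a step policy with the usual decay factor of 0.1 (assumed here, since `gamma` is not set in the config), the learning rate drops to 1% of its base value at epoch 22 and stays there, which matches the `"lr": 3e-05` in the epoch 94 log line. A small sketch of the arithmetic, ignoring warmup; `step_lr` is an illustrative helper, not MMCV code:

```python
def step_lr(base_lr, epoch, steps, gamma=0.1):
    """Learning rate under a step policy (0-based epoch, warmup ignored).

    The rate is multiplied by `gamma` once for every milestone already reached.
    """
    passed = sum(1 for s in steps if epoch >= s)
    return base_lr * gamma ** passed


base_lr, steps, total_epochs = 0.003, [16, 22], 1000
lr_at_log = step_lr(base_lr, 93, steps)  # epoch 94 in the 1-based log
epochs_at_min_lr = sum(1 for e in range(total_epochs) if e >= steps[-1])
```

With these numbers, 978 of the 1000 epochs would run at the smallest learning rate, so the milestones may need to be moved if a longer schedule is really intended. This is an observation about the schedule, not a confirmed explanation for the missing masks.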
This problem occurs when running the py file in the demo.
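Hello, I would like to render different instances of the same class in different colors in the visualization results, and also change the colors of the bounding boxes and text. Which file are these set in? Thanks.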
First of all, thank you very much for sharing your work. I tested on the COCO dataset, but the test output says that the results for the whole dataset are empty. How can I solve this problem? Thanks.
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 5000/5000, 18.3 task/s, elapsed: 273s, ETA: 0s
writing results to work_dirs/r50-refinemask-1x/result.pkl
Evaluating bbox...
Loading and preparing results...
The testing results of the whole dataset is empty.
load gt json
load pred json
Traceback (most recent call last):
File "/data/user-njf86/ding/RefineMask-main/tools/test.py", line 153, in
main()
File "/data/user-njf86/ding/RefineMask-main/tools/test.py", line 149, in main
dataset.evaluate(outputs, args.eval, classwise=True, **kwargs)
File "/data/user-njf86/ding/RefineMask-main/mmdet/datasets/coco.py", line 556, in evaluate
metric='segm',
File "/data/user-njf86/ding/RefineMask-main/mmdet/datasets/coco.py", line 655, in eval_cocofied_lvis_result
lvis_dt = LVISResults(lvis_gt, result_file)
File "/home/user-njf/anaconda3/envs/mmdet_test/lib/python3.6/site-packages/lvis/results.py", line 44, in init
if 'bbox' in result_anns[0]:
IndexError: list index out of range
Traceback (most recent call last):
File "/home/user-njf/anaconda3/envs/mmdet_test/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/user-njf/anaconda3/envs/mmdet_test/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/user-njf/anaconda3/envs/mmdet_test/lib/python3.6/site-packages/torch/distributed/launch.py", line 260, in
main()
File "/home/user-njf/anaconda3/envs/mmdet_test/lib/python3.6/site-packages/torch/distributed/launch.py", line 256, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/user-njf/anaconda3/envs/mmdet_test/bin/python', '-u', '/data/user-njf86/ding/RefineMask-main/tools/test.py', '--local_rank=0', './configs/refinemask/coco/r50-refinemask-1x.py', 'work_dirs/r50-refinemask-1x/latest.pth', '--launcher', 'none', '--eval', 'bbox', 'segm', '--out', 'work_dirs/r50-refinemask-1x/result.pkl', '--tmpdir', 'work_dirs/r50-refinemask-1x/tmpdir']' returned non-zero exit status 1.
Hi! While training on the coco dataset using the specified file I get the following error log:
train_detector(
File "/home2/kevins99/RefineMask/mmdet/apis/train.py", line 145, in train_detector
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 122, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 31, in train
outputs = self.model.train_step(data_batch, self.optimizer,
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 31, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/home2/kevins99/RefineMask/mmdet/models/detectors/base.py", line 237, in train_step
losses = self(**data)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home2/kevins99/RefineMask/mmdet/core/fp16/decorators.py", line 51, in new_func
return old_func(*args, **kwargs)
File "/home2/kevins99/RefineMask/mmdet/models/detectors/base.py", line 171, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/home2/kevins99/RefineMask/mmdet/models/detectors/two_stage.py", line 150, in forward_train
rpn_losses, proposal_list = self.rpn_head.forward_train(
File "/home2/kevins99/RefineMask/mmdet/models/dense_heads/base_dense_head.py", line 58, in forward_train
proposal_list = self.get_bboxes(*outs, img_metas, cfg=proposal_cfg)
File "/home2/kevins99/RefineMask/mmdet/core/fp16/decorators.py", line 131, in new_func
return old_func(*args, **kwargs)
File "/home2/kevins99/RefineMask/mmdet/models/dense_heads/anchor_head.py", line 572, in get_bboxes
proposals = self._get_bboxes_single(cls_score_list, bbox_pred_list,
File "/home2/kevins99/RefineMask/mmdet/models/dense_heads/rpn_head.py", line 167, in get_bboxes_single
dets, keep = batched_nms(proposals, scores, ids, nms_cfg)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/ops/nms.py", line 243, in batched_nms
dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/utils/misc.py", line 262, in new_func
output = old_func(*args, **kwargs)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/ops/nms.py", line 113, in nms
inds = NMSop.apply(boxes, scores, iou_threshold, offset)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/ops/nms.py", line 17, in forward
inds = ext_module.nms(
RuntimeError: CUDA error: no kernel image is available for execution on the device
Following the mmcv documentation, I have built it again with the same CUDA version.
Link: https://mmcv.readthedocs.io/en/latest/trouble_shooting.html#:~:text=%E2%80%9CRuntimeError%3A%20nms%20is%20not%20compiled,folder%20before%20re%2Dcompile%20mmcv.
My environment seems to be set up correctly. The environment details are as follows:
Python: 3.8.8 (default, Apr 13 2021, 19:58:26) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.2, V10.2.89
GPU 0: GeForce GTX 1080 Ti
GCC: gcc (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010
PyTorch: 1.8.0
PyTorch compiling details: PyTorch built with:
TorchVision: 0.9.0
OpenCV: 4.5.2
MMCV: 1.0.5
MMDetection: ('2.3.0',)
MMDetection Compiler: GCC 5.5
MMDetection CUDA Compiler: 10.2
Finally, I have tried PyTorch 1.5.0 along with mmcv-full 1.0.5 which resulted in the following error log:
Traceback (most recent call last):
File "mmdet/utils/collect_env.py", line 67, in
for name, val in collect_env().items():
File "mmdet/utils/collect_env.py", line 60, in collect_env
from mmcv.ops import get_compiler_version, get_compiling_cuda_version
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/ops/init.py", line 1, in
from .bbox import bbox_overlaps
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/ops/bbox.py", line 3, in
ext_module = ext_loader.load_ext('_ext', ['bbox_overlaps'])
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 10, in load_ext
ext = importlib.import_module('mmcv.' + name)
File "/home2/kevins99/anaconda3/envs/cvit/lib/python3.8/importlib/init.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: /home2/kevins99/anaconda3/envs/cvit/lib/python3.8/site-packages/mmcv/_ext.cpython-38-x86_64-linux-gnu.so: undefined symbol: _Z13__THCudaCheck9cudaErrorPKci
All mmcv modules have been built with the correct PyTorch and cuda versions using the command
pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
Help would be greatly appreciated.
Many thanks in advance!
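An `undefined symbol` ImportError from `mmcv/_ext` usually suggests that the compiled mmcv-full extension was built against a different PyTorch build than the one currently installed. The sketch below simply fills in the index URL template quoted above so that the CUDA and PyTorch versions are spelled out explicitly; `mmcv_index_url` is a hypothetical helper, and whether a prebuilt wheel exists for a given combination is not guaranteed.

```python
def mmcv_index_url(cuda_version, torch_version):
    """Fill in the mmcv-full wheel index URL for a CUDA / PyTorch pair.

    `cuda_version` is given like '10.2' and `torch_version` like '1.5.0'.
    """
    cu = 'cu' + cuda_version.replace('.', '')
    return ('https://download.openmmlab.com/mmcv/dist/'
            f'{cu}/torch{torch_version}/index.html')


url = mmcv_index_url('10.2', '1.5.0')
command = f'pip install mmcv-full==1.0.5 -f {url}'
```

Comparing the versions in this URL against the `PyTorch` and `NVCC` lines of the environment report is a quick way to spot a mismatch.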
Traceback (most recent call last):
File "tools/train.py", line 188, in
main()
File "tools/train.py", line 162, in main
model.init_weights()
File "/home/ubuntu/.conda/envs/open-mmlab/lib/python3.8/site-packages/mmcv/runner/base_module.py", line 55, in init_weights
m.init_weights()
TypeError: init_weights() missing 1 required positional argument: 'pretrained'
Hi, @zhanggang001, I'm still a beginner and I have a problem that I don't know how to fix.
Without changing the RefineMask config (except the data root and num_classes), the loss suddenly explodes in epoch 1 or epoch 2:
2021-05-25 09:52:39,591 - mmdet - INFO - Epoch [1][550/1445] lr: 2.000e-02, eta: 3:27:36, time: 0.354, data_time: 0.002, memory: 5616, loss_rpn_cls: 0.1940, loss_rpn_bbox: 0.1124, loss_cls: 0.6903, acc: 89.3125, loss_bbox: 0.7711, loss_instance: 1.7145, loss_semantic: 1.1350, loss: 4.6172, grad_norm: 49.7330
2021-05-25 09:52:58,284 - mmdet - INFO - Epoch [1][600/1445] lr: 2.000e-02, eta: 3:27:43, time: 0.374, data_time: 0.003, memory: 5616, loss_rpn_cls: 0.1819, loss_rpn_bbox: 0.1232, loss_cls: 0.7582, acc: 88.1094, loss_bbox: 0.8542, loss_instance: 1.6824, loss_semantic: 0.7272, loss: 4.3272, grad_norm: 6.2210
2021-05-25 09:53:13,881 - mmdet - INFO - Epoch [1][650/1445] lr: 2.000e-02, eta: 3:25:04, time: 0.312, data_time: 0.003, memory: 5616, loss_rpn_cls: 10.1637, loss_rpn_bbox: 6.9990, loss_cls: 34.9405, acc: 91.4102, loss_bbox: 13.7452, loss_instance: 234.7180, loss_semantic: 7664.2850, loss: 7964.8515, grad_norm: 340497.7997
2021-05-25 09:53:27,481 - mmdet - INFO - Epoch [1][700/1445] lr: 2.000e-02, eta: 3:21:08, time: 0.272, data_time: 0.003, memory: 5616, loss_rpn_cls: 35289.9099, loss_rpn_bbox: 44193.1067, loss_cls: 875153.7309, acc: 94.6094, loss_bbox: 1115768.8280, loss_instance: 22061.8204, loss_semantic: 499100.5092, loss: 2591567.9840, grad_norm: 112257859.0582
2021-05-25 09:53:41,763 - mmdet - INFO - Epoch [1][750/1445] lr: 2.000e-02, eta: 3:18:13, time: 0.286, data_time: 0.003, memory: 5616, loss_rpn_cls: 20893164722860.7344, loss_rpn_bbox: 7885642268278.1553, loss_cls: 1349368483407166.5000, acc: 86.1094, loss_bbox: 627283726521955.8750, loss_instance: 4757506602120.5703, loss_semantic: 3351921805304.0806, loss: 2013540377899481.0000, grad_norm: 117931859697361120.0000
2021-05-25 09:53:56,211 - mmdet - INFO - Epoch [1][800/1445] lr: 2.000e-02, eta: 3:15:45, time: 0.289, data_time: 0.003, memory: 5616, loss_rpn_cls: 52192068187424022801154048.0000, loss_rpn_bbox: 61156605795124256318685184.0000, loss_cls: 12385031694969291828216463360.0000, acc: 27.7930, loss_bbox: 2810178410764419893789982720.0000, loss_instance: 320446566729177349750784.0000, loss_semantic: 7193256245805316776656896.0000, loss: 15316072270766823574903717888.0000, grad_norm: inf
Thanks for reading. I look forward to your answer :)
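To pinpoint where a run like the one above starts to diverge, the log can be scanned for the first iteration whose `grad_norm` is abnormally large or whose loss is no longer finite. This is a minimal sketch that assumes the log line format shown above; the function name `first_divergence` and the threshold of 1e3 are arbitrary illustrative choices.

```python
import math
import re

# Matches the epoch, the iteration, the total loss and the gradient norm.
LOG_PATTERN = re.compile(
    r"Epoch \[(\d+)\]\[(\d+)/\d+\].*, loss: ([^,]+), grad_norm: (\S+)"
)


def first_divergence(lines, max_grad_norm=1e3):
    """Return (epoch, iteration) of the first suspicious log line, or None."""
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # not a training log line
        epoch, iteration = int(match.group(1)), int(match.group(2))
        loss, grad_norm = float(match.group(3)), float(match.group(4))
        if (
            not math.isfinite(loss)
            or not math.isfinite(grad_norm)
            or grad_norm > max_grad_norm
        ):
            return epoch, iteration
    return None
```

Applied to the log above, such a scan would flag the line whose `grad_norm` first jumps by several orders of magnitude, which narrows down which batch to inspect.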
While reading the RefineMask test code, I found that it does not support inference with a batch size greater than 1. Is that correct?
I am training the model on 8 GPUs with the lr set to 0.02, following your config, but the loss becomes NaN.
When I set the lr to 0.0025, as you suggested in another issue, the loss is normal.
Could you give me some help? Thanks.
Thanks for the nice work. I have a question. Your default setup uses 8 GPUs with a learning rate of 0.02. If I switch from multiple GPUs to a single GPU, does the learning rate need to be adjusted? I have tried learning rates of 0.0025 and 0.005, but the results are not satisfactory. Do you have any suggestions for adjusting the learning rate when switching from multiple GPUs to a single GPU? Looking forward to your reply.
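A common heuristic for this situation is the linear scaling rule: scale the learning rate in proportion to the total batch size. The sketch below assumes the reference setup is 8 GPUs with 2 samples per GPU (total batch size 16) at a learning rate of 0.02; the function name `scale_lr` is illustrative, and the rule is a starting point rather than a recommendation confirmed by the authors.

```python
def scale_lr(base_lr, base_batch_size, num_gpus, samples_per_gpu):
    """Linearly scale a learning rate to a new total batch size."""
    total_batch_size = num_gpus * samples_per_gpu
    return base_lr * total_batch_size / base_batch_size


if __name__ == "__main__":
    # Assumed reference: lr 0.02 at batch size 16 (8 GPUs x 2 samples).
    print(scale_lr(0.02, 16, num_gpus=1, samples_per_gpu=2))
    print(scale_lr(0.02, 16, num_gpus=4, samples_per_gpu=2))
```

Under these assumptions a single GPU with 2 samples works out to 0.0025, which matches one of the values tried above, so the remaining gap may come from the smaller batch itself rather than from the learning rate.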
Hello,
First of all thank you for sharing your work!
I trained a model on my own dataset (wildlife animal data) with different hyperparameters, but its performance does not reach that of Mask R-CNN (0.66 AP with RefineMask versus 0.70 AP with Mask R-CNN).
I trained it on a single GPU.
In another issue you wrote that this isn't possible. Could the smaller batch size be the reason for the poor performance?
Thank you in advance!
I see that samples_per_gpu for Cityscapes has been changed to 2, but the learning rate is still set to 0.01. I understand the samples_per_gpu setting for the COCO dataset: the batch size is 16 (8 GPUs × 2 samples_per_gpu), which corresponds to a learning rate of 0.02. But when samples_per_gpu for Cityscapes is 2, shouldn't the learning rate change as well?
Hello!
Thanks for your beautiful work.
I have a question: how can I apply RefineMask in Cascade Mask R-CNN / HTC?
Since HTC has three stages, should I apply RefineMask only in the last stage or in all three stages?
Thanks for your reply!
When running tools/train.py, the line
model = build_detector(cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg) raises an error:
KeyError: 'RefineRoIHead is not in the head registry'. What could be the cause?
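A registry error like this can occur when Python imports a different `mmdet` installation than the copy bundled with this repository, so the custom head is never registered. That explanation is an assumption, not a confirmed diagnosis. A minimal sketch for checking which copy of a package would be imported, using only the standard library (`locate_package` is a hypothetical helper name):

```python
import importlib.util


def locate_package(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # the package is not importable at all
    return spec.origin


if __name__ == "__main__":
    # If this path points into site-packages rather than the cloned
    # repository, a different mmdet is shadowing the repository's copy.
    print("mmdet resolves to:", locate_package("mmdet"))
```

If the printed path is not inside the cloned repository, reinstalling the repository's own package in the active environment is one thing to try.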