About loss_ts, loss_corr

This project is no longer maintained. Please see our newer work, Mask Auto-labeler, for more powerful models.

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

Paper | Blog | Demo (Youtube) | Demo (Bilibili)

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision.
Shiyi Lan, Zhiding Yu, Chris Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry Davis, Anima Anandkumar
International Conference on Computer Vision (ICCV) 2021

This repository contains the official PyTorch implementation of DiscoBox: training and evaluation code, plus pretrained models. DiscoBox is a state-of-the-art framework that jointly predicts high-quality instance segmentation and semantic correspondence from box annotations.

We use MMDetection v2.10.0 as the codebase.

All of our models are trained and tested with automatic mixed precision, which uses float16 to speed up computation and reduce GPU memory consumption.
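As a rough illustration of that trade-off, float16 halves per-value storage but has a much narrower range than float32, which is why mixed-precision training is commonly paired with loss scaling. The sketch below uses only Python's standard `struct` module and is not taken from the training code; the helper name `to_float16`, the sample gradient value, and the scale factor of 1024 are assumptions chosen for illustration.

```python
import struct


def to_float16(value: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Storage per value: 2 bytes for float16 versus 4 bytes for float32.
bytes_fp16 = struct.calcsize("<e")
bytes_fp32 = struct.calcsize("<f")

# A tiny gradient (hypothetical value) underflows to zero in float16 ...
tiny_grad = 1e-8
underflowed = to_float16(tiny_grad)

# ... but survives when it is scaled up before the cast and unscaled afterwards.
loss_scale = 1024.0  # arbitrary illustrative scale factor
scaled = to_float16(tiny_grad * loss_scale)
recovered = scaled / loss_scale  # unscaling happens in higher precision

print(bytes_fp16, bytes_fp32)
print(underflowed, recovered)
```

The recovered value is only approximately equal to the original gradient, since float16 keeps far fewer significant bits than float32.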

Installation

This implementation is based on PyTorch==1.9.0, mmcv==1.3.13, and mmdetection==2.10.0.

Please refer to get_started.md for installation.

Alternatively, you can download the Docker image from our Docker Hub repository.

Models

Results on COCO val 2017

Backbone	Weights	AP	AP@50	AP@75	AP@Small	AP@Medium	AP@Large
ResNet-50	download	30.7	52.6	30.6	13.3	34.1	45.6
ResNet-101-DCN	download	35.3	59.1	35.4	16.9	39.2	53.0
ResNeXt-101-DCN	download	37.3	60.4	39.1	17.8	41.1	55.4

Results on COCO test-dev

We also evaluate the models from the Results on COCO val 2017 section on COCO test-dev, using the same weights.

Backbone	Weights	AP	AP@50	AP@75	AP@Small	AP@Medium	AP@Large
ResNet-50	download	32.0	53.6	32.6	11.7	33.7	48.4
ResNet-101-DCN	download	35.8	59.8	36.4	16.9	38.7	52.1
ResNeXt-101-DCN	download	37.9	61.4	40.0	18.0	41.1	53.9

Training

COCO

ResNet-50 (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_r50_fpn_3x.py 8

ResNet-101-DCN (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_r101_dcn_fpn_3x.py 8

ResNeXt-101-DCN (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_x101_dcn_fpn_3x.py 8

Pascal VOC 2012

ResNet-50 (4 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_6x.py 4

ResNet-101 (4 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_6x.py 4

Testing

COCO

ResNet-50 (8 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_r50_fpn_3x.py \
     work_dirs/coco_r50_fpn_3x.pth 8 --eval segm

ResNet-101-DCN (8 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_r101_dcn_fpn_3x.py \
     work_dirs/coco_r101_dcn_fpn_3x.pth 8 --eval segm

ResNeXt-101-DCN (8 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_x101_dcn_fpn_3x_fp16.py \
     work_dirs/coco_x101_dcn_fpn_3x.pth 8 --eval segm

Box-conditioned inference

You can use DiscoBox to auto-label images given tight bounding boxes; we call this box-conditioned inference. The following example runs box-conditioned inference on COCO val2017 with the x101_dcn_fpn architecture:

bash tools/dist_test.sh \
     configs/discobox/boxcond_discobox_solov2_x101_dcn_fpn_3x.py \
     work_dirs/x101_dcn_fpn_coco_3x.pth 8 \
     --format-only \
     --options "jsonfile_prefix=work_dirs/coco_x101_dcn_fpn_results.json"
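Once the results are written, they can be inspected with a few lines of Python. The sketch below assumes the output follows the standard COCO results layout (a JSON list of records with `image_id`, `category_id`, `score`, and an RLE `segmentation`); in stock MMDetection 2.x the files are named from `jsonfile_prefix` (for example with a `.segm.json` suffix), but whether this repository keeps that naming should be verified. The helper names `load_results` and `count_masks_per_image` and the sample records are hypothetical.

```python
import json
from collections import Counter


def load_results(path):
    """Load a COCO-style results file (a JSON list of detection dicts)."""
    with open(path, "r") as f:
        return json.load(f)


def count_masks_per_image(results, score_thr=0.0):
    """Count predicted instances per image_id, keeping scores >= score_thr."""
    counts = Counter()
    for det in results:
        if det.get("score", 1.0) >= score_thr:
            counts[det["image_id"]] += 1
    return dict(counts)


# Hypothetical records; the "counts" strings are placeholders, not valid RLE.
example = [
    {"image_id": 1, "category_id": 18, "score": 0.9,
     "segmentation": {"size": [4, 4], "counts": "<RLE string>"}},
    {"image_id": 1, "category_id": 1, "score": 0.3,
     "segmentation": {"size": [4, 4], "counts": "<RLE string>"}},
    {"image_id": 2, "category_id": 1, "score": 0.8,
     "segmentation": {"size": [4, 4], "counts": "<RLE string>"}},
]

print(count_masks_per_image(example))
print(count_masks_per_image(example, score_thr=0.5))
```

To run it on real output, pass the path of the generated JSON file to `load_results` and feed the returned list to `count_masks_per_image`.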

Pascal VOC 2012 (COCO API)

ResNet-50 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_3x_fp16.py \
     work_dirs/voc_r50_6x.pth 4 --eval segm

ResNet-101 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_3x_fp16.py \
     work_dirs/voc_r101_6x.pth 4 --eval segm

Pascal VOC 2012 (Matlab)

Step 1: generate results

ResNet-50 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_3x_fp16.py \
     work_dirs/voc_r50_6x.pth 4 \
     --format-only \
     --options "jsonfile_prefix=work_dirs/voc_r50_results.json"

ResNet-101 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_3x_fp16.py \
     work_dirs/voc_r101_6x.pth 4 \
     --format-only \
     --options "jsonfile_prefix=work_dirs/voc_r101_results.json"

Step 2: format conversion

ResNet-50:

python tools/json2mat.py work_dirs/voc_r50_results.json work_dirs/voc_r50_results.mat

ResNet-101:

python tools/json2mat.py work_dirs/voc_r101_results.json work_dirs/voc_r101_results.mat

Step 3: evaluation

Please visit BBTP for the evaluation code written in Matlab.

PF-Pascal

Please visit this repository.

Visualization

ResNeXt-101

python tools/test.py configs/discobox/discobox_solov2_x101_dcn_fpn_3x.py coco_x101_dcn_fpn_3x.pth --show --show-dir discobox_vis_x101

LICENSE

Please check the LICENSE file. DiscoBox may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

Citation

@article{lan2021discobox,
  title={DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision},
  author={Lan, Shiyi and Yu, Zhiding and Choy, Christopher and Radhakrishnan, Subhashree and Liu, Guilin and Zhu, Yuke and Davis, Larry S and Anandkumar, Anima},
  journal={arXiv preprint arXiv:2105.06464},
  year={2021}
}

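import numpy as np
import torch
import torch.nn as nn


class MeanField(nn.Module):
    """Iterative mean-field refinement of mask probabilities with a local color/spatial kernel."""

    # feature_map: RGB feature map, shape [N, 3, H, W]
    # N = number of objects

    # @autocast(enabled=False)
    def __init__(self, feature_map, kernel_size=3, require_grad=False, theta0=0.5, theta1=30, theta2=10, alpha0=3,
                 iter=20, base=0.45, gamma=0.01):
        super(MeanField, self).__init__()
        self.require_grad = require_grad
        self.kernel_size = kernel_size
        with torch.no_grad():
            # Extract the kernel_size x kernel_size neighborhood of every pixel.
            self.unfold = torch.nn.Unfold(kernel_size, stride=1, padding=kernel_size // 2)
            # Offset the features so zero-padded border neighbors differ strongly from real pixels.
            feature_map = feature_map + 10
            unfold_feature_map = self.unfold(feature_map).view(
                feature_map.size(0), feature_map.size(1), kernel_size ** 2, -1)
            self.feature_map = feature_map
            self.theta0 = theta0
            self.theta1 = theta1
            self.theta2 = theta2  # stored but not used below
            self.alpha0 = alpha0
            self.gamma = gamma
            self.base = base

            # Squared distance of each window position from the window center.
            idx = np.arange(kernel_size ** 2)
            self.spatial = torch.tensor(
                (idx // kernel_size - kernel_size // 2) ** 2
                + (idx % kernel_size - kernel_size // 2) ** 2
            ).to(feature_map.device).float()

            # Pairwise kernel: alpha0 * exp(-|I_i - I_j|^2 / (2 theta0^2) - d_ij^2 / (2 theta1^2)).
            center = feature_map.view(feature_map.size(0), feature_map.size(1), 1, -1)
            color_term = (-(unfold_feature_map - center) ** 2).sum(1) / (2 * self.theta0 ** 2)
            spatial_term = -(self.spatial.view(1, -1, 1) / (2 * self.theta1 ** 2))
            self.kernel = alpha0 * torch.exp(color_term + spatial_term)
            self.kernel = self.kernel.unsqueeze(1)  # [N, 1, kernel_size**2, H*W]

            self.iter = iter

    # x: mask probabilities, shape [N, 1, H, W] (the code below indexes four dimensions)
    # targets: multiplied element-wise with x and with the foreground channel
    # @autocast(enabled=False)
    def forward(self, x, targets, inter_img_mask=None):
        with torch.no_grad():
            x = x * targets
            # Binarize, then map {0, 1} to {base, 1 - base}.
            x = (x > 0.5).float() * (1 - self.base * 2) + self.base
            # Stack background and foreground, then fold the two channels into the batch dimension.
            U = torch.cat([1 - x, x], 1)
            U = U.view(-1, 1, U.size(2), U.size(3))
            if inter_img_mask is not None:
                # NOTE: reshape() is not in-place and its result is discarded, so this line has no effect.
                inter_img_mask.reshape(-1, 1, inter_img_mask.shape[2], inter_img_mask.shape[3])
            ret = U
            for _ in range(self.iter):
                nret = self.simple_forward(ret, targets, inter_img_mask)
                ret = nret
            ret = ret.view(-1, 2, ret.size(2), ret.size(3))
            ret = ret[:, 1:]  # keep the foreground channel
            ret = (ret > 0.5).float()
            # A mask is valid if its foreground covers between 5% and 95% of the map.
            count = ret.reshape(ret.shape[0], -1).sum(1)
            valid = (count >= ret.shape[2] * ret.shape[3] * 0.05) * (count <= ret.shape[2] * ret.shape[3] * 0.95)
            valid = valid.float()
        return ret, valid

    # One mean-field update step.
    # @autocast(enabled=False)
    def simple_forward(self, x, targets, inter_img_mask):
        h, w = x.size(2), x.size(3)
        # Negative log-probabilities of each pixel's neighbors, per channel.
        unfold_x = self.unfold(-torch.log(x)).view(x.size(0) // 2, 2, self.kernel_size ** 2, -1)
        # Kernel-weighted sum over the neighborhood.
        aggre = (unfold_x * self.kernel).sum(2)
        aggre = aggre.view(-1, 1, h, w)
        f = torch.exp(-aggre)
        f = f.view(-1, 2, h, w)
        if inter_img_mask is not None:
            f += inter_img_mask * self.gamma
        f[:, 1:] *= targets
        f = f + 1e-6
        f = f / f.sum(1, keepdim=True)  # normalize over background/foreground
        f = (f > 0.5).float() * (1 - self.base * 2) + self.base
        f = f.view(-1, 1, h, w)

        return f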
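To make the kernel in the MeanField class above easier to follow, here is a scalar, plain-Python sketch of the same arithmetic for a single pixel pair. It recomputes the squared spatial offsets stored in `self.spatial`, the pairwise weight built from `theta0`, `theta1`, and `alpha0`, and the rescaling to `{base, 1 - base}`. The function names `spatial_sq_offsets`, `pair_weight`, and `rescale` and the sample colors are hypothetical; this is an illustration under those assumptions, not a replacement for the tensor code.

```python
import math


def spatial_sq_offsets(kernel_size=3):
    """Squared distance of each window position from the window center (row-major order)."""
    half = kernel_size // 2
    return [(i // kernel_size - half) ** 2 + (i % kernel_size - half) ** 2
            for i in range(kernel_size ** 2)]


def pair_weight(color_i, color_j, sq_dist, theta0=0.5, theta1=30, alpha0=3):
    """alpha0 * exp(-|c_i - c_j|^2 / (2 theta0^2) - d^2 / (2 theta1^2))."""
    color_sq = sum((a - b) ** 2 for a, b in zip(color_i, color_j))
    return alpha0 * math.exp(-color_sq / (2 * theta0 ** 2) - sq_dist / (2 * theta1 ** 2))


def rescale(p, base=0.45):
    """Binarize a probability at 0.5 and map the result to {base, 1 - base}."""
    return (1.0 if p > 0.5 else 0.0) * (1 - 2 * base) + base


offsets = spatial_sq_offsets(3)

# Identical colors at the window center give the maximum weight, alpha0.
same = pair_weight((0.2, 0.2, 0.2), (0.2, 0.2, 0.2), offsets[4])
# A diagonal neighbor with a very different color gets a much smaller weight.
diff = pair_weight((0.2, 0.2, 0.2), (0.9, 0.9, 0.9), offsets[0])

print(offsets)
print(same, diff)
print(rescale(0.9), rescale(0.1))
```

With the default `theta0=0.5` and `theta1=30`, the color term dominates inside a 3x3 window: the spatial term changes the exponent by at most 2 / 1800, while a color difference of 0.7 per channel changes it by roughly 2.9.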
