We use this repo for Parts 2 and 3.
The code is based on the official implementation of the following paper:

Causal Intervention for Weakly Supervised Semantic Segmentation. Dong Zhang, Hanwang Zhang, Jinhui Tang, Xiansheng Hua, and Qianru Sun. NeurIPS, 2020. [CONTA]

The Docker image is `zhouyang996/conta`:

```shell
docker run --name=multi-label --gpus all --shm-size 16G -it \
  --mount type=bind,src=path_to_CONTA_folder,dst=/workspace zhouyang996/conta
```
I've preprocessed the dataset, which can be downloaded from Google Drive. It should be put at `./segmentation/data/datasets/food_public.tar` and then decompressed via:

```shell
tar -xvf food_public.tar
```
You can see two folders (`JPEGImages` and `org_JPEGImages`) for both the training and validation data. Images in `JPEGImages` are resized so that their edges are under 500 pixels.
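The exact resize rule lives in the preprocessing script, which isn't shown here. As a sketch, under the assumption that the longer edge is capped at 500 px while the aspect ratio is kept, the target dimensions could be computed like this (`resized_dims` is a hypothetical helper, not a function from the repo):

```python
MAX_EDGE = 500  # assumption: no edge may exceed 500 px after resizing

def resized_dims(w, h, max_edge=MAX_EDGE):
    """Return (new_w, new_h) with the longer edge capped at max_edge,
    keeping the aspect ratio; images already small enough are untouched."""
    scale = min(1.0, max_edge / max(w, h))
    return round(w * scale), round(h * scale)
```

For example, a 1000x400 photo would come out as 500x200, while a 300x200 one keeps its size.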
Although the dataset is stored in `./segmentation`, we run scripts from `./pseudo_mask`.
For now, the best-performing model is trained with the following command:
```shell
python run_sample.py \
  --voc12_root ../segmentation/data/datasets/food_public/img_dir/ \
  --num_workers 8 \
  --train_list ../segmentation/data/datasets/food_public/train_aug.txt \
  --val_list ../segmentation/data/datasets/food_public/train_aug.txt \
  --infer_list ../segmentation/data/datasets/food_public/train_aug.txt \
  --cam_num_epoches 20 \
  --irn_num_epoches 10 \
  --cam_out_dir result/try_2/train/cam \
  --ir_label_out_dir result/try_2/train/ir_label \
  --sem_seg_out_dir result/try_2/train/sem_seg \
  --ins_seg_out_dir result/try_2/train/ins_seg 2>&1 | tee try_2_train.log
```
It also generates pseudo labels for the training data under `./pseudo_mask/result/try_2/train/sem_seg`. However, this doesn't satisfy the requirements of Part 2, as the images are of different sizes.
But let's leave it there and work on Part 3 first. Use the following command to segment the validation data:
```shell
python run_sample.py \
  --voc12_root ../segmentation/data/datasets/food_public/validation/ \
  --num_workers 8 \
  --train_list ../segmentation/data/datasets/food_public/validation/to_be_predicted.txt \
  --val_list ../segmentation/data/datasets/food_public/validation/to_be_predicted.txt \
  --infer_list ../segmentation/data/datasets/food_public/validation/to_be_predicted.txt \
  --cam_out_dir result/try_3/validation/cam \
  --ir_label_out_dir result/try_3/validation/ir_label \
  --sem_seg_out_dir result/try_3/validation/sem_seg \
  --ins_seg_out_dir result/try_3/validation/ins_seg \
  --train_cam_pass False \
  --train_irn_pass False 2>&1 | tee try_3_validation.log
```
Segmented images are stored in `./pseudo_mask/result/try_3/validation/sem_seg` (note the `try_3` prefix set by `--sem_seg_out_dir` above).
Then, we 'upsize' these images with `python upsize.py`. It will generate new `original_size_seg` folders next to the `sem_seg` folders (e.g. `./pseudo_mask/result/try_3/validation/original_size_seg`) for both the training and validation sets.
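I haven't inspected `upsize.py`, but conceptually the upsizing has to use nearest-neighbor interpolation so that discrete class IDs aren't blended into meaningless in-between values. A minimal sketch of that idea on a plain 2D list (`upsize_mask` is hypothetical; the script itself presumably operates on image files):

```python
def upsize_mask(mask, out_w, out_h):
    """Nearest-neighbor upsample of a 2D label mask (list of rows) to out_h x out_w.

    Each output pixel copies the nearest input label, so no new class IDs
    are invented the way bilinear interpolation would invent them.
    """
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```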
Then, we download the two folders and rename them as `pseudo_label` and `segmentation`. Done, just upload them to the system!
Zhou: I haven't tried this part, maybe you can try :-).
This codebase only supports DeepLab v2 training, which freezes the batch normalization layers, although the v3/v3+ protocols require training them. If you also want to train their parameters on multiple GPUs in your projects, please install the extra library below.
```shell
pip install torch-encoding
```
Batch normalization layers in a model are switched automatically in `libs/models/resnet.py`:
```python
import torch.nn as nn

try:
    # Prefer synchronized batch norm when torch-encoding is installed
    from encoding.nn import SyncBatchNorm
    _BATCH_NORM = SyncBatchNorm
except ImportError:
    _BATCH_NORM = nn.BatchNorm2d
```
This repo is mainly based on the official implementation of the paper above. If you find the code useful, please consider citing the paper using the following BibTeX entry:
```bibtex
@InProceedings{dong_2020_conta,
  author    = {Zhang, Dong and Zhang, Hanwang and Tang, Jinhui and Hua, Xiansheng and Sun, Qianru},
  title     = {Causal Intervention for Weakly Supervised Semantic Segmentation},
  booktitle = {NeurIPS},
  year      = {2020}
}
```