
ToCo's Introduction

Hi there 👋

ToCo's People

Contributors

rulixiang


ToCo's Issues

NaN loss

Hi, thanks for your excellent work first!
I am using an A100 GPU to reproduce your results, but the NaN loss still occurs.
train.log

cls_labels_onehot.npy

Thank you for sharing!
Could you please show me how to generate the file cls_labels_onehot.npy?
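For anyone else wondering: the repo does not document the file format, but a plausible sketch (my own assumption, not the authors' script) is a dict mapping image ids to 20-dim one-hot vectors over the VOC foreground classes, pickled via np.save:

```python
import numpy as np

# VOC's 20 foreground classes, in the conventional order.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]
CLS_TO_IDX = {c: i for i, c in enumerate(VOC_CLASSES)}

def to_onehot(class_names):
    # Turn a list of class names into a 20-dim multi-hot vector.
    onehot = np.zeros(len(VOC_CLASSES), dtype=np.float32)
    for name in class_names:
        onehot[CLS_TO_IDX[name]] = 1.0
    return onehot

# In practice the class names would come from the VOC XML annotations;
# this toy entry only illustrates the assumed structure of the .npy file.
labels = {"2007_000032": to_onehot(["aeroplane", "person"])}
np.save("cls_labels_onehot.npy", labels)
# load back with np.load("cls_labels_onehot.npy", allow_pickle=True).item()
```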

img_box in cam_to_label function.

Thank you for sharing your wonderful work. I have some questions about the specific implementation. The img_box here does not seem to be mentioned in the paper. How is img_box computed, and what is its purpose? What is the effect of not using img_box in the cam_to_label function? Looking forward to your answer.
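For context, a common role of img_box in WSSS code bases is to mark the valid (non-padded) region that remains after random scaling and cropping, so that pseudo labels are only trusted inside the box. A hypothetical numpy sketch of that idea (mask_outside_box and ignore_index=255 are my naming, not necessarily the repo's):

```python
import numpy as np

def mask_outside_box(pseudo_label, img_box, ignore_index=255):
    # img_box = (h_start, h_end, w_start, w_end): the valid image region.
    # Pixels outside it get the ignore label, so a padded border never
    # contributes to the pseudo ground truth during training.
    h0, h1, w0, w1 = img_box
    out = np.full_like(pseudo_label, ignore_index)
    out[h0:h1, w0:w1] = pseudo_label[h0:h1, w0:w1]
    return out

label = np.zeros((4, 4), dtype=np.int64)   # toy pseudo label
masked = mask_outside_box(label, (1, 3, 1, 3))
```

Without such a box, CAM values computed over padded pixels would leak into the pseudo labels, which is presumably why cam_to_label takes it.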

How to deal with binary classification problems?

def cam_to_roi_mask2(cam, cls_label, hig_thre=None, low_thre=None):
    b, c, h, w = cam.shape
    # broadcast the image-level one-hot label over the spatial dims
    cls_label_rep = cls_label.unsqueeze(-1).unsqueeze(-1).repeat([1, 1, h, w])
    # zero out the CAMs of classes absent from the image
    valid_cam = cls_label_rep * cam
    cam_value, _ = valid_cam.max(dim=1, keepdim=False)
    # 0 = background, 1 = uncertain, 2 = reliable foreground
    roi_mask = torch.ones_like(cam_value, dtype=torch.int16)
    roi_mask[cam_value <= low_thre] = 0
    roi_mask[cam_value >= hig_thre] = 2

    return roi_mask

I have a question here. The VOC data contains 20 categories, and cls_label_rep = cls_label.unsqueeze(-1).unsqueeze(-1).repeat([1,1,h,w]) broadcasts a one-hot encoding that does not contain a background category. However, in my binary classification setting the one-hot encoding does include the background category. How should I handle this?
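One plausible workaround (a numpy sketch of my own that mirrors the torch logic above, not the authors' code) is to drop the background channel before broadcasting; the background class is then recovered through the low_thre threshold, exactly as in the 20-class case:

```python
import numpy as np

def cam_to_roi_mask_binary(cam, cls_label_with_bg, hig_thre=0.7, low_thre=0.25):
    # cam: (b, c_fg, h, w) foreground CAMs only.
    # cls_label_with_bg: (b, 1 + c_fg) one-hot, background assumed at index 0.
    b, c, h, w = cam.shape
    fg_label = cls_label_with_bg[:, 1:]            # drop the background channel
    cls_label_rep = fg_label.reshape(b, c, 1, 1)   # broadcast over h, w
    valid_cam = cls_label_rep * cam
    cam_value = valid_cam.max(axis=1)
    # 0 = background, 1 = uncertain, 2 = reliable foreground
    roi_mask = np.ones_like(cam_value, dtype=np.int16)
    roi_mask[cam_value <= low_thre] = 0
    roi_mask[cam_value >= hig_thre] = 2
    return roi_mask
```

The key point is that cls_label passed into the CAM masking should only contain foreground channels; the background never needs its own CAM because it is defined by low activation everywhere.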

Reproducing Results

Hello!

Thanks for sharing this amazing work! I have a few questions regarding the steps to reproduce the results:

The numbers written in the table inside the README.md file do not match the results in the attached log. The README states, "Note that the final performance is post-processed with multi-scale test and CRF, while the performance in log isn't." So my question is: does the current code provide this post-processing? If not, can you refer me to the code you used to do that?

Thanks!
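Not part of the repo, but for context, multi-scale test-time averaging usually looks like the sketch below (my own hypothetical code with a stand-in predict function; the CRF step would then refine the averaged score map, e.g. via pydensecrf):

```python
import numpy as np

def zoom_to(arr, th, tw):
    # nearest-neighbour resize; crude but sufficient for a sketch
    h, w = arr.shape[:2]
    rows = np.arange(th) * h // th
    cols = np.arange(tw) * w // tw
    return arr[rows][:, cols]

def multi_scale_predict(predict, image, scales=(0.5, 1.0, 1.5)):
    # `predict` stands in for a model forward pass returning (c, h', w') scores.
    h, w = image.shape[:2]
    fused = None
    for s in scales:
        scaled = zoom_to(image, max(1, int(h * s)), max(1, int(w * s)))
        score = predict(scaled)                                # (c, sh, sw)
        score = np.stack([zoom_to(ch, h, w) for ch in score])  # back to (c, h, w)
        fused = score if fused is None else fused + score
    return fused / len(scales)
```

Real pipelines would use bilinear interpolation and often flip augmentation as well; the structure (resize, predict, resize back, average) is the same.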

NaN loss for seg_loss

Hi

I am training on VOC and I am getting a NaN value for seg_loss:

2023-03-05 15:39:27,915 - dist_train_voc_seg_neg.py - INFO: Iter: 400; Elasped: 0:17:32; ETA: 14:19:08; LR: 1.596e-05; cls_loss: 0.2528, cls_loss_aux: 0.2678, ptc_loss: 0.4553, ctc_loss: 1.0867, seg_loss: nan...

Any idea why this might happen ?

ImportError: /home/dancer/anaconda3/envs/new_toco/lib/python3.8/site-packages/_bilateralfilter.cpython-38-x86_64-linux-gnu.so: undefined symbol: omp_get_thread_num

I'm sorry to bother you, but after I installed rloss following your guidance, the following error occurred when importing the library:

import sys
sys.path.append("/home/dancer/WeakSupervision/toco/rloss/pytorch/wrapper/bilateralfilter/build/lib.linux-x86_64-3.8")
from bilateralfilter import bilateralfilter, bilateralfilter_batch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/dancer/WeakSupervision/toco/rloss/pytorch/wrapper/bilateralfilter/bilateralfilter.py", line 15, in <module>
import _bilateralfilter
ImportError: /home/dancer/anaconda3/envs/new_toco/lib/python3.8/site-packages/_bilateralfilter.cpython-38-x86_64-linux-gnu.so: undefined symbol: omp_get_thread_num

Dear author, this error suggests that a file is missing. Is one missing, or did I miss a build step?

ModuleNotFoundError: No module named 'bilateralfilter_batch'
Traceback (most recent call last):
File "scripts/dist_train_voc_seg_neg.py", line 15, in <module>
from model.losses import get_masked_ptc_loss, get_seg_loss, CTCLoss_neg, DenseEnergyLoss, get_energy_loss
File "/home/lcl/ToCo-main/./model/losses.py", line 11, in <module>
from bilateralfilter_batch import bilateralfilter_batch, bilateralfilter_batch
ModuleNotFoundError: No module named 'bilateralfilter_batch'
Dear author, this error suggests that a module is missing. Is a file missing, or did I miss a build step?

teacher&student

Hello, could you explain what the teacher network and student network in your code mean, and what the following lines represent?
self.proj_head = CTCHead(in_dim=self.encoder.embed_dim, out_dim=1024)
self.proj_head_t = CTCHead(in_dim=self.encoder.embed_dim, out_dim=1024)
What is the difference between the two?

question about 'randomly crop the local images from the uncertain regions'

Hello! Thanks for your brilliant work in WSSS.
In camutils.py, in the function def crop_from_roi_neg(images, roi_mask=None, crop_num=8, crop_size=96):
roi_index = (roi_mask[i1, margin:(h-margin), margin:(w-margin)] <= 1).nonzero() # 48:400, 48:400
Based on this code, we search for negative-sample coordinates within the range 48 to 400. If (399, 399) is randomly selected:

h0, w0 = crop_index[i2, 0], crop_index[i2, 1] # centered at (h0, w0) temp_crops[i1, i2, ...] = images[i1, :, h0:(h0+crop_size), w0:(w0+crop_size)]
With h0 = 399, w0 = 399, and crop_size = 96, h0 + crop_size = 495, and 495 is out of the range (0, 448).
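A straightforward guard against this (my own sketch, not the repo's fix) is to sample the crop's top-left corner only at positions where the whole crop still fits inside the image:

```python
import numpy as np

def sample_crop_origin(roi_mask, crop_size=96, rng=None):
    # Only consider top-left corners in [0, h - crop_size] x [0, w - crop_size],
    # so h0 + crop_size and w0 + crop_size can never exceed the image bounds.
    rng = np.random.default_rng(0) if rng is None else rng
    h, w = roi_mask.shape
    valid = roi_mask[: h - crop_size + 1, : w - crop_size + 1] <= 1  # uncertain/bg
    ys, xs = np.nonzero(valid)
    i = rng.integers(len(ys))
    return ys[i], xs[i]

mask = np.zeros((448, 448), dtype=np.int16)
h0, w0 = sample_crop_origin(mask)
```

An equivalent alternative would be to clamp h0 and w0 to h - crop_size and w - crop_size after sampling; either way the crop stays in range.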

Questions about the training divergence

When I followed your instructions to train, I found that the model could not converge.
The training log is as follows:

2023-03-04 21:51:23,643 - dist_train_voc_seg_neg.py - INFO: Pytorch version: 1.10.0
2023-03-04 21:51:23,736 - dist_train_voc_seg_neg.py - INFO: GPU type: Tesla V100-SXM2-32GB
2023-03-04 21:51:23,736 - dist_train_voc_seg_neg.py - INFO: 
args: Namespace(aux_layer=-3, backbone='deit_base_patch16_224', backend='nccl', betas=(0.9, 0.999), bkg_thre=0.5, cam_scales=(1.0, 0.5, 1.5), ckpt_dir='./output/2023-03-04-21-51-23-641944/checkpoints', crop_size=448, data_folder='../VOCdevkit/VOC2012', eval_iters=2000, high_thre=0.7, ignore_index=255, list_folder='datasets/voc', local_crop_size=96, local_rank=0, log_iters=200, low_thre=0.25, lr=6e-05, max_iters=20000, momentum=0.9, num_classes=21, num_workers=10, optimizer='PolyWarmupAdamW', pooling='gmp', power=0.9, pred_dir='./output/2023-03-04-21-51-23-641944/predictions', pretrained=True, save_ckpt=False, scales=(0.5, 2), seed=0, spg=2, temp=0.5, train_set='train_aug', val_set='val', w_ctc=0.5, w_ptc=0.2, w_reg=0.05, w_seg=0.1, warmup_iters=1500, warmup_lr=1e-06, work_dir='./output/2023-03-04-21-51-23-641944', wt_decay=0.01)
2023-03-04 21:51:23,737 - distributed_c10d.py - INFO: Added key: store_based_barrier_key:1 to store for rank: 0
2023-03-04 21:51:23,737 - distributed_c10d.py - INFO: Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes.
2023-03-04 21:51:23,737 - dist_train_voc_seg_neg.py - INFO: Total gpus: 2, samples per gpu: 2...
2023-03-04 21:51:36,921 - dist_train_voc_seg_neg.py - INFO: 
Optimizer: 
PolyWarmupAdamW (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 6e-05
    weight_decay: 0.01

Parameter Group 1
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 6e-05
    weight_decay: 0.01

Parameter Group 2
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.0006000000000000001
    weight_decay: 0.01

Parameter Group 3
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.0006000000000000001
    weight_decay: 0.01
)
2023-03-04 21:54:38,706 - dist_train_voc_seg_neg.py - INFO: Iter: 200; Elasped: 0:03:15; ETA: 5:21:45; LR: 7.960e-06; cls_loss: 0.4248, cls_loss_aux: 0.5702, ptc_loss: 0.4529, ctc_loss: 1.2007, seg_loss: 2.8564...
2023-03-04 21:57:36,865 - dist_train_voc_seg_neg.py - INFO: Iter: 400; Elasped: 0:06:13; ETA: 5:04:37; LR: 1.596e-05; cls_loss: 0.2599, cls_loss_aux: 0.2742, ptc_loss: 0.4407, ctc_loss: 1.1161, seg_loss: 2.8552...
2023-03-04 22:00:34,802 - dist_train_voc_seg_neg.py - INFO: Iter: 600; Elasped: 0:09:11; ETA: 4:56:55; LR: 2.396e-05; cls_loss: 0.2436, cls_loss_aux: 0.2581, ptc_loss: 0.4434, ctc_loss: 0.9623, seg_loss: 2.9558...
2023-03-04 22:03:32,752 - dist_train_voc_seg_neg.py - INFO: Iter: 800; Elasped: 0:12:09; ETA: 4:51:36; LR: 3.196e-05; cls_loss: 0.2465, cls_loss_aux: 0.2488, ptc_loss: 0.3674, ctc_loss: 0.6667, seg_loss: 2.8333...
2023-03-04 22:06:30,715 - dist_train_voc_seg_neg.py - INFO: Iter: 1000; Elasped: 0:15:07; ETA: 4:47:13; LR: 3.996e-05; cls_loss: 0.2464, cls_loss_aux: 0.2698, ptc_loss: 0.3084, ctc_loss: 0.4958, seg_loss: 2.8731...
2023-03-04 22:09:28,458 - dist_train_voc_seg_neg.py - INFO: Iter: 1200; Elasped: 0:18:05; ETA: 4:43:18; LR: 4.796e-05; cls_loss: 0.2479, cls_loss_aux: 0.2904, ptc_loss: 0.2686, ctc_loss: 0.4381, seg_loss: 2.6618...
2023-03-04 22:12:25,741 - dist_train_voc_seg_neg.py - INFO: Iter: 1400; Elasped: 0:21:02; ETA: 4:39:26; LR: 5.596e-05; cls_loss: 0.2594, cls_loss_aux: 0.4248, ptc_loss: 0.3134, ctc_loss: 0.5051, seg_loss: 2.4905...
2023-03-04 22:15:23,062 - dist_train_voc_seg_neg.py - INFO: Iter: 1600; Elasped: 0:24:00; ETA: 4:36:00; LR: 5.566e-05; cls_loss: 0.2557, cls_loss_aux: 0.6561, ptc_loss: 0.3222, ctc_loss: 0.8821, seg_loss: 2.0559...
2023-03-04 22:18:15,284 - dist_train_voc_seg_neg.py - INFO: Iter: 1800; Elasped: 0:26:52; ETA: 4:31:39; LR: 5.512e-05; cls_loss: nan, cls_loss_aux: nan, ptc_loss: nan, ctc_loss: nan, seg_loss: nan...
2023-03-04 22:21:06,303 - dist_train_voc_seg_neg.py - INFO: Iter: 2000; Elasped: 0:29:43; ETA: 4:27:27; LR: 5.457e-05; cls_loss: nan, cls_loss_aux: nan, ptc_loss: nan, ctc_loss: nan, seg_loss: nan...
2023-03-04 22:21:06,303 - dist_train_voc_seg_neg.py - INFO: Validating...

The following is my environment:

_libgcc_mutex             0.1                        main                                                                                 
_openmp_mutex             5.1                       1_gnu
addict                    2.4.0                    pypi_0    pypi
bilateralfilter           0.1                      pypi_0    pypi
blas                      1.0                         mkl
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2022.12.7            ha878542_0    conda-forge
certifi                   2022.12.7          pyhd8ed1ab_0    conda-forge
cudatoolkit               11.3.1              h9edb442_10    conda-forge
cycler                    0.11.0                   pypi_0    pypi
ffmpeg                    4.3                  hf484d3e_0    pytorch
fonttools                 4.38.0                   pypi_0    pypi
freetype                  2.10.4               h0708190_1    conda-forge
gmp                       6.2.1                h58526e2_0    conda-forge
gnutls                    3.6.13               h85f3911_1    conda-forge
imageio                   2.9.0                    pypi_0    pypi
intel-openmp              2021.4.0          h06a4308_3561
jbig                      2.1               h7f98852_2003    conda-forge
joblib                    1.2.0                    pypi_0    pypi
jpeg                      9e                   h166bdaf_1    conda-forge
kiwisolver                1.4.4                    pypi_0    pypi
lame                      3.100             h7f98852_1001    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.38                 h1181459_1
lerc                      2.2.1                h9c3ff4c_0    conda-forge
libdeflate                1.7                  h7f98852_5    conda-forge
libffi                    3.4.2                h6a678d5_6
libgcc-ng                 11.2.0               h1234567_1
libgomp                   11.2.0               h1234567_1
libiconv                  1.17                 h166bdaf_0    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libstdcxx-ng              11.2.0               h1234567_1
libtiff                   4.3.0                hf544144_1    conda-forge
libuv                     1.43.0               h7f98852_0    conda-forge
libwebp-base              1.2.2                h7f98852_1    conda-forge
lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
matplotlib                3.5.3                    pypi_0    pypi
mkl                       2021.4.0           h06a4308_640
mkl-service               2.4.0            py37h402132d_0    conda-forge
mkl_fft                   1.3.1            py37h3e078e5_1    conda-forge
mkl_random                1.2.2            py37h219a48f_0    conda-forge
mmcv                      1.3.8                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0
nettle                    3.6                  he412f7d_0    conda-forge
numpy                     1.21.5           py37h6c91a56_3
numpy-base                1.21.5           py37ha15fc14_3
olefile                   0.46               pyh9f0ad1d_1    conda-forge
omegaconf                 2.0.0                    pypi_0    pypi
opencv-python             4.7.0.72                 pypi_0    pypi
openh264                  2.1.1                h780b84a_0    conda-forge
openjpeg                  2.4.0                hb52868f_1    conda-forge
openssl                   1.1.1t               h7f8727e_0
packaging                 23.0                     pypi_0    pypi
pillow                    8.3.2            py37h0f21c89_0    conda-forge
pip                       22.3.1           py37h06a4308_0
pydensecrf                1.0rc2                   pypi_0    pypi
pyparsing                 3.0.9                    pypi_0    pypi
python                    3.7.16               h7a1cb2a_0
python-dateutil           2.8.2                    pypi_0    pypi
python_abi                3.7                     2_cp37m    conda-forge
pytorch                   1.10.0          py3.7_cuda11.3_cudnn8.2.0_0    pytorch
pytorch-mutex             1.0                        cuda    pytorch
pyyaml                    6.0                      pypi_0    pypi
readline                  8.2                  h5eee18b_0
scikit-learn              1.0.2                    pypi_0    pypi
scipy                     1.7.3                    pypi_0    pypi
setuptools                65.6.3           py37h06a4308_0
six                       1.16.0             pyh6c4a22f_0    conda-forge
sqlite                    3.40.1               h5082296_0
texttable                 1.6.4                    pypi_0    pypi
threadpoolctl             3.1.0                    pypi_0    pypi
timm                      0.5.4                    pypi_0    pypi
tk                        8.6.12               h1ccaba5_0
torchaudio                0.10.0               py37_cu113    pytorch
torchvision               0.11.0               py37_cu113    pytorch
tqdm                      4.64.1                   pypi_0    pypi
typing_extensions         4.4.0              pyha770c72_0    conda-forge
wheel                     0.38.4           py37h06a4308_0
xz                        5.2.10               h5eee18b_1
yapf                      0.32.0                   pypi_0    pypi
zlib                      1.2.13               h5eee18b_0
zstd                      1.5.0                ha95c52a_0    conda-forge

Do you know how to solve the problem?

The validation process of MSCOCO

Hi! Thanks for your brilliant work in WSSS. I notice that you only use part of the val dataset (i.e., 5000 out of 40137 cases), and it takes only about 18 minutes to validate those 5000 cases. I guess this is an efficient alternative for validating the efficacy of your model using only part of the val cases, but I am wondering: is it fine to do so?
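For what it's worth, such a partial validation is usually drawn with a fixed seed so that every run scores the same subset and numbers stay comparable. A minimal sketch (sample_val_subset is my own hypothetical helper, not from the repo):

```python
import random

def sample_val_subset(val_ids, k=5000, seed=0):
    # Deterministic subset: the same seed always yields the same k ids,
    # so partial-validation scores are comparable across runs.
    rng = random.Random(seed)
    return sorted(rng.sample(val_ids, k))

subset = sample_val_subset(list(range(40137)), k=5000)
```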

F.interpolate

Dear author:
In tools/infer_seg_voc.py, the shape of the segs variable is ([2, 21, 22, 31]), while in segs = F.interpolate(segs, size=labels.shape[1:], mode='bilinear', align_corners=False) the size is ([366, 500, 3]). Interpolating with this size causes a dimension error. How can I solve it?

Figure 5 in the paper

Hello, may I ask how Figure 5 in the paper was generated?

ImageNet 21k

Hello!

In the paper's results I see two numbers for the method, one without ImageNet-21k pretrained weights and one with them. How can I specify using the ImageNet-21k weights in the training script?
