
carrierlxk / cosnet


See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks (CVPR19)

Python 100.00%
video-segmentation segmentation attention-siamese-networks video-object-segmentation object-segmentation co-attention cvpr2019

cosnet's People

Contributors

carrierlxk


cosnet's Issues

Question about the saliency_data.txt file created during training

The readme says it is used to "store the saliency dataset name", but what exactly should it contain? The saliency datasets used are MSRA10K and DUTS, so does the file simply list the dataset names, i.e. MSRA10K and DUTS?

Hi! After running your training code, I got the following errors

Traceback (most recent call last):
  File "train_iteration_conf.py", line 450, in <module>
    main()
  File "train_iteration_conf.py", line 375, in main
    lr=args.learning_rate, momentum=args.momentum, weight_decay=args.weight_decay)
  File "D:\Program Files\anaconda3\envs\pytorch\lib\site-packages\torch\optim\sgd.py", line 64, in __init__
    super(SGD, self).__init__(params, defaults)
  File "D:\Program Files\anaconda3\envs\pytorch\lib\site-packages\torch\optim\optimizer.py", line 50, in __init__
    self.add_param_group(param_group)
  File "D:\Program Files\anaconda3\envs\pytorch\lib\site-packages\torch\optim\optimizer.py", line 193, in add_param_group
    param_group['params'] = list(params)
  File "train_iteration_conf.py", line 226, in get_10x_lr_params
    b.append(model.main_classifier.parameters())
  File "D:\Program Files\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 535, in __getattr__
    type(self).__name__, name))
AttributeError: 'CoattentionModel' object has no attribute 'main_classifier'
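For anyone hitting the same error, a defensive rewrite of the parameter collection (my own sketch, not the repo's fix) avoids assuming a main_classifier attribute by excluding backbone parameters by name; the backbone prefixes are a guess for a ResNet-style model:

    def get_10x_lr_params(model, backbone_prefixes=('conv1', 'bn1', 'layer1', 'layer2', 'layer3', 'layer4')):
        # Yield every trainable parameter that does not belong to the backbone,
        # so the exact attribute name of the classifier head does not matter.
        for name, param in model.named_parameters():
            if param.requires_grad and not name.startswith(backbone_prefixes):
                yield param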

Question about computing the Jaccard index

Hello, after reading the paper and the code I have a few questions; I hope you can clarify them!

  1. The evaluation metric J (Jaccard index / region similarity) is the intersection-over-union between the segmentation result and the ground-truth mask, but both the paper and the code indicate that the final network output is a single-channel grayscale map whose per-pixel values are sigmoid outputs in the range 0-1. How is this output compared against the {0, 1} ground-truth mask to compute J? Is it binarized with a threshold? (See the sketch below.)
  2. I noticed that the input images are read with cv2, so pixel values lie in [0, 255], which differs from the common practice of normalizing to [0, 1] before converting to a tensor. What was the reasoning behind this, and does it have any advantage?

I look forward to your reply, thanks!
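For question 1, here is a minimal sketch of how J could be computed with a 0.5 binarization threshold (the threshold value is my assumption, not something the repo confirms):

    import numpy as np

    def jaccard(pred_prob, gt_mask, thresh=0.5):
        # Binarize the sigmoid output, then compute IoU against the binary GT mask.
        pred = pred_prob > thresh
        gt = gt_mask.astype(bool)
        union = np.logical_or(pred, gt).sum()
        if union == 0:
            return 1.0  # both masks empty: count as a perfect match
        return np.logical_and(pred, gt).sum() / union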

Question about the test code

When testing on the FBMS dataset, the ground truth and the images are not in one-to-one correspondence. How did you handle this?

Abnormal test output

Following the instructions and the provided model file, I tested on the DAVIS validation set, and the output values are all close to 0.

Some questions about the training process

Hi Xiankai, thanks for sharing the code. I have some questions about the training process.

  1. I noticed that your shared code 'train_iteration_conf.py' first loads the pre-trained model named 'deeplab_davis_12_0.pth'. Did you pre-train this model on the saliency datasets (MSRA10K and DUT), which would mean the saliency datasets are used both in pre-training and in fine-tuning (on DAVIS16)?

  2. For the functions 'get_1x_lr_params' and 'get_10x_lr_params', why do the settings differ between single-GPU and multi-GPU training?

Thank you

Question about number of reference frames

Nice work!

For Table 2 in the paper, I was wondering if the number of reference frames (N) applies at inference time only, or at both training and inference time. Thank you.

Cannot find utils.colorize_mask and data.DataLoader

Hello! I read your paper See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks; it is an excellent unsupervised solution. I tried to run your code but unfortunately could not: utils.colorize_mask and data.DataLoader are missing. Could you upload the related code? Many thanks!

Using group attention gets worse results

Hello, I used siamese_model_conf_try_single.py and train_iteration_conf_group.py to train a model with two reference images, but I can only get 61.1 J-mean. I did not modify the parameters during training. I think in siamese_model_conf_try_single.py:
A = F.softmax(A, dim = 1)
B = F.softmax(torch.transpose(A,1,2),dim=1)
maybe should be:
A1 = F.softmax(A, dim = 1)
B = F.softmax(torch.transpose(A,1,2),dim=1)
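In other words, both attention maps would be computed from the raw affinity matrix A rather than from the already-softmaxed one. A minimal sketch of that reading, with an illustrative shape (not the repo's exact code):

    import torch
    import torch.nn.functional as F

    A = torch.randn(4, 60*60, 60*60)                # raw co-attention affinity between two frames (illustrative shape)
    A1 = F.softmax(A, dim=1)                        # attention for one direction
    B = F.softmax(torch.transpose(A, 1, 2), dim=1)  # attention for the other direction, derived from the raw A, not from A1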

Out of memory during training

During training I use a single GTX 1080 Ti with 11 GB of memory, but I get RuntimeError: CUDA out of memory. Even after reducing the input size from 473 to 378 the problem persists. Is there a good way to solve this?
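One generic workaround (my own sketch, not something in the repo) is gradient accumulation: keep a small per-step batch but accumulate gradients over several steps before each update. The model, optimizer, criterion, and trainloader names below are the usual training-script objects, assumed here:

    accum_steps = 4  # effective batch = per-step batch * accum_steps
    optimizer.zero_grad()
    for step, (target, search, target_gt, search_gt) in enumerate(trainloader):
        pred1, pred2, pred3 = model(target, search)
        loss = criterion(pred1, target_gt) / accum_steps  # scale so accumulated gradients average out
        loss.backward()
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()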

What is the expected folder layout for the test data?

(pytorch) F:\jetbrains\pycharm\COSNet>python test_coattention_conf.py --dataset davis
=====> Configure dataset and model
Namespace(batch_size=1, corp_size=(473, 473), cuda=True, data_dir='./DAVIS', data_list='./DAVIS/val_seq.txt', dataset='davis', gpus='0', ignore_label=255, img_mean=array([ 104.00698853, 116.66876984, 122.67891693], dtype=float32), input_size='473,473', maxEpoches=15, num_classes=2, restore_from='./co_attention.pth', sample_range=2, save_segimage=True, seg_save_dir='./result/test/davis_iteration_conf', seq_name='bmx-bumps', snapshot_dir='./snapshots/davis_iteration/', use_crf='True', vis_save_dir='./result/test/davis_vis')
======> test set size: 50
Traceback (most recent call last):
  File "test_coattention_conf.py", line 279, in <module>
    main()
  File "test_coattention_conf.py", line 168, in main
    for index, batch in enumerate(testloader):
  File "D:\Program Files\anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 615, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "D:\Program Files\anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 615, in <listcomp>
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "F:\jetbrains\pycharm\COSNet\dataloaders\PairwiseImg_test.py", line 119, in __getitem__
    my_index = self.Index[seq_name1]
KeyError: '480p'

I got a KeyError. I am using the DAVIS dataset. What is val_seqs.txt supposed to contain?
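Judging from the dataloader call quoted in a later issue, os.listdir(os.path.join(db_root_dir, 'JPEGImages/480p/', seq.strip('\n'))), each line of val_seqs.txt most likely holds just a DAVIS sequence name (this is an inference, not confirmed by the authors), e.g.:

    blackswan
    bmx-trees
    camel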

When will you release the code?

Hi, this is very impressive work, congratulations!
The DAVIS16 test result reported in your paper is not publicly available on the DAVIS website. By the way, when will you release the code?

Some questions about training details.

Hi:
After reading your amazing work, I first downloaded your shared trained model parameters and used them to test on the DAVIS16 test set. I use the Python version of the segmentation metrics with the class number set to 2, taking J as the foreground mIoU; is that right? I got J=82.47%, slightly higher than the value reported in your paper.
Then I tried to replicate your training process. Following the training settings in the paper, I only got J=71.44% with the co-attention module (the whole COSNet) and J=70.78% without it (corresponding to Res_Deeplab in your shared code). I don't know what exactly went wrong during training; I hope you can give me some suggestions.
Here are my settings (most of them are the same as yours):

  • Pretrained Res_Deeplab on MSRA10K with batchsize=8, 60000 iterations, and learning rate 0.00025. (This pretrained model is indeed useful; without it the performance drops to 55.67% on the DAVIS16 test set.)
  • Data preprocessing: I just use your shared code 'PairwiseImg_test.py' for training. I noticed the dataset code does not convert the sample output to a PyTorch tensor explicitly, but relies on the default behavior of the PyTorch DataLoader.
  • Training process: I use the SGD optimizer with the initial lr=0.00025 from your paper, and the parameters of the layers after Res_Deeplab layer4 are updated with a 10x learning rate. For the loss I use the 2-class nn.CrossEntropyLoss in PyTorch. I did not use nn.BCELoss because its 'weight' argument weights each element of the input batch, which is not consistent with Eq. 11 in your paper (a class-balanced sketch follows this list). The number of training epochs is 100 as well.
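For reference, a class-balanced BCE of the kind Eq. 11 appears to describe can still be written with the per-element 'weight' argument by building the weight map from the ground truth. This is my own sketch; the paper's exact weighting may differ:

    import torch
    import torch.nn.functional as F

    def class_balanced_bce(pred, gt):
        # Weight each pixel by the frequency of the opposite class, so the
        # rarer class (usually the foreground) contributes more to the loss.
        pos_frac = gt.float().mean()                              # fraction of foreground pixels
        weight = torch.where(gt > 0.5, 1.0 - pos_frac, pos_frac)  # per-pixel weight map
        return F.binary_cross_entropy(pred, gt.float(), weight=weight)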

Following the settings above, I only got J=71.44% with COSNet, which is barely above the 71.3% your paper reports for DeepLabv3. I wonder which part of my training went wrong; my co-attention module does not seem to help. I also found that during training the loss fluctuates rather sharply, and J/mIoU does not improve with the epochs (the mIoU in the figure averages foreground and background).
[figure: training loss and J/mIoU curves]
This work is so impressive! I'm looking forward to your reply.
Thanks in advance.

How to train your deeplab pretrained model?

Hi,
I'm so excited that you released your training code and it's very helpful for me. Thank you for sharing.

I have a question about your pretrained DeepLabv3 model, which you call deeplab_davis_12_0.pth. I wonder how you trained it: did you train the Res_Deeplab in your code on the saliency data, on the DAVIS training set, or on both? Could you also give me the essential settings used to train this model, such as the dataset, learning rate, weight decay, and max epochs? I want to reproduce your pretrained model myself and use the settings as a reference for pretraining in my own task.

Thank you so much again!

Sincerely
shichao

CRF postprocessing code

Hi,
Thank you for your work! I have a question: is the CRF post-processing code not already included? In test_coattention_conf.py there is the following commented-out line: #import pydensecrf.densecrf as dcrf (line 32). Is it necessary to run the original PyDenseCRF code, as stated in the readme?
I look forward to your reply.

Best regards,

Monica Gruosso
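For reference, a typical DenseCRF refinement with pydensecrf looks roughly like the sketch below; the pairwise parameters are illustrative, not the repo's actual settings:

    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_softmax

    def crf_refine(img, fg_prob, iters=5):
        # img: HxWx3 uint8 frame; fg_prob: HxW foreground probability from the network.
        h, w = fg_prob.shape
        probs = np.stack([1.0 - fg_prob, fg_prob]).astype(np.float32)  # (2, H, W) class probabilities
        d = dcrf.DenseCRF2D(w, h, 2)
        d.setUnaryEnergy(unary_from_softmax(probs))
        d.addPairwiseGaussian(sxy=3, compat=3)  # location-only smoothness kernel
        d.addPairwiseBilateral(sxy=60, srgb=5, rgbim=np.ascontiguousarray(img), compat=5)  # appearance kernel
        q = np.array(d.inference(iters)).reshape(2, h, w)
        return q.argmax(axis=0).astype(np.uint8)  # refined binary mask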

About sample_range parameter

My friends and I could not observe any difference when varying --sample_range from 1 to 5 during testing. Does the released code omit some core code? Please check.

About J in paper

[figure: results table from the paper]
Is the J_mean=80.5 reported in the paper obtained with CRF post-processing?

What does train_aug.txt contain, and which datasets are used for training?

1) args.data_list = './dataset/list/VOC2012/train_aug.txt' — what does train_aug.txt contain?
2) Which datasets did you use for training? DAVIS, VOC2012, and Cityscapes all together? Also, which dataset is Cityscapes here; could you provide a download link? In the training instructions you wrote "Download all the training datasets, including MARA10K and DUT saliency datasets. Create a folder called images and put these two datasets into the folder." So training uses these two saliency datasets, MARA10K and DUT?

A question about the parameters

Hi, thanks for your code, but I am wondering about the parameters all_channel and all_dim in the class CoattentionModel: what do they mean and how should they be configured? Thank you.
class CoattentionModel(nn.Module):
    def __init__(self, block, layers, num_classes, all_channel=256, all_dim=32*32): #473./8=60

Where is the implementation of orthogonal regularization loss(Eq. 12) in your paper?

Hi, Xiankai,

Thanks very much for sharing your code!

In your paper, you mention adding orthogonal regularization to the loss function (Equation 12). However, in the training code I can only find an L1 loss between the prediction and the ground truth (details can be found here and here). Could you please explain how this L1 loss achieves the effect of orthogonal regularization?

Thanks very much for your time!!!

test setting

1. Keeping the code's original parameter settings (sample_range=2, without CRF), and since the output is not binary, I binarized output1 with a threshold of 0.5; the resulting score is 78.1 (using the official DAVIS16 evaluation code). With sample_range=5 the accuracy is still 78.1. Why is that?
2. In the test masks you released, the DAVIS16 results mix probability maps and binary maps. Why is that?

Test path problem

Why do I keep getting this error?
  File "test_coattention_conf.py", line 296, in <module>
    main()
  File "test_coattention_conf.py", line 170, in main
    db_test = db.PairwiseImg(train=False, inputRes=(473,473), db_root_dir=args.data_dir, transform=None, seq_name = None, sample_range = args.sample_range) #db_root_dir() --> '/path/to/DAVIS-2016' train path
  File "/home/xiaow/code/videosaliency/COSNet-master/dataloaders/PairwiseImg_test.py", line 82, in __init__
    images = np.sort(os.listdir(os.path.join(db_root_dir , 'JPEGImages/480p/', seq.strip('\n'))))
FileNotFoundError: [Errno 2] No such file or directory: '/JPEGImages/480p/blackswan/00000.jpg /Annotations/480p/blackswan/00000.png '
Any help appreciated.

Question about unsupervision

Hi, and thanks a lot for sharing your work. I am trying to train the network, and I have two questions:

  1. You use the datasets DUTS and MSRA10K to train the 'object detection' part, correct?

  2. And you do not need the annotations of the DAVIS dataset, right? I am confused about this code:

            if i_iter%3 ==0:
                pred1, pred2, pred3 = model(images, images)
                loss = 0.1*(loss_calc1(pred3, labels) + 0.8* loss_calc2(pred3, labels) )
                loss.backward()
                
            else:
                pred1, pred2, pred3 = model(target, search)
                loss = loss_calc1(pred1, target_gt) + 0.8* loss_calc2(pred1, target_gt) + loss_calc1(pred2, search_gt) + 0.8* loss_calc2(pred2, search_gt)#class_balanced_cross_entropy_loss(pred, labels, size_average=False)
                loss.backward()

If I understand correctly, the first part uses the DUTS + MSRA10K images and the second part the DAVIS images? But what are target_gt and search_gt? I thought the training was unsupervised.

Thanks for your help!

Segmentation result

Hi @carrierlxk,
I believe the output of the model has 2 tensors, each of shape (N_batch, 1_channel, H, W). In the code below

output_sum = output_sum + output[0].data[0,0].cpu().numpy() # result of the segmentation branch

only the prediction for the first input of the batch, data[0,0], is summed, so in the end only one mask is generated per batch even when there is more than one input.
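A sketch of what accumulating over the whole batch might look like (the (N, 1, H, W) shape is assumed from the description above, and output_sum would then need one accumulator per batch item):

    probs = output[0].data.cpu().numpy()   # assumed shape: (N_batch, 1, H, W)
    for n in range(probs.shape[0]):
        output_sum[n] += probs[n, 0]       # accumulate every batch item, not just the first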

A BIG final LOSS value

Hello, I would like to ask roughly what the final loss value should be. My loss on the video frames ends up hovering around 0.45.

Training for VOC2012

Could you please help with training the model on the VOC2012 dataset?
Is the function VOCDataSet() available?
If so, could you please share it?

Error after filling a val_seqs.txt

Hi,

I recently came across your work and was impressed by it. I want to run this code on my system but ran into an error, and I kindly request your help/guidance in resolving it.
I followed the instructions given in the repository very carefully, upon which I got the error:
FileNotFoundError: [Errno 2] No such file or directory: './DAVIS/val_seqs.txt'

Upon this I created a file ./DAVIS/val_seqs.txt in which I added the following address
/home/ubuntu/COSNet/DAVIS/JPEGImages/480p/bear/

After doing this, I encountered the following error:-
Traceback (most recent call last):
  File "test_coattention_conf.py", line 276, in <module>
    main()
  File "test_coattention_conf.py", line 165, in main
    for index, batch in enumerate(testloader):
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 615, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 615, in <listcomp>
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/home/ubuntu/COSNet/dataloaders/PairwiseImg_test.py", line 115, in __getitem__
    my_index = self.Index[seq_name1]
KeyError: 'bear'

About FBMS evaluation

Thank you very much for sharing your excellent work. I would like to ask two questions about the FBMS dataset. First, how do you deal with the multiple-object sequences in the test set; do you treat all the objects as foreground for segmentation? Second, did you use the official DAVIS evaluation code to evaluate on FBMS?

about the model

def CoattentionNet(num_classes=2):
    model = CoattentionModel(Bottleneck, [3, 4, 23, 3], num_classes-1)
    return model

In the above code, why "num_classes - 1" rather than num_classes?

Question about the trained co_attention.pth

I trained and obtained co_attention_davis_59.pth and replaced the provided co_attention.pth with it, and it errors out. Why? After opening both files and comparing them, I found the trained parameters have about 96 fewer rows, and many of the later parameters are zeros. What could be the reason?

Does BatchNorm make performance uncertain or even worse?

Hi,
According to your paper, the batch size during training is set to 8. I also checked your GPU: an Nvidia TITAN Xp with 12 GB of memory per device. In my own training I can only fit a batch of 2 (2 pairs, i.e. 4 images) per GPU, because the DeepLabv3 backbone needs a lot of GPU memory. If I set batchsize=8, I have to use 4 GPUs in a data-parallel way; however, for the BatchNorm layers this is equivalent to batchsize=2 (minibatch=2), since each GPU computes the normalization statistics over its own 2-sample tensor. In my experiment with batchsize=8 (minibatch=2) on 4 GPUs, my test result on DAVIS16-test is unpredictable (from ~68% to ~73%). My best result is only ~73%, which is much worse than your paper reports. I hope you can give me more advice. Thank you!!!
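A common mitigation for the tiny per-GPU batch (my suggestion, not something from the repo) is synchronized BatchNorm, so normalization statistics are computed over the full 8-image batch rather than 2 images per device. In recent PyTorch this requires torch.distributed / DistributedDataParallel; local_rank is assumed to come from the launcher:

    import torch

    # Convert every BatchNorm layer to SyncBatchNorm, then wrap the model in DDP.
    model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
    model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])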
