
xialipku / emanet

679 stars, 130 forks, 186 KB

The code for Expectation-Maximization Attention Networks for Semantic Segmentation (ICCV'2019 Oral)

Home Page: https://xialipku.github.io/publication/expectation-maximization-attention-networks-for-semantic-segmentation/

License: GNU General Public License v3.0

Python 99.35% Shell 0.65%

emanet's People

Contributors

xialipku

emanet's Issues

Why does EMANet suffer from vanishing/exploding gradients even though T_train (=3) is small?

Hello,

I would like to ask the authors why EMANet suffers from the vanishing/exploding gradients inherent in RNNs even though the EM iterations are unrolled for only a small number of steps (in this case 3). Vanilla RNNs with tanh non-linearities can typically work on sequences on the order of 100 time steps, and LSTMs can work on sequences on the order of 1000 time steps.

Since the mIoU peaks at a very small value of T_train, are vanishing/exploding gradients really the reason the mIoU deteriorates for higher values of T_train (> 3)? Have the authors by any chance printed the gradient norms of every layer to check for vanishing or exploding gradients?
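For what it's worth, checking this is straightforward: below is a minimal sketch (my own, not from this repo) that logs per-parameter gradient norms right after the backward pass.

def log_grad_norms(model):
    # Print the L2 norm of the gradient of every parameter tensor.
    for name, param in model.named_parameters():
        if param.grad is not None:
            print(f'{name}: {param.grad.norm(2).item():.3e}')

# Inside the training loop:
#   loss.backward()
#   log_grad_norms(net)   # vanishing/exploding shows up as tiny/huge norms
#   optimizer.step()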

Thank you in advance.

How long does it take to train EMANet with a ResNet-101 backbone?

Hello,

Thank you for publishing the code to your excellent work.
I was wondering how long it takes to train EMANet with a ResNet-101 backbone, for both the 256- and 512-channel variants. How many GPUs did you use to achieve this training time?

Thank you in advance :)

Fail to reproduce your result: EMANet(512) 80.05%?

Hi @XiaLiPKU ! This work is wonderful and thanks so much for releasing the code.

May I ask a question? I used your pretrained model to evaluate on the val set and got 80.50% mIoU with single-scale testing, but when I trained the model from scratch I could only get 79.44%, whereas it is supposed to be 80.05%.

I just followed your default settings (pretrained ResNet weights, batch size 16, 4 GPUs, 30k iterations, and so on...).

Are there any other special techniques you adopted to get this final model?

Looking forward to your reply!

RuntimeError: self.net.module.load_state_dict(obj['net'])

Using the pretrained model throws this error:

RuntimeError: Error(s) in loading state_dict for EMANet:
Missing key(s) in state_dict: "extractor.4.0.conv1.weight", "extractor.4.0.bn1.weight", "extractor.4.0.bn1.bias", "extractor.4.0.bn1.running_mean", "extractor.4.0.bn1.running_var", "extractor.4.0.conv2.weight", ......
Unexpected key(s) in state_dict: "layer1.0.0.conv1.weight", "layer1.0.0.bn1.weight", "layer1.0.0.b

Can someone help?
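One possible workaround, sketched below under the assumption that only the key names differ between the checkpoint and the model definition. The prefix pair is a guess from the error message ('layer1.0' vs. 'extractor.4') and must be verified against the model's own state_dict keys.

import torch

def load_with_remap(net, ckpt_path):
    # net: the EMANet model wrapped in DataParallel; ckpt_path: the checkpoint.
    obj = torch.load(ckpt_path, map_location='cpu')
    state = obj['net']
    # Hypothetical prefix remap; adjust after inspecting both key sets.
    remapped = {k.replace('layer1.0', 'extractor.4', 1): v
                for k, v in state.items()}
    # strict=False reports leftover mismatches instead of raising.
    result = net.module.load_state_dict(remapped, strict=False)
    print('still missing:', result.missing_keys)
    print('still unexpected:', result.unexpected_keys)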

Fail to reproduce your results on COCO-STUFF

Hi @XiaLiPKU! Your work is amazing, and I appreciate that you have released your code.
May I ask a question? Based on your code, I modified it to suit COCO-STUFF training, but I can only get 34.55% mIoU. I just followed your default settings, but on a single GPU (pretrained ResNet-101, batch size 3, 30k iterations, and so on...).

Looking forward to your reply!
Best wishes

EMA Normalization

In 6.2.1, it says that 'From the right part of Fig. 3, it is clear to see that LN is better than no normalization'. But in the right part of Fig. 3, it seems that no normalization is better than LN. Is the figure wrong?
[screenshot of the right part of Fig. 3]

The difference between the code and the original paper.

Hi, thank you for releasing the code for EMANet. I found a difference between the code and the paper, in the formulation of Equation 13. In the paper, the M step (basis reconstruction) is formulated as follows:
[equation image: Eq. 13 gives the M step as μ_k = (Σ_n z_nk x_n) / (Σ_n z_nk)]
However, in the code, the M step is formulated as:
mu = torch.bmm(x, z_)
Actually, mu = torch.bmm(x, z_) is a weighted summation of X, while Equation 13 in the paper is not a plain weighted summation of X. Is anything wrong in the paper?
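For context, here is a condensed sketch of the EM loop as described in the paper (my paraphrase, not a verbatim copy of this repo's code). The key point is the normalization of z over the pixel dimension before the second bmm.

import torch
import torch.nn.functional as F

B, C, N, K, stage_num = 2, 512, 4096, 64, 3
x = torch.randn(B, C, N)                       # flattened features (N = H*W)
mu = F.normalize(torch.randn(B, C, K), dim=1)  # K bases

for _ in range(stage_num):
    z = torch.bmm(x.transpose(1, 2), mu)           # (B, N, K)
    z = F.softmax(z, dim=2)                        # E step: responsibilities
    z_ = z / (1e-6 + z.sum(dim=1, keepdim=True))   # normalize over the N pixels
    mu = torch.bmm(x, z_)                          # M step
    mu = F.normalize(mu, dim=1)                    # l2-normalize each basis

Because each column of z_ sums to 1 over the pixel dimension, torch.bmm(x, z_) computes Σ_n z_nk x_n / Σ_n z_nk, which is exactly the weighted average of Equation 13. So if the repo normalizes z into z_ before the bmm, the code and the paper agree.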

Can't reproduce the ablation study results in Figs. 3 and 4

Hello,

I can't seem to reproduce the ablation study results in Figs. 3 and 4 of the ICCV paper. When training and evaluating with an iteration number of 3 (T_train = T_eval = 3), my final mIoU is 76.04%, which is 2.48% less than the result shown in Fig. 4 (78.52%).

I used the default settings in settings.py except the following:

  • N_LAYERS = 50 (experiment done on Resnet-50)
  • STRIDE = 16 (for training) and 8 (for evaluation) as stated in sec. 6.1
  • BATCH_SIZE = 12
  • DEVICE = 0
  • DEVICES = list(range(0, 1))
  • NUM_WORKERS = 12

Furthermore, my Pillow version is 6.1.0 and my cv2 version is 3.4.2, unlike the version used by the authors.

Is it possible that using a single GPU to train EMANet results in such a significant decrease in mIoU (possibly due to the use of synchronized batch norm?), or could using a different version of Pillow / cv2 be the root cause of this problem?

Thanks in advance :)

Segmentation edges are jagged

When running EMANet on the VOC dataset and on my own dataset, the final segmentation edges show obvious jagged artifacts. Are your results like this too?

The training process was stopped

Dear XiaLiPKU,
I cloned your code and followed the steps to train on my server, but there were just these three lines:

2020-03-13 21:29:17,166 - INFO - set log dir as ./logdir
2020-03-13 21:29:17,166 - INFO - set model dir as ./models
2020-03-13 21:29:19,127 - ERROR - No checkpoint ./models/latest.pth!

I know that error will not affect the training process, but no models were saved in ./models, and when I ran "sh tensorboard.sh" there was nothing. It seems the training process stopped. I just replaced obj.cuda(async=True) with obj.cuda(non_blocking=True); other than that I didn't change any code. Could you help me?

Thanks!

Questions about parameters and FLOPs in Tab. 1.

You note "All results are achieved with the backbone ResNet-101 with output stride 8". Why, then, are the parameters and FLOPs of EMANet substantially smaller than those of the backbone (ResNet-101)? Taking EMANet512 as an example, it contains 10M parameters and 43.1G FLOPs, while the backbone (ResNet-101) alone contains 42.6M parameters and 190.6G FLOPs. Is there an error here?

Question about the BN layer

I am puzzled by the BN layer. In your code, you did not use torch.nn.BatchNorm2d. What is the difference between torch.nn.BatchNorm2d and SynchronizedBatchNorm2d?

Can't train the model?

(base) davis@davis-MS-7B17:~/Network/EMANet-master$ python train.py
2019-08-31 13:50:14,703 - INFO - set log dir as ./logdir
2019-08-31 13:50:14,703 - INFO - set model dir as ./models
2019-08-31 13:50:17,131 - ERROR - No checkpoint ./models/latest.pth!

The training stops at this point, so I have to interrupt it with a KeyboardInterrupt...
Does anybody know how to solve it?

About the BN!!!

I added another block to replace the EMAU, but I get some warnings. I guess the bn_lib you used is not suitable for my block.

2020-07-30 20:05:26,727 - INFO - step: 1 loss: 2.429 lr: 0.009

WARNING batched routines are designed for small sizes. It might be better to use the
Native/Hybrid classical routines if you want good performance.

[the warning above is printed four times between the two steps]

2020-07-30 20:05:29,586 - INFO - step: 2 loss: 2.398 lr: 0.009

ResNet18 pretrained model

Hi,

Thanks for providing the pre-trained ResNet50 and ResNet101 models.
Do you have a pretrained ResNet-18 model in which the first 7x7 conv is replaced with three 3x3 convs?
I have searched for it for a long time but unfortunately couldn't find it. If you have saved this model, could you please share it with me?
Many thanks in advance.

FLOPs

How do you compute the FLOPs and parameters of your EMA module?
Could you please share the computation details? Thanks!
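For a rough back-of-the-envelope count (my own estimate, not the authors' method), the EMA module's cost is dominated by the two batched matmuls per EM iteration:

def ema_flops(n, c, k, t):
    # n: number of pixels (H*W), c: channels, k: bases, t: EM iterations.
    e_step = n * c * k        # Z = X^T mu
    m_step = n * c * k        # mu = X Z_
    macs = t * (e_step + m_step)
    return 2 * macs           # 1 MAC = 1 multiply + 1 add

# Example: 1/8-resolution map of a 513x513 input, 512 channels, 64 bases, T=3.
print(ema_flops(65 * 65, 512, 64, 3) / 1e9, 'GFLOPs')

This ignores the 1x1 convolutions around the EMAU and the softmax/normalization, so it is a lower bound on the module's real cost.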

No latest.pth!

Hi, thanks for the great repo. I cannot get latest.pth during training. What should I do?
error:
ERROR - No checkpoint ./models/latest.pth!
Thank you.

selection of K

In my opinion, besides T, the selection of K is also important (as in GMM or k-means). I didn't see any ablation study on the effect of different values of K; did you run any such experiments?

Intuitively, I have the impression that mu represents different features for different classes, so the first K I would try is the number of classes (e.g. 19 for Cityscapes). Can you explain how you decide to use K=64?

As the visualization of the responsibilities shows, different z's tend to represent different classes, so won't having K > the number of classes make some z's actually close to each other, eventually making them redundant?

Thanks.

Why use padding for ValDataset?

Hey, I found that ValDataset uses padding for both the image and the label.

#image, label = pad_inf(image, label)

def pad_inf(image, label=None):
    # Pad on the bottom/right so that the padded height and width satisfy
    # size % stride == 1, i.e. (size - 1) is a multiple of the output stride.
    h, w = image.size()[-2:]
    stride = settings.STRIDE
    pad_h = (stride + 1 - h % stride) % stride
    pad_w = (stride + 1 - w % stride) % stride
    if pad_h > 0 or pad_w > 0:
        # Images are padded with zeros; padded label pixels are marked
        # IGNORE_LABEL so they are excluded from the loss and the metric.
        image = F.pad(image, (0, pad_w, 0, pad_h), mode='constant', value=0.)
        if label is not None:
            label = F.pad(label, (0, pad_w, 0, pad_h), mode='constant',
                          value=settings.IGNORE_LABEL)
    return image, label

Can you explain the reason for doing so?
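For reference (my own arithmetic, not from the repo): the formula pads so that the padded size satisfies size % STRIDE == 1. With STRIDE = 16 and a 375x500 VOC image, pad_h = (17 - 375 % 16) % 16 = 10 and pad_w = (17 - 500 % 16) % 16 = 13, giving a 385x513 input; both 385 % 16 and 513 % 16 equal 1, which matches the usual (size - 1) / stride + 1 output-size convention.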

About VOC12

Hi, I submitted the results on the val set and on the test set to the official evaluation server, and the two results differ by four points. How can I reduce this gap?

Could it run on one GPU?

(base) pf@pf-System-Product-Name:~/EMANet$ python train.py
2019-12-06 21:37:49,527 - INFO - set log dir as ./logdir
2019-12-06 21:37:49,528 - INFO - set model dir as ./models
Traceback (most recent call last):
  File "train.py", line 181, in <module>
    main()
  File "train.py", line 146, in main
    sess = Session(dt_split='trainaug')
  File "train.py", line 93, in __init__
    self.net = DataParallel(self.net, device_ids=settings.DEVICES)
  File "/home/pf/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 131, in __init__
    _check_balance(self.device_ids)
  File "/home/pf/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 18, in _check_balance
    dev_props = [torch.cuda.get_device_properties(i) for i in device_ids]
  File "/home/pf/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 18, in <listcomp>
    dev_props = [torch.cuda.get_device_properties(i) for i in device_ids]
  File "/home/pf/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 301, in get_device_properties
    raise AssertionError("Invalid device id")
AssertionError: Invalid device id

Is this a bug?

In train.py, lines 134 and 135 read:

self.net.module.ema.mu *= momentum
self.net.module.ema.mu += mu * (1 - momentum)

Maybe they should be:

self.net.module.emau.mu *= momentum
self.net.module.emau.mu += mu * (1 - momentum)

I had some trouble, could you help me?

Thanks for your reply!!!
Following the format of your ground truth, I made the ground truth for my dataset. But during training there was a problem, which I've pasted below. Emmmm, can you help me? Maybe my dataset is too messy and the object boundaries are not obvious. What advice would you offer?

RuntimeError: CUDA error: an illegal memory access was encountered
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: an illegal memory access was encountered (insert_events at /pytorch/c10/cuda/CUDACachingAllocator.cpp:564)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f5345247441 in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f5345246d7a in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: + 0x13652 (0x7f534261a652 in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10::TensorImpl::release_resources() + 0x50 (0x7f5345237ce0 in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #4: + 0x30facb (0x7f52f071aacb in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #5: + 0x376d60 (0x7f52f0781d60 in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #6: + 0x3128ea (0x7f52f071d8ea in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #7: torch::autograd::deleteFunction(torch::autograd::Function*) + 0xa2 (0x7f52f071d9a2 in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #8: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0xa2 (0x7f5330b81bb2 in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #9: + 0x14216b (0x7f5330ba516b in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #10: + 0x1421d9 (0x7f5330ba51d9 in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #11: torch::autograd::Variable::Impl::release_resources() + 0x1b (0x7f52f0d5708b in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #12: + 0x1420bb (0x7f5330ba50bb in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #13: + 0x3c30f4 (0x7f5330e260f4 in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #14: + 0x3c3141 (0x7f5330e26141 in /home/r/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #15: + 0x19aa5e (0x55791a64ba5e in /home/r/.conda/envs/pytorch/bin/python3)
frame #16: + 0xf1b77 (0x55791a5a2b77 in /home/r/.conda/envs/pytorch/bin/python3)
frame #17: + 0xf1a07 (0x55791a5a2a07 in /home/r/.conda/envs/pytorch/bin/python3)
frame #18: + 0xf1a1d (0x55791a5a2a1d in /home/r/.conda/envs/pytorch/bin/python3)
frame #19: + 0xf1a1d (0x55791a5a2a1d in /home/r/.conda/envs/pytorch/bin/python3)
frame #20: PyDict_SetItem + 0x3da (0x55791a5e963a in /home/r/.conda/envs/pytorch/bin/python3)
frame #21: PyDict_SetItemString + 0x4f (0x55791a5f065f in /home/r/.conda/envs/pytorch/bin/python3)
frame #22: PyImport_Cleanup + 0x99 (0x55791a655d89 in /home/r/.conda/envs/pytorch/bin/python3)
frame #23: Py_FinalizeEx + 0x61 (0x55791a6c0231 in /home/r/.conda/envs/pytorch/bin/python3)
frame #24: Py_Main + 0x35e (0x55791a6ca57e in /home/r/.conda/envs/pytorch/bin/python3)
frame #25: main + 0xee (0x55791a59488e in /home/r/.conda/envs/pytorch/bin/python3)
frame #26: __libc_start_main + 0xf0 (0x7f5348fdd830 in /lib/x86_64-linux-gnu/libc.so.6)
frame #27: + 0x1c3160 (0x55791a674160 in /home/r/.conda/envs/pytorch/bin/python3)

Output stride

Hi, first of all thanks for your paper.
You mention that for some nets the stride is 16 while for others it is 8. However, there is nothing on how you recover the output back to the original size. Do you use bilinear upsampling? If so, don't you have problems with borders and fine structures when using such a steep upsampling factor?
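For reference, the usual way to recover the full resolution in PyTorch (an assumption about common practice, not a claim about this repo's exact code) is bilinear interpolation of the logits:

import torch
import torch.nn.functional as F

logits = torch.randn(1, 21, 65, 65)   # e.g. stride-8 output for a 513x513 crop
up = F.interpolate(logits, size=(513, 513), mode='bilinear', align_corners=True)
pred = up.argmax(dim=1)               # (1, 513, 513) per-pixel class indices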
