shrubb / box-convolutions

PyTorch code for the "Deep Neural Networks with Box Convolutions" paper

License: Apache License 2.0

Python 19.11% C++ 46.87% Cuda 34.02%

box-convolutions's Issues

Code for paper

Hi,

May I know when you will release the code used in your paper? I mean the example code for semantic segmentation. I am looking forward to trying it.

CPU support?

Thanks for the code.

I am opening this issue because installation fails in a CPU-only environment. I just want to check the input and output with very low-dimensional tensors and see how the model works. Is it possible to get CPU support for this?

Here is the complete traceback:

(Torch) C:\Users\as\Documents\Workspace\Experiments\BoxCon\box-convolutions>python -m pip install .
Processing c:\users\as\documents\workspace\experiments\boxcon\box-convolutions
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\AKASHS~1\AppData\Local\Temp\pip-req-build-amptjpk7\setup.py", line 2, in <module>
    import torch.utils.cpp_extension
  File "C:\Users\as\Anaconda3\envs\Torch\lib\site-packages\torch\utils\cpp_extension.py", line 61, in <module>
    CUDA_HOME = _find_cuda_home()
  File "C:\Users\as\Anaconda3\envs\Torch\lib\site-packages\torch\utils\cpp_extension.py", line 30, in _find_cuda_home
    if not os.path.exists(cuda_home):
  File "C:\Users\as\Anaconda3\envs\Torch\lib\genericpath.py", line 19, in exists
    os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not list

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in C:\Users\AKASHS~1\AppData\Local\Temp\pip-req-build-amptjpk7\

about box conv func

In box_convolution_function.py, the function just returns input ** 2, so I cannot run the test correctly.
Is the current version executable, or does box_convolution.cpp still need to be updated?

Implementation in VGG

Hey,

I am trying to implement box convolution for HED (Holistically-Nested Edge Detection) which uses VGG architecture. Here's the architecture with box convolution layer:

class HED(nn.Module):
    def __init__(self):
        super(HED, self).__init__()

        # conv1
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1),
            BoxConv2d(1, 64, 5, 5),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1),
            #BoxConv2d(1, 64, 28, 28),
            nn.ReLU(inplace=True),
        )

        # conv2
        self.conv2 = nn.Sequential(
            nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/2
            nn.Conv2d(64, 128, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, 3, padding=1),
            nn.ReLU(inplace=True),
        )

        # conv3
        self.conv3 = nn.Sequential(
            nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/4
            nn.Conv2d(128, 256, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1),
            nn.ReLU(inplace=True),
        )

        # conv4
        self.conv4 = nn.Sequential(
            nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/8
            nn.Conv2d(256, 512, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, padding=1),
            nn.ReLU(inplace=True),
        )

        # conv5
        self.conv5 = nn.Sequential(
            nn.MaxPool2d(2, stride=2, ceil_mode=True),  # 1/16
            nn.Conv2d(512, 512, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(512, 512, 3, padding=1),
            nn.ReLU(inplace=True),
        )

        self.dsn1 = nn.Conv2d(64, 1, 1)
        self.dsn2 = nn.Conv2d(128, 1, 1)
        self.dsn3 = nn.Conv2d(256, 1, 1)
        self.dsn4 = nn.Conv2d(512, 1, 1)
        self.dsn5 = nn.Conv2d(512, 1, 1)
        self.fuse = nn.Conv2d(5, 1, 1)

    def forward(self, x):
        h = x.size(2)
        w = x.size(3)

        conv1 = self.conv1(x)
        conv2 = self.conv2(conv1)
        conv3 = self.conv3(conv2)
        conv4 = self.conv4(conv3)
        conv5 = self.conv5(conv4)

        ## side output
        d1 = self.dsn1(conv1)
        d2 = F.upsample_bilinear(self.dsn2(conv2), size=(h,w))
        d3 = F.upsample_bilinear(self.dsn3(conv3), size=(h,w))
        d4 = F.upsample_bilinear(self.dsn4(conv4), size=(h,w))
        d5 = F.upsample_bilinear(self.dsn5(conv5), size=(h,w))

        # dsn fusion output
        fuse = self.fuse(torch.cat((d1, d2, d3, d4, d5), 1))

        d1 = F.sigmoid(d1)
        d2 = F.sigmoid(d2)
        d3 = F.sigmoid(d3)
        d4 = F.sigmoid(d4)
        d5 = F.sigmoid(d5)
        fuse = F.sigmoid(fuse)

        return d1, d2, d3, d4, d5, fuse

I get the following error:
RuntimeError: BoxConv2d: all parameters must have as many rows as there are input channels (box_convolution_forward at src/box_convolution_interface.cpp:30)

Can you help me with this?
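
A likely cause, judging only from the error text and from the keyword form of the constructor used elsewhere on this page (`BoxConv2d(in_planes=..., num_filters=..., max_input_h=..., max_input_w=...)`): the first argument is the number of input channels, and here the layer receives 64 channels from the preceding `nn.Conv2d(3, 64, ...)` but was built with `in_planes=1`. The sketch below is plain-Python bookkeeping, not the library's code. `box_conv_channels` is a hypothetical helper, and the rule that the output has `in_planes * num_filters` channels is an assumption taken from the "N input filters, M box convolutions, NM output filters" description in a later issue.

```python
def box_conv_channels(incoming_channels, in_planes, num_filters):
    """Return the output channel count of a box convolution layer.

    Hypothetical helper: mirrors the check reported in the error message
    (in_planes must equal the number of incoming channels) and assumes each
    input plane yields num_filters output planes.
    """
    if in_planes != incoming_channels:
        raise ValueError(
            f"layer built for {in_planes} input channels, got {incoming_channels}"
        )
    return in_planes * num_filters


# The layer from the issue: 64 channels arrive, but in_planes is 1.
try:
    box_conv_channels(incoming_channels=64, in_planes=1, num_filters=64)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

# One consistent alternative: 64 input planes, 1 box per plane -> 64 channels,
# which still matches the following nn.ReLU / nn.Conv2d(64, 64, ...).
fixed_out = box_conv_channels(incoming_channels=64, in_planes=64, num_filters=1)
print(mismatch_detected, fixed_out)
```

Under these assumptions the layer in `conv1` would be built with `in_planes=64` and `num_filters=1`, with the last two arguments set to the maximum input height and width, as their keyword names suggest.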

Import Error

The build succeeded with Ubuntu 16.04, CUDA 10 and GCC 7.4.
But importing the package fails:

In [1]: import torch

In [2]: from box_convolution import BoxConv2d

ImportError                               Traceback (most recent call last)
<ipython-input-2-2424917dbf01> in <module>()
----> 1 from box_convolution import BoxConv2d

~/Software/pkgs/box-convolutions/box_convolution/__init__.py in <module>()
----> 1 from .box_convolution_module import BoxConv2d

~/Software/pkgs/box-convolutions/box_convolution/box_convolution_module.py in <module>()
      2 import random
      3 
----> 4 from .box_convolution_function import BoxConvolutionFunction, reparametrize
      5 import box_convolution_cpp_cuda as cpp_cuda
      6 

~/Software/pkgs/box-convolutions/box_convolution/box_convolution_function.py in <module>()
      1 import torch
      2 
----> 3 import box_convolution_cpp_cuda as cpp_cuda
      4 
      5 def reparametrize(

ImportError: /usr/Software/anaconda3/lib/python3.6/site-packages/box_convolution_cpp_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: __cudaPopCallConfiguration

@shrubb

BoxConv1D

Hi, great work. Thanks for sharing.

Could you please give me some hints for modifying it into a BoxConv1D?

how box convolution works

Hi,

This is nice work.
In the first figure on your poster, you compare the 3x3 convolution layer with your box convolution layer. It is not clear to me how the box convolution works. Is it right that for each position (p,q) on the image, you use a box filter with a position x, y relative to (p,q) and a size w, h to calculate the value for (p,q) in the output? And that you learn the 4 parameters x, y, w, h for each box filter? For example, in the figure, the value at the red anchor pixel position in the output should be the sum of the values in the box. Is that correct? Thanks.
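
The reading described above (each output pixel is the sum over a box placed relative to that pixel) can be sketched with a summed-area table. This is a minimal pure-Python illustration, not the repository's implementation: integer box edges and zero padding outside the image are assumptions of the sketch, and the names `integral_image`, `box_sum` and `box_filter` are chosen here for illustration. The stack traces elsewhere on this page show the real layer calling `cpp_cuda.integral_image(input)` and passing `x_min, x_max, y_min, y_max` together with `normalize` and `exact` flags, none of which this sketch models.

```python
def integral_image(img):
    """Summed-area table with a zero first row and column.

    ii[r][c] holds the sum of all img values in rows < r and columns < c.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of rows top..bottom-1, cols left..right-1, clipped to the image."""
    h, w = len(ii) - 1, len(ii[0]) - 1
    top, bottom = max(0, min(top, h)), max(0, min(bottom, h))
    left, right = max(0, min(left, w)), max(0, min(right, w))
    if bottom <= top or right <= left:
        return 0
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def box_filter(img, y_min, y_max, x_min, x_max):
    """Output (p, q) = sum over rows p+y_min..p+y_max, cols q+x_min..q+x_max."""
    ii = integral_image(img)
    return [
        [box_sum(ii, p + y_min, q + x_min, p + y_max + 1, q + x_max + 1)
         for q in range(len(img[0]))]
        for p in range(len(img))
    ]


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
# A 3x3 box centred on the anchor pixel: offsets -1..1 in both directions.
out = box_filter(img, -1, 1, -1, 1)
print(out[1][1])  # 45, the sum of the whole image
print(out[0][0])  # 12 = 1 + 2 + 4 + 5 (the rest of the box is outside)
```

The point of the table is that each box sum costs four lookups regardless of the box size, so large boxes are no more expensive than small ones.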

Multi-GPU Training: distributed error encountered

I am using https://github.com/facebookresearch/maskrcnn-benchmark for object detection and want to use box convolutions. When I add a box convolution after some layer, training with 1 GPU works, but training with multiple GPUs in distributed mode fails. The error is very similar to this issue. I do not know how to fix it; do you have any ideas? @shrubb

2019-02-18 16:09:15,187 maskrcnn_benchmark.trainer INFO: Start training
Traceback (most recent call last):
  File "tools/train_net.py", line 172, in <module>
    main()
  File "tools/train_net.py", line 165, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 74, in train
    arguments,
  File "/srv/data0/hzxubinbin/projects/maskrcnn_benchmark/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 79, in do_train
    losses.backward()
  File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
  File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 445, in distributed_data_parallel_hook
    self._queue_reduction(bucket_idx)
  File "/home/hzxubinbin/anaconda3.1812/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 475, in _queue_reduction
    self.device_ids)
TypeError: _queue_reduction(): incompatible function arguments. The following argument types are supported:
    1. (process_group: torch.distributed.ProcessGroup, grads_batch: List[List[at::Tensor]], devices: List[int]) -> Tuple[torch.distributed.Work, at::Tensor]

Invoked with: <torch.distributed.ProcessGroupNCCL object at 0x7f0d95248148>, [[tensor([[[[0.]],

Training on 1 GPU is too slow, so I want to use multiple GPUs.

Did you try your BoxConv implementation on U-Net (ResNet encoder)?

See the title.
How should the max height and max width be set?
Single-process box gen takes too long.

Getting a cuda runtime error (9) : invalid configuration argument at src/box_convolution_cuda_forward.cu:250

Hello,

I've been trying to implement your box convolution layer on a ResNet model by just substituting your BottleneckBoxConv layers for a typical ResNet Bottleneck layer.

I was getting this error

THCudaCheck FAIL file=src/box_convolution_cuda_forward.cu line=250 error=9 : invalid configuration argument
Traceback (most recent call last):
  File "half_box_train.py", line 178, in <module>
    main()
  File "half_box_train.py", line 107, in main
    scores = res_net(x)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dkang/Project/cs231n_project_box_convolution/models/HalfBoxResNet.py", line 331, in forward
    x = self.layer3(x)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dkang/Project/cs231n_project_box_convolution/models/HalfBoxResNet.py", line 66, in forward
    return F.relu(x + self.main_branch(x), inplace=True)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/box_convolution/box_convolution_module.py", line 222, in forward
    self.reparametrization_h, self.reparametrization_w, self.normalize, self.exact)
  File "/opt/anaconda3/lib/python3.7/site-packages/box_convolution/box_convolution_function.py", line 46, in forward
    input_integrated, x_min, x_max, y_min, y_max, normalize, exact)
RuntimeError: cuda runtime error (9) : invalid configuration argument at src/box_convolution_cuda_forward.cu:250

Thanks so much!

Test script failed

Test script assertion failed:

Random seed is 1546545757
Testing for device 'cpu'
Running test_integral_image()...
100%|| 50/50 [00:00<00:00, 1491.15it/s]
OK
Running test_box_convolution_module()...
0%|
python3: /pytorch/third_party/ideep/mkl-dnn/src/cpu/jit_avx2_conv_kernel_f32.cpp:567: static mkldnn::impl::status_t mkldnn::impl::cpu::jit_avx2_conv_fwd_kernel_f32::init_conf(mkldnn::impl::cpu::jit_conv_conf_t&, const convolution_desc_t&, const mkldnn::impl::memory_desc_wrapper&, const mkldnn::impl::memory_desc_wrapper&, const mkldnn::impl::memory_desc_wrapper&, const primitive_attr_t&): Assertion `jcp.ur_w * (jcp.nb_oc_blocking + 1) <= num_avail_regs' failed.
Aborted (core dumped)

Configuration: Ubuntu 16.04 LTS, CUDA 9.2, PyTorch 1.1.0, GCC 5.4.0.

Compared to STN

How are box convolutions different from Spatial Transformer Networks? Aren't those also a way of generating boxes of interest?

Supplemental zip for the paper

Hi,
The NIPS archive page for the paper gives a 404 for the supplemental material link.

Can you please provide the zip?

About the speed of box convolution

In the paper, you replace every second block in ENet with your proposed box convolution block. However, I find that without CUDA the box block runs at almost the same speed as the original block, whereas with CUDA the box block is much slower than the original block. Is that expected, and if so, why?
The original ENet block is from PyTorch-ENet. I define your proposed block as follows:

class BoxBlock(nn.Module):
    """
    The block architecture that is used to embed box convolution into Enet.
    """

    def __init__(self, channels, input_size=(512, 512), dropout_prob=0):
        super(BoxBlock, self).__init__()
        self.channels = channels

        self.conv1x1 = nn.Sequential(nn.Conv2d(self.channels, self.channels // 4, kernel_size=1, bias=False),
                                     nn.BatchNorm2d(self.channels // 4),
                                     nn.ReLU(inplace=True))

        w, h = input_size
        self.box_conv = nn.Sequential(BoxConv2d(in_planes=self.channels // 4,
                                                num_filters=4, max_input_h=h, max_input_w=w),
                                      nn.BatchNorm2d(self.channels),
                                      nn.Dropout2d(p=dropout_prob))

        self.relu = nn.ReLU(inplace=True)

    def forward(self, input):
        x = self.conv1x1(input)
        x = self.box_conv(x)
        x = x + input
        return self.relu(x)

I use the following test code.

    cuda = False

    input = torch.randn(1, 128, 64, 64)
    if cuda:
        input = input.cuda()

    ori_block = RegularBottleneck(128, dilation=8, padding=8, dropout_prob=0.1, relu=True)
    box_block = BoxBlock(channels=128, input_size=(64, 64), dropout_prob=0.25)

    if cuda:
        ori_block = ori_block.cuda()
        box_block = box_block.cuda()

    end = time.time()
    y = ori_block(input)
    end = time.time() - end

    if cuda:
        print('if cuda, ori block run time = ', end)
    else:
        print('if no cuda, ori block run time = ', end)

    end = time.time()
    y = box_block(input)
    end = time.time() - end

    if cuda:
        print('if cuda, box block run time = ', end)
    else:
        print('if no cuda, box block run time = ', end)

The test result is:
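
A caveat about the measurement in the script above: CUDA kernels are launched asynchronously, so reading `time.time()` right after a forward pass on the GPU mostly measures launch overhead, and the first call also pays one-off initialisation costs. The helper below is a hedged sketch of a fairer measurement, with warm-up iterations and an explicit synchronisation point; `time_call` and its parameters are names chosen here for illustration, not part of any library. It does not by itself explain any speed gap between the two blocks.

```python
import time


def time_call(fn, sync=None, warmup=3, repeats=10):
    """Average wall-clock seconds per call of fn().

    sync is an optional zero-argument callable that blocks until queued work
    has finished; for CUDA tensors that would be torch.cuda.synchronize.
    """
    for _ in range(warmup):      # absorb one-off costs (lazy init, caches)
        fn()
    if sync is not None:
        sync()                   # make sure warm-up work is done before timing
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync is not None:
        sync()                   # wait for the timed work to actually finish
    return (time.perf_counter() - start) / repeats


# Stand-in workload so the sketch runs without a GPU.
elapsed = time_call(lambda: sum(range(10_000)))
print(f"{elapsed:.6f} s per call")
```

With the blocks from the script, the call would look like `time_call(lambda: box_block(input), sync=torch.cuda.synchronize if cuda else None)`, and likewise for `ori_block`.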

Build Problem Windows 10 CUDA10.1 Python Bindings?

Hi,
I'm trying to compile the box-convolutions using Windows 10 with CUDA 10.1.
This results in the following error:

\python\python36\lib\site-packages\torch\lib\include\pybind11\cast.h(1439): error: expression must be a pointer to a complete object type

  1 error detected in the compilation of "C:/Users/CHRIST~1/AppData/Local/Temp/tmpxft_000010ec_00000000-8_integral_image_cuda.cpp4.ii".
  integral_image_cuda.cu
  error: command 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.1\\bin\\nvcc.exe' failed with exit status 2

  ----------------------------------------
Failed building wheel for box-convolution
Running setup.py clean for box-convolution
Failed to build box-convolution

Any ideas? Thanks in advance

tensorflow version

Hello!
Impressive code for an impressive paper!
Where can I find tensorflow bindings?
Or how can I create them?

Doesn't import

Hello!

After installing with g++-6, the import fails with the following:

      1 import torch
      2 
----> 3 import box_convolution_cpp_cuda as cpp_cuda
      4 
      5 def reparametrize(

ImportError: /home/bakirillov/anaconda3/envs/lapki/lib/python3.7/site-packages/box_convolution_cpp_cuda.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs

Could you please look into it?

How can I run Cityscapes example on a test set?

Hello, colleagues! I've trained BoxERFNet, and now I want to run this model on the test set to evaluate it.
I checked the source code (train.py) and put 'test' in place of the original value on line 80. But this failed: the evaluated metrics were incorrect (e.g. 0.0 and 0.0). Can you explain what I need to do to evaluate the model on the test set?
I guess the problem is in the 'validate' function (line 241), because the tensors passed to confusion_matrix_update (line 268) are really different for the test and val sets.

Error in Test MNIST

I tried to run test-mnist.py but got the following error. I am running the code in Colab. It would be great if you made a Colab notebook with an example 😄

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-263240bbee7e> in <module>()
----> 1 main()

<ipython-input-18-70ccbffb2860> in main()
     29 
     30     model = Net().to(device)
---> 31     optimizer = optim.Adam(model.parameters(), lr=1e-3)
     32 
     33     for epoch in range(1, n_epochs + 1):

/usr/local/lib/python3.6/dist-packages/torch/optim/adam.py in __init__(self, params, lr, betas, eps, weight_decay, amsgrad)
     40         defaults = dict(lr=lr, betas=betas, eps=eps,
     41                         weight_decay=weight_decay, amsgrad=amsgrad)
---> 42         super(Adam, self).__init__(params, defaults)
     43 
     44     def __setstate__(self, state):

/usr/local/lib/python3.6/dist-packages/torch/optim/optimizer.py in __init__(self, params, defaults)
     43         param_groups = list(params)
     44         if len(param_groups) == 0:
---> 45             raise ValueError("optimizer got an empty parameter list")
     46         if not isinstance(param_groups[0], dict):
     47             param_groups = [{'params': param_groups}]

ValueError: optimizer got an empty parameter list

YOLO architecture

Hi,

I want to know if there's some way I can create an architecture that will work with YOLO. I've read a lot of PyTorch implementations, but I don't know how I should modify the cfg file so that I can add a box convolution layer.

Let me know.

Pretrained models

Thanks for sharing the code!
Would it be possible to share the pretrained BoxOnlyENet models?

Speed and Efficiency of Depthwise separable operation?

As far as modern libraries are concerned, there is not much support for depth-wise separable operations, i.e. we cannot write custom operations that can be done depthwise. Only convolutions are supported.

How did you apply M box convolutions to each of the N input filters to generate NM output filters?
How is this different from using a for loop over the N input filters, applying M box convolutions to each one, and concatenating all the results?
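
The loop formulation in the question can be written down directly as a reference for the intended semantics. The sketch below is plain Python with brute-force sums; `box_sum_at` and `box_conv_reference` are hypothetical names, integer box offsets and zero padding are simplifications, and the plane-major ordering of the output channels is an assumption rather than something this page confirms. A fused kernel would presumably compute the same values in one launch instead of N·M small operations plus a concatenation, but that is a guess about the motivation.

```python
def box_sum_at(plane, p, q, box):
    """Brute-force sum of plane over the box placed at anchor (p, q).

    box = (y_min, y_max, x_min, x_max) are inclusive integer offsets;
    positions outside the plane count as zero.
    """
    y_min, y_max, x_min, x_max = box
    h, w = len(plane), len(plane[0])
    total = 0
    for r in range(max(0, p + y_min), min(h, p + y_max + 1)):
        for c in range(max(0, q + x_min), min(w, q + x_max + 1)):
            total += plane[r][c]
    return total


def box_conv_reference(planes, boxes):
    """planes: N input planes; boxes[n]: the M boxes for plane n.

    Returns N * M output planes, ordered plane-major (an assumption).
    """
    out = []
    for plane, plane_boxes in zip(planes, boxes):
        for box in plane_boxes:
            out.append([
                [box_sum_at(plane, p, q, box) for q in range(len(plane[0]))]
                for p in range(len(plane))
            ])
    return out


planes = [[[1, 1], [1, 1]], [[2, 2], [2, 2]]]   # N = 2
boxes = [[(0, 0, 0, 0), (-1, 1, -1, 1)]] * 2     # M = 2 boxes per plane
out = box_conv_reference(planes, boxes)
print(len(out))      # 4 output planes (N * M)
print(out[1][0][0])  # 4: the 3x3 box covers the whole 2x2 plane of ones
print(out[2][1][1])  # 2: the 1x1 box on the plane of twos
```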

[Feature Request] Upgrading to PyTorch 1.4

Hi @shrubb,

Do you have any plans to update the codebase for PyTorch 1.4 changes? If not, could you please hint at what your ideas are on how to go about this?

Installing with torch==1.4.0 fails, unfortunately.

Thanks for your awesome work!

Cityscapes dataset?

Hello,

I was wondering where I could download the Cityscapes dataset to run the example with? Thanks

Build problem!

Hi! I can't compile the package; please see the log: https://drive.google.com/open?id=1U_0axWSgQGsvvdMWv5FclS1hHHihqx9M

Command "/home/alex/anaconda3/bin/python -u -c "import setuptools, tokenize;file='/tmp/pip-req-build-n1eyvbz3/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-record-p0dv1roq/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-req-build-n1eyvbz3/

L2 regularization on the box convolution parameters

In the paper you mention that you use L2 regularization on the box convolution parameters to shrink the box dimensions towards zero.

Where exactly in the code is this regularization done? I can't find it. There is a weight-decay flag here, but it is not used in the optimizer. Also, if it were passed to the optimizer, the regularization would be applied to the whole network, whereas you specifically mention that you apply it to the box convolution parameters only.
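
One generic way to restrict weight decay to a subset of parameters is to hand the optimizer separate parameter groups, which `torch.optim` optimizers accept as a list of dicts with per-group options. The sketch below only builds such groups and is not taken from the repository: `split_weight_decay` is a hypothetical helper, the parameter names `x_min`, `x_max`, `y_min`, `y_max` are assumed from the arguments visible in the stack traces on this page, and the decay value is a placeholder. Whether decaying these parameters actually shrinks the boxes depends on how the layer parametrizes them.

```python
def split_weight_decay(named_params, box_param_names, box_weight_decay):
    """Build optimizer parameter groups with decay on box parameters only.

    named_params: iterable of (name, parameter) pairs, e.g. what
    model.named_parameters() yields in PyTorch.
    box_param_names: final name components that identify box parameters
    (assumed names, see the note above).
    """
    box_params, other_params = [], []
    for name, param in named_params:
        leaf = name.rsplit(".", 1)[-1]
        (box_params if leaf in box_param_names else other_params).append(param)
    return [
        {"params": box_params, "weight_decay": box_weight_decay},
        {"params": other_params, "weight_decay": 0.0},
    ]


# Dummy (name, parameter) pairs so the sketch runs without PyTorch.
named = [
    ("encoder.0.weight", "w0"),
    ("encoder.1.x_min", "b0"),
    ("encoder.1.y_max", "b1"),
    ("head.bias", "w1"),
]
groups = split_weight_decay(named, {"x_min", "x_max", "y_min", "y_max"}, 2e-4)
print(groups[0]["params"])  # ['b0', 'b1']
print(groups[1]["params"])  # ['w0', 'w1']
```

With a real model the result would be passed as the first argument of the optimizer, for example `optim.Adam(split_weight_decay(model.named_parameters(), names, decay), lr=1e-3)`.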

Error during forward pass

     44         input_integrated = cpp_cuda.integral_image(input)
     45         output = cpp_cuda.box_convolution_forward(
---> 46             input_integrated, x_min, x_max, y_min, y_max, normalize, exact)
     47 
     48         ctx.save_for_backward(

RuntimeError: cuda runtime error (9) : invalid configuration argument at src/box_convolution_cuda_forward.cu:249
