Comments (7)
Actually, MatConvNet's convolution layer automatically switches to a fully-connected layer when the input size equals the kernel size.
You can do the same thing manually in torch.
Example (input size (NCHW) = 256x512x7x7, output (N x feature size) = 256x4096):
model:add(nn.View(7*7*512))
model:add(nn.Linear(7*7*512,4096))
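Editor's note: a minimal NumPy sketch (dimensions shrunk from the 256x512x7x7 example above; all names are mine) of why a convolution whose kernel covers the entire input is exactly a fully-connected layer over the flattened input:

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, H, W, F = 4, 3, 5, 5, 8          # small stand-ins for 256, 512, 7, 7, 4096

x = rng.standard_normal((N, C, H, W))  # input batch, NCHW
w = rng.standard_normal((F, C, H, W))  # filters with kernel size == input size
b = rng.standard_normal(F)

# "Convolution" (cross-correlation) with a full-size kernel: there is only
# one valid output position, so each filter yields a single dot product.
out_conv = np.einsum('nchw,fchw->nf', x, w) + b

# The equivalent nn.View + nn.Linear: flatten input and filters, then matmul.
out_linear = x.reshape(N, -1) @ w.reshape(F, -1).T + b

# The two agree up to floating-point rounding.
print(np.abs(out_conv - out_linear).max())
```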
from cudnn.torch.
This is actually surprising, because cudnn convolution has implicit and explicit gemm algorithms, as well as a bunch of others. Maybe their gemm is lagging behind the cublas gemm.
Would you know anything about this @ngimel ?
For backward, the selection of algorithms is smaller (in particular, there is no explicit gemm), and they are not particularly optimized for the case where input size = kernel size. cudnn does not have a runtime dependency on cublas, and includes only a limited subset of cublas gemm kernels, so even if explicit gemm algorithms were added to backward path, there conceivably could be many situations where cudnn would be slower than cublas. I think it is best (as suggested by @vadimkantorov and @Jerrynet) to convert SpatialConvolution to Linear when input size = kernel size.
thanks Natalia! It is often convenient to keep SpatialConvolution for 1x1 kernels; I think we should add nn.Linear.updateOutput(self, input)-like calls, with views around them, for this special case.
Sergey, please note that 1x1 SpatialConvolution in NCHW does not map directly onto Linear (it would for NHWC layout for images, and similarly for filters), and for Maxwell, cudnn performance for this case (NCHW) should be pretty similar to cublas anyway. I don't remember Kepler benchmarks off the top of my head. The original issue was about convolution where image H*W = kH*kW, where cudnn performance can be pretty bad. It generally does not do too well with odd (as in: not small, not square) filter sizes, especially on backward.
@ngimel afaik 1x1 SpatialConvolution in NCHW DOES map to Linear. We have used this trick many times. I think it is because gemm allows transpose as a mode.
Here's a simple test case:
require 'nn'
a = nn.Linear(128, 32)
b = nn.SpatialConvolution(128, 32, 1, 1)
b.weight:copy(a.weight);
b.bias:copy(a.bias);
input = torch.randn(16, 128, 1, 1)
outlinear = a:forward(input:view(16,128))
outconv = b:forward(input)
print((outlinear - outconv):abs():max())
And the output is 8.8817841970013e-16
ohh, I assume you are talking about larger inputs. Yes, indeed it does not map in general; it only maps correctly, as you said, when H*W = kH*kW. Sorry for the confusion.