
nninit's People

Contributors

anibali, kaixhin, sniklaus

nninit's Issues

calcFan in nninit.orthogonal

Is there a reason why nninit.orthogonal does not use calcFan and instead calculates the fanIn / fanOut without taking the underlying module into consideration? Thanks!

inconsistencies with nninit.orthogonal

I am experiencing inconsistencies with the orthogonal initialization. In the example below, both modules have the same number of weights but the latter is significantly faster to initialize.

th> nn.SpatialConvolution(100, 100, 3, 3):init('weight', nninit.orthogonal)
nn.SpatialConvolution(100 -> 100, 3x3)
                                                                                  [7.6399s]
th> nn.SpatialConvolution(100, 100, 3, 3).weight:nElement()
90000   
                                                                                  [0.0006s]
----

th> nn.SpatialConvolution(100 * 3 * 3, 100, 1, 1):init('weight', nninit.orthogonal)
nn.SpatialConvolution(900 -> 100, 1x1)
                                                                                  [0.0605s]
th> nn.SpatialConvolution(100 * 3 * 3, 100, 1, 1).weight:nElement()
90000   
                                                                                  [0.0006s]

Is this desired behavior or a bug? The cause is that nninit.orthogonal uses fanIn and fanOut to determine the size of the matrix that ought to be orthogonalized, which does not seem to be the right way of doing it.

-- sizes is the weight tensor's size, e.g. nOut x nIn x kH x kW for SpatialConvolution
local fanIn = sizes[2]
local fanOut = sizes[1]
for d = 3, #sizes do
    -- the kernel dimensions are folded into both fanIn and fanOut
    fanIn = fanIn * sizes[d]
    fanOut = fanOut * sizes[d]
end

----

nn.SpatialConvolution(100, 100, 3, 3)
fanIn: 900
fanOut: 900

----

nn.SpatialConvolution(100 * 3 * 3, 100, 1, 1)
fanIn: 900
fanOut: 100
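
One way the two layers above could be made to behave identically is to always orthogonalise the flattened nOut x (nIn * kH * kW) weight matrix. A hedged sketch of that idea (illustrative only, not the current nninit code):

-- Draw a random Gaussian matrix, orthonormalise it with a thin QR
-- decomposition, and copy the result into the weight. Assumes
-- nIn * kH * kW >= nOut so the rows can be made orthonormal.
local function orthogonalFlat(weight)
  local nOut = weight:size(1)
  local nIn = weight:nElement() / nOut  -- nIn * kH * kW for convolutions
  local q = torch.qr(torch.randn(nIn, nOut))  -- q is nIn x nOut with orthonormal columns
  weight:copy(q:t())  -- copy is element-wise, so no explicit reshape is needed
end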

Thank you for this very handy library.

Why isn't "convolution-aware initialization" redundant?

Plancherel's theorem implies that orthogonality in the spatial domain is equivalent to orthogonality in the frequency domain. From my understanding, CAI doesn't do anything special in the frequency domain aside from simply initializing the filters of each kernel such that they form an orthonormal set. If my understanding is correct, then vanilla orthogonal initialization should accomplish the same thing, making CAI redundant.
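
For reference, the identity being invoked is Plancherel's (Parseval's) theorem for the length-$N$ discrete Fourier transform $\mathcal{F}$, in the unnormalised convention:

$$\langle f, g \rangle = \frac{1}{N}\,\langle \mathcal{F}f, \mathcal{F}g \rangle,$$

so a set of filters that is orthonormal in the spatial domain is, up to the fixed factor $1/N$, also orthogonal in the frequency domain, and vice versa.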

See this Gist for a simple demo illustrating my point.

Better API

Currently nninit is a bit clunky (in an effort to avoid side-effects). I would like to modify nn.Module to have something like an init method, with an API along the lines of:

nn.Linear(4096, 1000):init('weight', 'xavier', 'normal'):init('weight', 'sparse', 0.3):init('bias', 'constant', 0)

I think returning the module and therefore being able to chain calls makes it a lot more elegant. The current way of passing parameters is also open for discussion. Any thoughts @soumith / @skaae?
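
A minimal sketch of what such a chainable method could look like, assuming each initialiser is exposed as a function that fills a parameter tensor in place (the dispatch-by-name and the initialiser signature are illustrative assumptions, not the final nninit API):

require 'nn'
local nninit = require 'nninit'

function nn.Module:init(paramName, initName, ...)
  -- Look up an initialiser by name and apply it to the named parameter
  -- tensor; returning self allows calls to be chained.
  local initialiser = nninit[initName]
  assert(initialiser, 'unknown initialiser: ' .. tostring(initName))
  initialiser(self[paramName], ...)  -- assumed signature: initialiser(tensor, ...)
  return self
end

-- Usage, mirroring the proposal above:
-- nn.Linear(4096, 1000):init('weight', 'xavier', 'normal'):init('bias', 'constant', 0)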

Specification for `eye`

The eye function is wrong for convolutional layers. In the 2D case every filter can abide by the specification for torch.eye, and the same can be extended to 3D along the diagonal. In 1D perhaps the closest is a vector of 1s? This solution would be the most consistent with torch.eye, which is good.

Alternatively, considering that these are convolutions, the identity would be the delta function (i.e. a 1 as close to the middle of a tensor as possible). Asking @bshillingford to clarify what he thinks makes more sense.
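
A rough sketch of the delta-function interpretation, assuming a 4D nOut x nIn x kH x kW convolution weight (illustrative only, not the current nninit.eye behaviour):

-- Put a single 1 as close to the spatial centre of each filter as possible,
-- pairing output plane i with input plane i, so a layer with matching
-- input/output planes approximates the identity under convolution.
local function deltaEye(weight)
  weight:zero()
  local nOut, nIn = weight:size(1), weight:size(2)
  local cH, cW = math.ceil(weight:size(3) / 2), math.ceil(weight:size(4) / 2)
  for i = 1, math.min(nOut, nIn) do
    weight[i][i][cH][cW] = 1
  end
  return weight
end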

nngraph

How would you use it for nngraph layers?
Edit 2: It seems to be a problem with cudnn layers?

Edit: This seems like an awesome addition to Torch :)

LSTM Support

Although this library works fine with nngraph, it would be good to also support rnn - specifically the LSTM module. Given the new API introduced with #2, how can the elements of the cell be initialised individually? Any feedback @nicholas-leonard?

A notable reason to support this would be to implement the large forget gate bias introduced in:

Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural computation, 12(10), 2451-2471.
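
As a hedged sketch of that trick, assuming the four gates are produced by a single nn.Linear whose output is the concatenation [input gate, forget gate, cell candidate, output gate] (the gate ordering and module layout are assumptions for illustration, not the rnn API):

-- Fill only the forget-gate slice of the gate layer's bias with a large
-- positive value so the cell initially remembers by default.
local function setForgetGateBias(gateLinear, hiddenSize, value)
  value = value or 1
  -- bias has length 4 * hiddenSize; the forget gate is assumed to be the second block
  gateLinear.bias:narrow(1, hiddenSize + 1, hiddenSize):fill(value)
  return gateLinear
end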

The idea of nninit is to allow experimentation with initialisations and to free maintainers from having to implement "best practices" themselves.
