kaixhin / nninit
Weight initialisation schemes for Torch7 neural network modules
License: MIT License
Is there a reason why nninit.orthogonal does not use calcFan and instead calculates the fanIn/fanOut without taking the underlying module into consideration? Thanks!
I am experiencing inconsistencies with the orthogonal initialization. In the example below, both modules have the same number of weights but the latter is significantly faster to initialize.
th> nn.SpatialConvolution(100, 100, 3, 3):init('weight', nninit.orthogonal)
nn.SpatialConvolution(100 -> 100, 3x3)
[7.6399s]
th> nn.SpatialConvolution(100, 100, 3, 3).weight:nElement()
90000
[0.0006s]
----
th> nn.SpatialConvolution(100 * 3 * 3, 100, 1, 1):init('weight', nninit.orthogonal)
nn.SpatialConvolution(900 -> 100, 1x1)
[0.0605s]
th> nn.SpatialConvolution(100 * 3 * 3, 100, 1, 1).weight:nElement()
90000
[0.0006s]
Is this desired behavior or a bug? The cause is that nninit.orthogonal uses fanIn and fanOut to determine the size of the matrix that ought to be orthogonalised, which does not seem to be the right way of doing it:
local fanIn = sizes[2]
local fanOut = sizes[1]
for d = 3, #sizes do
  fanIn = fanIn * sizes[d]
  fanOut = fanOut * sizes[d]
end
----
nn.SpatialConvolution(100, 100, 3, 3)
fanIn: 900
fanOut: 900
----
nn.SpatialConvolution(100 * 3 * 3, 100, 1, 1)
fanIn: 900
fanOut: 100
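For reference, the quoted fan computation can be reproduced outside Torch. A minimal Python sketch (the function name `fan` is hypothetical, purely for illustration) shows why the two layers above end up with differently shaped matrices to orthogonalise, even though they hold the same number of weights:

```python
def fan(sizes):
    # Reproduces the quoted nninit logic: both fanIn and fanOut are
    # multiplied by every trailing (kernel) dimension.
    fan_in, fan_out = sizes[1], sizes[0]
    for d in sizes[2:]:
        fan_in *= d
        fan_out *= d
    return fan_in, fan_out

# nn.SpatialConvolution(100, 100, 3, 3) has weight size 100x100x3x3:
assert fan((100, 100, 3, 3)) == (900, 900)  # orthogonalises a 900x900 matrix
# nn.SpatialConvolution(900, 100, 1, 1) has weight size 100x900x1x1:
assert fan((100, 900, 1, 1)) == (900, 100)  # orthogonalises a 900x100 matrix
```

Orthogonalising a 900x900 matrix is far more expensive than a 900x100 one, which matches the timings above.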
Thank you for this very handy library.
Plancherel's theorem implies that orthogonality in the spatial domain is equivalent to orthogonality in the frequency domain. From my understanding, CAI doesn't do anything special in the frequency domain aside from simply initializing the filters of each kernel such that they form an orthonormal set. If my understanding is correct, then vanilla orthogonal initialization should accomplish the same thing, making CAI redundant.
See this Gist for a simple demo illustrating my point.
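To back up the Plancherel argument, here is a small numpy sketch (an illustration, not part of nninit) checking that an orthonormal set of filters stays orthonormal after a unitary 2D DFT:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an orthonormal set of nine 3x3 filters from the rows of an
# orthogonal matrix obtained via QR decomposition.
q, _ = np.linalg.qr(rng.standard_normal((9, 9)))
filters = q.reshape(9, 3, 3)

# A unitary 2D DFT preserves inner products (Plancherel/Parseval), so
# the transformed filters form an orthonormal set in frequency space too.
spectra = np.fft.fft2(filters, norm="ortho").reshape(9, -1)
gram = spectra @ spectra.conj().T
assert np.allclose(gram, np.eye(9))
```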
Currently nninit is a bit clunky (in an effort to avoid side effects). I would like to modify nn.Module to have something like an init method, with an API along the lines of:
nn.Linear(4096, 1000):init('weight', 'xavier', 'normal'):init('weight', 'sparse', 0.3):init('bias', 'constant', 0)
I think returning the module, and therefore being able to chain calls, makes it a lot more elegant. The current way of passing parameters is also open for discussion. Any thoughts @soumith / @skaae?
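The chaining pattern proposed above can be sketched outside Torch. A minimal Python illustration (the `Module` class and scheme names here are hypothetical stand-ins, not nninit's API) shows how returning the module makes calls composable:

```python
import numpy as np

class Module:
    """Minimal stand-in for nn.Module, for illustration only."""

    def __init__(self, fan_in, fan_out):
        self.weight = np.empty((fan_out, fan_in))
        self.bias = np.empty(fan_out)

    def init(self, tensor_name, scheme, *args):
        # Dispatch to an initialiser, then return self so calls chain.
        getattr(self, "_" + scheme)(getattr(self, tensor_name), *args)
        return self

    def _constant(self, t, value):
        t.fill(value)

    def _xavier(self, t, dist="normal"):
        fan_out, fan_in = t.shape
        rng = np.random.default_rng(0)
        if dist == "normal":
            std = (2.0 / (fan_in + fan_out)) ** 0.5
            t[...] = rng.normal(0.0, std, t.shape)
        else:
            limit = (6.0 / (fan_in + fan_out)) ** 0.5
            t[...] = rng.uniform(-limit, limit, t.shape)

# Chained initialisation, mirroring the proposed API:
m = Module(4096, 1000).init("weight", "xavier", "normal").init("bias", "constant", 0)
assert (m.bias == 0).all()
```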
Hi, I am getting this error on macOS:
luarocks install nninit
Error: No results matching query were found.
The eye function is wrong for the convolutional layers. In the 2D case every filter can abide by the specification for torch.eye, and the same can be extended for 3D along the diagonal. In 1D perhaps the closest is a vector of 1s? This solution would be the most consistent with torch.eye, which is good.
Alternatively, considering that these are convolutions, the identity would be the delta function (i.e. a 1 as close to the middle of a tensor as possible). Asking @bshillingford to clarify what he thinks makes more sense.
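To illustrate the delta-function option, here is a small numpy sketch (the helper `conv2d_same` is hypothetical, written out for self-containedness) checking that a kernel with a single centred 1 acts as the identity under convolution:

```python
import numpy as np

def conv2d_same(x, k):
    # Naive 'same' cross-correlation with zero padding, for illustration.
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + kh, j:j + kw] * k).sum()
    return out

# A 3x3 delta kernel (a 1 in the middle) leaves the input unchanged,
# unlike a torch.eye-style diagonal filter.
delta = np.zeros((3, 3))
delta[1, 1] = 1.0
x = np.random.default_rng(0).standard_normal((5, 5))
assert np.allclose(conv2d_same(x, delta), x)
```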
How would you use it for nngraph layers?
Edit 2: It seems to be a problem with cudnn layers?
Edit: This seems like an awesome addition to Torch :)
Although this library works fine with nngraph, it would be good to also support rnn - specifically the LSTM module. Given the new API introduced with #2, how can the elements of the cell be initialised individually? Any feedback @nicholas-leonard?
A notable reason to support this would be to implement the large forget gate bias introduced in:
Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural computation, 12(10), 2451-2471.
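As a sketch of what that would look like (assuming a hypothetical LSTM bias layout of four equal gate chunks in input/forget/cell/output order; the rnn package's actual layout may differ):

```python
import numpy as np

# Hypothetical layout: the LSTM's concatenated bias holds the input,
# forget, cell and output gate biases in four equal chunks.
hidden_size = 4
bias = np.zeros(4 * hidden_size)

# Large forget-gate bias (Gers et al., 2000): initialise the forget
# chunk to a positive constant (here 1) so the cell remembers by default.
bias[hidden_size:2 * hidden_size] = 1.0

assert (bias[hidden_size:2 * hidden_size] == 1).all()
assert bias.sum() == hidden_size
```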
The idea of nninit is to allow experimentation with initialisations and to free maintainers from implementing "best practices" themselves.