Comments (8)
Ah thank you @inoryy (hi again btw!), I'm glad that we have support for this. Perhaps it would be worth including the table from Sonnet showing how to drive variance scaling in common ways:
==============  ==============================================================
Name            Parameters
==============  ==============================================================
glorot_uniform  scale=1.0, mode=``fan_avg``, distribution=``uniform``
glorot_normal   scale=1.0, mode=``fan_avg``, distribution=``truncated_normal``
lecun_uniform   scale=1.0, mode=``fan_in``, distribution=``uniform``
lecun_normal    scale=1.0, mode=``fan_in``, distribution=``truncated_normal``
he_uniform      scale=2.0, mode=``fan_in``, distribution=``uniform``
he_normal       scale=2.0, mode=``fan_in``, distribution=``truncated_normal``
==============  ==============================================================
https://sonnet.readthedocs.io/en/latest/api.html#variancescaling
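For readers who want the table in code form, here is a minimal sketch that maps each common name to the listed parameters. The dictionary name and the `make_initializer` helper are hypothetical (they are not part of Haiku or Sonnet); the sketch assumes `hk.initializers.VarianceScaling` accepts `scale`, `mode` and `distribution` keyword arguments, as the snippet later in this thread suggests.

```python
# Preset names and parameters copied from the Sonnet table above.
VARIANCE_SCALING_PRESETS = {
    "glorot_uniform": dict(scale=1.0, mode="fan_avg", distribution="uniform"),
    "glorot_normal": dict(scale=1.0, mode="fan_avg", distribution="truncated_normal"),
    "lecun_uniform": dict(scale=1.0, mode="fan_in", distribution="uniform"),
    "lecun_normal": dict(scale=1.0, mode="fan_in", distribution="truncated_normal"),
    "he_uniform": dict(scale=2.0, mode="fan_in", distribution="uniform"),
    "he_normal": dict(scale=2.0, mode="fan_in", distribution="truncated_normal"),
}


def make_initializer(name):
    """Hypothetical helper: build a Haiku initializer from a preset name."""
    # Imported lazily so the preset table can be inspected without Haiku installed.
    import haiku as hk

    return hk.initializers.VarianceScaling(**VARIANCE_SCALING_PRESETS[name])
```

With Haiku installed, `make_initializer("glorot_uniform")` could then be passed as `w_init` to a layer such as `hk.Linear`.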
from dm-haiku.
Just a suggestion regarding the Haiku documentation: add a section under Initializers that lists the "common" names such as "he" and "glorot", so that new developers can find them under "VarianceScaling".
Hey, thanks for the issue. This is intentional, as we explicitly match Sonnet v2 initialization schemes.
How "limited" are we by Sonnet? Will you accept PRs that implement things that are not in Sonnet?
For example, other initialization schemes, activations, or layers.
We aim to make it easy to port code from Sonnet to Haiku, so for core modules that can also be found in Sonnet we should match their API and defaults (in the same way `jax.numpy` is aligned with `numpy`).
New features are welcome 😄. In general I think in Haiku itself we should aim to only include well known modules/networks and we should make it really easy for folks to build anything custom that they want using the components in Haiku (e.g. we should be open to exposing utilities from core if needed, but we should not aim to have everything in core).
Concretely I think it would be great if you added an He initializer to Haiku as one of the initializers we support, but that we should keep the default `Linear` initializer as it is today.
Note that a variety of initializers are implicitly supported through the generic `VarianceScaling` initializer. For instance, here is how to initialize a `Linear` layer with the He scheme:
hk.Linear(num_units, w_init=hk.initializers.VarianceScaling(scale=2.0))
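To see why `scale=2.0` corresponds to the He scheme, here is a rough sketch of the arithmetic behind variance scaling, assuming the initializer's defaults are `mode="fan_in"` and a normal-type distribution. The function name is hypothetical, and the sketch deliberately ignores the extra correction that real implementations apply when the normal distribution is truncated.

```python
import math


def variance_scaling_stats(scale, mode, distribution, fan_in, fan_out):
    """Hypothetical helper: return (target variance, stddev or uniform limit)."""
    # Pick the fan count that the variance is divided by.
    fans = {
        "fan_in": fan_in,
        "fan_out": fan_out,
        "fan_avg": (fan_in + fan_out) / 2.0,
    }
    variance = scale / max(1.0, fans[mode])
    if distribution == "uniform":
        # U(-limit, limit) has variance limit**2 / 3, so solve for limit.
        return variance, math.sqrt(3.0 * variance)
    # Normal-type distributions: the spread parameter is the standard deviation.
    return variance, math.sqrt(variance)


# He scheme for a layer with 8 inputs: variance 2 / fan_in.
he_variance, he_stddev = variance_scaling_stats(2.0, "fan_in", "truncated_normal", 8, 16)
```

Under these assumptions, the He scheme targets a weight variance of `2 / fan_in`, which is what `scale=2.0` with the fan-in mode produces.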
@tomhennigan (hello! :)) yep, that table seems like a great idea! Maybe even include it both in code and as a separate note in the docs?
Including it in the docs seems like a great idea!