davidmascharka / mynn

Pure Python/NumPy neural network library extending MyGrad

Home Page: https://pypi.org/project/mynn/

License: MIT License

mynn's Introduction

MyNN

A pure-Python neural network library based on the amazing MyGrad.

MyNN was created as an extension to MyGrad for rapid prototyping of neural networks. It aims to provide minimal dependencies, a clean code base with excellent documentation, and a useful learning tool.

Installation Instructions

If you already have MyGrad installed, clone MyNN, navigate to the resulting directory, and run

python setup.py develop

If you don't have MyGrad installed, then you can run

git clone https://github.com/rsokl/MyGrad.git
cd MyGrad
python setup.py develop

Then clone and install this repository.
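
For example, something like the following should work (the clone URL here is inferred from the repository's GitHub location shown above):

git clone https://github.com/davidmascharka/mynn.git
cd mynn
python setup.py develop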

Quickstart

Please see the example notebooks for a gentle introduction.

mynn's People

Contributors

davidmascharka, rsokl


mynn's Issues

Unicode variable names

In implementing some of the optimizers, I'm wondering what everyone's thoughts are on using non-ASCII Unicode variable names. For example, the Adadelta implementation currently sits as:

self.g[idx] = self.rho * self.g[idx] + (1 - self.rho) * grad**2
dx = -np.sqrt(self.dx[idx] + self.eps) / np.sqrt(self.g[idx] + self.eps) * grad
self.dx[idx] = self.rho * self.dx[idx] + (1 - self.rho) * dx**2

However, this could easily be written this way to better match the paper:

self.g[idx] = self.ρ * self.g[idx] + (1 - self.ρ) * grad**2
Δx = -np.sqrt(self.Δx[idx] + self.ɛ) / np.sqrt(self.g[idx] + self.ɛ) * grad
self.Δx[idx] = self.ρ * self.Δx[idx] + (1 - self.ρ) * Δx**2

My initial thought: using Unicode names can add some clarity. For non-user-facing code, this might be a good thing. We probably ought not to have Unicode names in function signatures, though, since some users may have difficulty typing Unicode characters.

Update examples

  • liveplot has been renamed to noggin; update references accordingly
  • everything should migrate when the mygrad merge happens
  • run black on the notebooks for consistent code styling

Network Layers

These are the computational layers used to create a model. They are essentially thin wrappers around MyGrad Tensors that hold any necessary parameters, and they may take a parameter initializer that creates those parameters according to the provided initialization scheme.

Each layer should have a parameters list and define a forward and backward pass. The parameters list can be passed to an Optimizer or accessed directly. Layers may also define saving and loading functionality; a rough sketch of this contract appears after the list below.

The Layers

  • ConvNd
  • BatchNormNd
  • Dense layer
  • Any activations that need a dedicated layer (adaptive layers like PReLU)
  • Recurrent layers (plain RNN, LSTM, GRU; possibly others)
  • Dropout

Note the lack of dropout, pooling, and reshaping layers, which can simply be performed using NumPy operations in a forward pass.
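
To make the contract above concrete, here is a minimal sketch of what a dense layer might look like; the class name, initializer signature, and attribute names are illustrative rather than the final MyNN API:

import numpy as np
import mygrad as mg

class Dense:
    def __init__(self, d_in, d_out, weight_initializer=None):
        # parameters are MyGrad Tensors so that gradients can flow into them
        init = weight_initializer or (lambda *shape: np.random.randn(*shape) * 0.01)
        self.weight = mg.Tensor(init(d_in, d_out))
        self.bias = mg.Tensor(np.zeros((d_out,)))

    @property
    def parameters(self):
        # exposed so an Optimizer can iterate over the layer's parameters
        return (self.weight, self.bias)

    def __call__(self, x):
        # forward pass; MyGrad records the computation graph, so the backward
        # pass comes from calling .backward() on the eventual loss
        return mg.matmul(x, self.weight) + self.bias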

Unit Tests

Monolithic issue for unit testing things. We'll construct a list here; comment to add things to it.

  • Mutation: ensure that operations do not modify their input
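
For instance, a mutation check could look something like this sketch, with mg.log standing in for whatever operation is under test:

import numpy as np
import mygrad as mg

def test_op_does_not_mutate_input():
    # `mg.log` is just a stand-in for the operation under test
    x = np.random.rand(3, 4) + 1.0
    x_copy = x.copy()
    mg.log(x)
    assert np.array_equal(x, x_copy), "the operation mutated its input"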

Add support for exporting models

The current solution is to iterate through the model parameters and save all the weights with np.save, then load them with np.load into a new model. We should support actually serializing a model.
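
As a sketch of that stopgap (assuming a model exposes a parameters list of MyGrad Tensors, whose underlying arrays live in .data):

import numpy as np

def save_model(model, filename):
    # store each parameter's underlying array, keyed by its position
    # (`filename` should end in .npz so save and load refer to the same file)
    arrays = {str(i): p.data for i, p in enumerate(model.parameters)}
    np.savez(filename, **arrays)

def load_model(model, filename):
    # copy the saved values back into a freshly constructed model of the same shape
    with np.load(filename) as archive:
        for i, p in enumerate(model.parameters):
            p.data[...] = archive[str(i)]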

Examples

There should be examples of using the library on real tasks.

  • Spiral dataset
  • MNIST
  • RCNN

Feel free to propose additional examples as well. These are all in-progress.

Initializers

An initializer should create a MyGrad Tensor according to some initialization scheme. Something like:

my_tensor = initializers.normal(10, 10) # 10x10 Tensor drawn from a normal distribution
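
A minimal sketch of such an initializer, assuming roughly this signature (not necessarily the final API):

import numpy as np
import mygrad as mg

def normal(*shape, mean=0.0, std=1.0):
    # draw the initial values with NumPy, then wrap them in a MyGrad Tensor
    # so that gradients can be backpropagated into the parameter
    return mg.Tensor(np.random.normal(mean, std, shape))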

The Initializers

GRU Layer

This is just a wrapper around the MyGrad GRU call, but it's a nice piece to have for doing all the bookkeeping of the variables.

Add CI

So we know things work. CI will be very helpful in merging with MyGrad and as we add tests.

Optimizers

An optimizer should take an iterable of parameters at creation and perform some optimization over those parameters based on a loss. It can also be used to null the gradient of each of those parameters. Something like the following:

# create a model in `model` and a loss function in `loss_func`
optim = optimizers.sgd(model.parameters, learning_rate=0.01, momentum=0.99)

for batch, targets in training_set:
    optim.null_gradients()
    outputs = model.forward(batch)
    loss = loss_func(outputs, targets)
    loss.backward()  # backprop into `model.parameters`
    optim.step()     # update each parameter using its gradient
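
For reference, a bare-bones SGD optimizer along these lines might look like the following sketch; momentum is omitted, and it assumes each parameter is a MyGrad Tensor exposing .data, .grad, and null_gradients():

class SGD:
    def __init__(self, parameters, learning_rate=0.01):
        self.parameters = list(parameters)
        self.learning_rate = learning_rate

    def null_gradients(self):
        # clear each parameter's gradient before the next forward/backward pass
        for p in self.parameters:
            p.null_gradients()

    def step(self):
        # apply a plain gradient-descent update to each parameter's underlying array
        for p in self.parameters:
            if p.grad is not None:
                p.data -= self.learning_rate * p.grad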

The Optimizers

Documentation webpage

One should exist. @rsokl I'm not sure how much interplay there will be between the MyGrad docs page and the MyNN docs page. Something we can discuss.

Activation Functions

This can certainly be open to debate, especially regarding which of these (if any) belong in MyGrad instead of here. Common activation functions should be easily accessible so that people don't need to write their own little wrappers. These may include:

  • Minimum (MyGrad)
  • Maximum (MyGrad)
  • ReLU (which is just maximum(x, 0) but a convenient wrapper; see the sketch after this list)
  • Hard tanh
  • tanh (it's in mygrad)
  • ReLU6 (not really common and I hate it, so I'm not implementing it. I just needed to share my feelings.)
  • ELU
  • SELU
  • Leaky ReLU
  • PReLU
  • GLU
  • Sigmoid
  • Soft sign
  • Softmax
  • Log softmax
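
As a concrete example of the kind of thin wrapper meant here, ReLU could be written directly in terms of MyGrad's maximum (a sketch; the final signature may differ):

import mygrad as mg

def relu(x):
    # clamp negative values to zero; MyGrad's `maximum` supplies the backward pass
    return mg.maximum(x, 0)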

Loss Functions

A loss function should take model outputs and target values and return the loss. Something like:

# assuming model outputs in `outputs` and targets in `targets`
loss_func = losses.L1()
loss = loss_func(outputs, targets)
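
For illustration, an L1 loss written purely in terms of MyGrad primitives might look like this sketch (the actual implementation may differ):

import mygrad as mg

class L1:
    def __call__(self, outputs, targets):
        # mean absolute error; `absolute` and `mean` are MyGrad ops,
        # so the backward pass is handled automatically
        return mg.mean(mg.absolute(outputs - targets))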

The Losses

  • L1
  • MSE
  • Negative log-likelihood
  • KL divergence
  • Cross-entropy
  • Balanced cross-entropy (weighting factor alpha on each class)
  • Focal loss [softmax and not]
  • Smooth L1 (Huber)

Discussion

This can host comments and discussion for now.

Some additional things that would be nice to have but need some thought:

  • Saving/Loading models
  • A Model class
  • A Trainer that handles data loading, training, etc.
  • Learning Rate schedulers (possibly subsumed under Trainer)

Performance vs Clarity

We should discuss the merits of a performance/clarity tradeoff in the library. We can take advantage of some of the math functions in MyGrad to write some neural network utilities incredibly clearly and concisely, at the cost of some performance.

For example, compare L1 before and L1 after. It's immediately obvious just from looking at the current L1 implementation what it's doing; all you need is to understand how the primitives backprop. If you know that (or trust that they do), then you don't need to see the details.

Another benefit of using the MyGrad primitives where applicable is that we stay in step if MyGrad ops get rewritten; we don't need to update two implementations whenever anything changes.

However, since you're relying on the MyGrad primitives to perform the backward pass on their own, you lose out on some of the performance benefits of writing these operations by hand. Some operations incur about a 2x slowdown versus writing the forward and backward passes explicitly, which is not ideal; for example, I believe log softmax was about twice as slow.

Currently, my strategy is to write every operation we can using MyGrad primitives. This helps improve development speed. My initial idea is that once we are more fully featured, we can analyze where the slowdowns are coming from and focus on optimizing those. There's little point in optimizing an op if it isn't used much or if its overhead relative to other ops is minuscule.

Thoughts on this tradeoff?
