neuralegion / shainet

SHAInet - a pure Crystal machine learning library

License: MIT License

Crystal 100.00%
neural-network machine-learning deep-learning deep-neural-networks crystal convolutional-neural-networks

shainet's Introduction

shainet


SHAInet stands for Super Human Artificial Intelligence network: a neural network written in pure Crystal.

This is a free-time project, happily hosted by NeuraLegion, that was created as part of some internal research. We started it with research rather than production in mind, and just kept going, thanks in part to members of the community.

We wanted to bring some inspiration from the biological world into this project. In addition, we wanted to try an approach to NNs using object-oriented modeling instead of matrices. The main reason was to experiment with new types of neurons, aiming for more robust learning (if possible), or at least more fine-tuned control over the manipulation of each neuron (which is difficult using a matrix-driven approach).

The Roadmap shows what we plan to add to the network as the project progresses.

Installation

Add this to your application's shard.yml:

dependencies:
  shainet:
    github: NeuraLegion/shainet
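
Then run shards install to fetch the dependency.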

Usage

More usage examples can be found in the specs

Standard training on XOR example

require "shainet"

training_data = [
  [[0, 0], [0]],
  [[1, 0], [1]],
  [[0, 1], [1]],
  [[1, 1], [0]],
]
# Initialize a new network
xor = SHAInet::Network.new
# Add a new layer of the input type with 2 neurons and classic neuron type (memory)
xor.add_layer(:input, 2, :memory, SHAInet.sigmoid)
# Add a new layer of the hidden type with 2 neurons and classic neuron type (memory)
xor.add_layer(:hidden, 2, :memory, SHAInet.sigmoid)
# Add a new layer of the output type with 1 neuron and classic neuron type (memory)
xor.add_layer(:output, 1, :memory, SHAInet.sigmoid)
# Fully connect the network layers
xor.fully_connect

# Adjust network parameters
xor.learning_rate = 0.7
xor.momentum = 0.3

# Train the network (data, training_type, cost_function, epochs, error_threshold (sum of errors), log_each)
xor.train(
      data: training_data,
      training_type: :sgdm,
      cost_function: :mse,
      epochs: 5000,
      error_threshold: 0.000001,
      log_each: 1000)

# Run the trained network
xor.run([0, 0])
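
After training, the outputs should approach the XOR targets; for example (exact values vary from run to run):

puts xor.run([0, 0]) # => values close to 0, e.g. [0.03]
puts xor.run([0, 1]) # => values close to 1, e.g. [0.97]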

Batch training on the iris dataset using adam

# Create a new Data object based on a CSV
data = SHAInet::Data.new_with_csv_input_target("iris.csv", 0..3, 4)

# Split the data in a training set and a test set
training_set, test_set = data.split(0.67)

# Initialize a new network
iris = SHAInet::Network.new

# Add layers
iris.add_layer(:input, 4, :memory, SHAInet.sigmoid)
iris.add_layer(:hidden, 5, :memory, SHAInet.sigmoid)
iris.add_layer(:output, 3, :memory, SHAInet.sigmoid)
iris.fully_connect

# Adjust network parameters
iris.learning_rate = 0.7
iris.momentum = 0.3

# Train the network
iris.train_batch(
      data: training_set.data.shuffle,
      training_type: :adam,
      cost_function: :mse,
      epochs: 20000,
      error_threshold: 0.000001,
      log_each: 1000)

# Test the network's performance
iris.test(test_set)
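
Trained networks can also be saved to disk and loaded back (see Save/load under Basic Features); a short sketch, assuming the save_to_file / load_from_file method names:

# Persist the trained weights, then restore them into a fresh network
iris.save_to_file("iris_net.nn")

loaded = SHAInet::Network.new
loaded.load_from_file("iris_net.nn")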

Using a convolutional network

require "shainet"
require "csv"

# Load training data (partial dataset)
raw_data = Array(Array(Float64)).new
csv = CSV.new(File.read(__DIR__ + "/test_data/mnist_train.csv"))
10000.times do
  # CSV.each_row(File.read(__DIR__ + "/test_data/mnist_train.csv")) do |row|
  csv.next
  new_row = Array(Float64).new
  csv.row.to_a.each { |value| new_row << value.to_f64 }
  raw_data << new_row
end
raw_input_data = Array(Array(Float64)).new
raw_output_data = Array(Array(Float64)).new

raw_data.each do |row|
  raw_input_data << row[1..-1]
  raw_output_data << [row[0]]
end

training_data = SHAInet::CNNData.new(raw_input_data, raw_output_data)
training_data.for_mnist_conv
training_data.data_pairs.shuffle!

# Load test data (partial dataset)
raw_data = Array(Array(Float64)).new
csv = CSV.new(File.read(__DIR__ + "/test_data/mnist_test.csv"))
1000.times do
  csv.next
  new_row = Array(Float64).new
  csv.row.to_a.each { |value| new_row << value.to_f64 }
  raw_data << new_row
end

raw_input_data = Array(Array(Float64)).new
raw_output_data = Array(Array(Float64)).new

raw_data.each do |row|
  raw_input_data << row[1..-1]
  raw_output_data << [row[0]]
end

# Load data to a CNNData helper class
test_data = SHAInet::CNNData.new(raw_input_data, raw_output_data)
test_data.for_mnist_conv # Normalize and make labels into 'one-hot' vectors

# Initialize a convolutional network
cnn = SHAInet::CNN.new

# Add layers to the model
cnn.add_input([height = 28, width = 28, channels = 1]) # Output shape = 28x28x1
cnn.add_conv(
  filters_num: 20,
  window_size: 5,
  stride: 1,
  padding: 2,
  activation_function: SHAInet.none)  # Output shape = 28x28x20
cnn.add_relu(0.01)                    # Output shape = 28x28x20
cnn.add_maxpool(pool: 2, stride: 2)   # Output shape = 14x14x20
cnn.add_conv(
  filters_num: 20,
  window_size: 5,
  stride: 1,
  padding: 2,
  activation_function: SHAInet.none)  # Output shape = 14x14x20
cnn.add_maxpool(pool: 2, stride: 2)   # Output shape = 7x7x20
cnn.add_fconnect(l_size: 10, activation_function: SHAInet.sigmoid)
cnn.add_fconnect(l_size: 10, activation_function: SHAInet.sigmoid)
cnn.add_softmax
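
# Note on the "Output shape" comments above (illustrative arithmetic, not SHAInet output):
#   conv output per spatial dimension = (input + 2*padding - window_size) / stride + 1
#   first conv:  (28 + 2*2 - 5) / 1 + 1 = 28, so the shape stays 28x28 (one map per filter)
#   each max-pool (pool: 2, stride: 2) halves the spatial size: 28 -> 14, then 14 -> 7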

cnn.learning_rate = 0.005
cnn.momentum = 0.02

# Train the model on the training-set
cnn.train_batch(
  data: training_data.data_pairs,
  training_type: :sgdm,
  cost_function: :mse,
  epochs: 3,
  error_threshold: 0.0001,
  log_each: 1,
  mini_batch_size: 50)

# Evaluate accuracy on the test-set
correct_answers = 0
test_data.data_pairs.each do |data_point|
  result = cnn.run(data_point[:input], stealth: true)
  if (result.index(result.max) == data_point[:output].index(data_point[:output].max))
    correct_answers += 1
  end
end

# Print the layer activations
cnn.inspect("activations")
puts "We managed #{correct_answers} out of #{test_data.data_pairs.size} total"
puts "Cnn output: #{cnn.output}"

Evolutionary optimizer example:

require "shainet"
require "csv"

label = {
  "setosa"     => [0.to_f64, 0.to_f64, 1.to_f64],
  "versicolor" => [0.to_f64, 1.to_f64, 0.to_f64],
  "virginica"  => [1.to_f64, 0.to_f64, 0.to_f64],
}

iris = SHAInet::Network.new
iris.add_layer(:input, 4, :memory, SHAInet.sigmoid)
iris.add_layer(:hidden, 4, :memory, SHAInet.sigmoid)
iris.add_layer(:output, 3, :memory, SHAInet.sigmoid)
iris.fully_connect

# Get data from a local file
outputs = Array(Array(Float64)).new
inputs = Array(Array(Float64)).new
CSV.each_row(File.read(__DIR__ + "/test_data/iris.csv")) do |row|
  row_arr = Array(Float64).new
  row[0..-2].each do |num|
    row_arr << num.to_f64
  end
  inputs << row_arr
  outputs << label[row[-1]]
end
data = SHAInet::TrainingData.new(inputs, outputs)
data.normalize_min_max

training_data, test_data = data.split(0.9)

iris.train_es(
  data: training_data,
  pool_size: 50,
  learning_rate: 0.5,
  sigma: 0.1,
  cost_function: :c_ent,
  epochs: 500,
  mini_batch_size: 15,
  error_threshold: 0.00000001,
  log_each: 100,
  show_slice: true)

# Test the trained model
correct = 0
test_data.data.each do |data_point|
  result = iris.run(data_point[0], stealth: true)
  expected = data_point[1]
  # puts "result: \t#{result.map { |x| x.round(5) }}"
  # puts "expected: \t#{expected}"
  error_sum = 0.0
  result.size.times do |i|
    error_sum += (result[i] - expected[i]).abs
  end
  correct += 1 if error_sum < 0.3
end
puts "Correct answers: (#{correct} / #{test_data.size})"
(correct > 10).should eq(true)

Development

Basic Features

  • Train network
  • Save/load
  • Activation functions:
    • Sigmoid
    • Bipolar sigmoid
    • log-sigmoid
    • Tanh
    • ReLU
    • Leaky ReLU
    • Softmax
  • Cost functions:
    • Quadratic
    • Cross-entropy
  • Gradient optimizers
    • SGD + momentum
    • iRprop+
    • ADAM
    • ES (evolutionary strategy, non-backprop)
  • Autosave during training

Advanced Features

  • Support activation functions as Proc
  • Support cost functions as Proc
  • Convolutional Neural Net.
  • Add support for multiple neuron types.
  • Bind and use CUDA (GPU acceleration)
  • Graphic printout of network architecture.

Possible Future Features

  • RNN (recurrent neural network)
  • LSTM (long short-term memory)
  • GNG (growing neural gas)
  • SOM (self-organizing maps)
  • DBN (deep belief network)

Contributing

  1. Fork it ( https://github.com/NeuraLegion/shainet/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

shainet's People

Contributors: artlinkov, bararchy, drujensen, hugoabonizio, psikoz, rmarronnier

shainet's Issues

save then load gives less accurate results

I have created a model for the Kaggle Titanic competition and ran into an issue where saving the model to a file and loading it back generates different results.

The first thing I wanted to try was to create an NN that predicts the ages of the passengers whose age is unknown. Before I saved the model, I was getting decent results and the ages were fairly close. The accuracy figure below counts only exact age matches, but I also manually verified that the predicted ages were close.

Training size: 714
----------------------
T: 55 | F: 659
----------------------
Accuracy: 0.07703081232492998

After saving and loading the model and then running on the same test data, I am getting less accurate results and the ages are not close. They all come back as just a subset of possible ages.

Training size: 714
----------------------
T: 21 | F: 693
----------------------
Accuracy: 0.029411764705882353

Feedback on design

Hello team,

Like you, I'm writing my own neural network library for a super-niche language, Nim. I'm always interested in what devs of other niche languages are doing in the NN domain (D, Elm, Elixir, Rust, OCaml, Clojure ...), and I see that you are taking a completely different approach from most of them and from the state of the art (TensorFlow, PyTorch, Caffe, MXNet), which I find interesting but also questionable.

Let's start with the interesting part.

I find your neurons/synapses approach interesting; is there any research or documentation that highlights the benefits of this modeling? I know there was research on modeling AI like a brain (visual cortex, sound, memory and a thinking part kept separate ...) that was largely put aside with the advance of gradient-descent techniques, but I have a hard time getting my hands on it.

In particular, I'm looking at NEURON_TYPES = ["memory", "eraser", "amplifier", "fader", "sensor"] and the Synapse type: what future applications would be eased by that approach?

Questionable design

While I find the neurons/synapses approach interesting, I have several reservations about your current implementation; some are fixable, but others might require a complete rewrite:

  1. No matrix, ndarray, or tensor type. You probably want to define custom matrix types with common functions instead of having each layer define its own loops.

  2. Storing one neuron at a time is inefficient. I don't think the current architecture can scale to networks with millions of parameters. I don't know how Crystal classes work, but if they allocate on the heap, each access to a neuron will require a pointer dereference. The main bottleneck in NNs is memory access, and this makes it much worse.
    Furthermore, it cannot be mapped to BLAS, CUDA or OpenCL for efficient computation.

  3. Synapse will be a performance bottleneck; the connections between two fully connected/dense/linear layers can simply be represented as a matrix multiplication, both for the forward pass and for the gradient (see the sketch below).
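
A minimal sketch (plain Crystal, not SHAInet code) of how a dense layer's forward pass collapses into a matrix-vector product y = W * x + b; the backward pass reuses the same weights transposed.

def dense_forward(w : Array(Array(Float64)), b : Array(Float64), x : Array(Float64)) : Array(Float64)
  # y[i] = b[i] + sum over j of w[i][j] * x[j]
  w.map_with_index do |row, i|
    sum = b[i]
    row.each_with_index { |w_ij, j| sum += w_ij * x[j] }
    sum
  end
end

# Gradients, for reference:
#   grad_w[i][j] = grad_y[i] * x[j]
#   grad_x[j]    = sum over i of w[i][j] * grad_y[i]   (i.e. W^T * grad_y)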

On the roadmap?

Depending on your use case (research vs production), you might want to add a way to slice the data.

Using a Crystal workbook fails to create a SHAInet model

I was trying to play with SHAInet using crystal play, creating a workbook similar to Jupyter:

require "shainet"

diabetes = SHAInet::Network.new

I'm getting a JSON parsing error and wanted to know if this is something you have encountered before or if I'm doing something wrong.

Exception

Error in line 2: instantiating 'Crystal::Playground::Agent#i(Int32)'

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/compiler/crystal/tools/playground/agent.cr:25: instantiating 'send(String)'

    send "value" do |json|
    ^~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/compiler/crystal/tools/playground/agent.cr:53: instantiating 'JSON:Module#build()'

    message = JSON.build do |json|
                   ^~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/json/builder.cr:356: instantiating 'String:Class#build()'

    String.build do |str|
           ^~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/string.cr:269: instantiating 'String::Builder:Class#build(Int32)'

    String::Builder.build(capacity) do |builder|
                    ^~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/string.cr:269: instantiating 'String::Builder:Class#build(Int32)'

    String::Builder.build(capacity) do |builder|
                    ^~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/json/builder.cr:356: instantiating 'String:Class#build()'

    String.build do |str|
           ^~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/json/builder.cr:357: instantiating 'build(String::Builder, Nil)'

      build(str, indent) do |json|
      ^~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/json/builder.cr:367: instantiating 'JSON::Builder#document()'

    builder.document do
            ^~~~~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/json/builder.cr:367: instantiating 'JSON::Builder#document()'

    builder.document do
            ^~~~~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/json/builder.cr:357: instantiating 'build(String::Builder, Nil)'

      build(str, indent) do |json|
      ^~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/compiler/crystal/tools/playground/agent.cr:53: instantiating 'JSON:Module#build()'

    message = JSON.build do |json|
                   ^~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/compiler/crystal/tools/playground/agent.cr:54: instantiating 'JSON::Builder#object()'

      json.object do
           ^~~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/compiler/crystal/tools/playground/agent.cr:54: instantiating 'JSON::Builder#object()'

      json.object do
           ^~~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/compiler/crystal/tools/playground/agent.cr:25: instantiating 'send(String)'

    send "value" do |json|
    ^~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/compiler/crystal/tools/playground/agent.cr:27: instantiating 'JSON::Builder#field(String, (File::PReader | HTTP::Server::Response::Output | IO::FileDescriptor | Int32 | OpenSSL::SSL::Socket | String | Nil))'

      json.field "value", safe_to_value(value)
           ^~~~~

in /usr/local/Cellar/crystal-lang/0.23.1_3/src/json/builder.cr:226: no overload matches 'JSON::Builder#scalar' with type (File::PReader | HTTP::Server::Response::Output | IO::FileDescriptor | Int32 | OpenSSL::SSL::Socket | String | Nil)
Overloads are:
 - JSON::Builder#scalar(value : Nil)
 - JSON::Builder#scalar(value : Bool)
 - JSON::Builder#scalar(value : Int | Float)
 - JSON::Builder#scalar(value : String)
 - JSON::Builder#scalar(string = false, &block)
Couldn't find overloads for these types:
 - JSON::Builder#scalar(File::PReader)
 - JSON::Builder#scalar(HTTP::Server::Response::Output)
 - JSON::Builder#scalar(IO::FileDescriptor)
 - JSON::Builder#scalar(OpenSSL::SSL::Socket::Client)
 - JSON::Builder#scalar(OpenSSL::SSL::Socket::Server)

    scalar(value)
    ^~~~~~

If I run the same code using crystal run, I do not encounter the error.

Add a way to convert the neuron representation to a matrix

So @ArtLinkov had a great idea for a way to combine the uniqueness of each neuron while still being able to represent them as a matrix.

The idea is to create a matrix of pointers, where each pointer references the bias or one of the weights of a neuron.

This will give us the ability to add GPU and multi-threading support while also keeping what we love about SHAInet.
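
A minimal sketch of the idea in plain Crystal (illustrative only, not existing SHAInet code): keep the neuron objects, but collect raw pointers to their parameters so they can be updated in bulk.

class ToyNeuron
  property bias : Float64 = 0.0

  # Expose a raw pointer to the bias so external code can update it in place
  def bias_ptr : Pointer(Float64)
    pointerof(@bias)
  end
end

neurons = Array(ToyNeuron).new(4) { ToyNeuron.new }

# A "matrix" (here just a flat array) of pointers into the neurons' own storage
bias_ptrs = neurons.map(&.bias_ptr)

# Writing through the pointers updates the neurons themselves
bias_ptrs.each { |ptr| ptr.value += 0.1 }
puts neurons.first.bias # => 0.1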

Are there any benchmarks?

Hi all,

I've looked around a bit and can't find any benchmarks comparing shainet with deep learning frameworks such as Torch, TensorFlow, etc.

I know benchmarks are generally not that important, but out of curiosity, are any publicly available?

Add save/load for the CNN

Admittedly, it took a while until we got some time on our hands to work on this... :)
We should have it done by the end of the week. For now, save/load to JSON will be supported.

Autosave option while training

It would be good to have an autosave option while training, for example every N epochs.
This could prove very useful in case something crashes, or if at some point a nasty NaN finds its way into the model (it may also be good to perform a NaN check before saving the model). A rough sketch of the idea is below.
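
A rough illustration (hypothetical helper, not a SHAInet API; save_to_file is assumed to exist, per the Save/load feature):

# Save a snapshot every `every` epochs, skipping snapshots whose error is NaN
def autosave(net, epoch : Int32, every : Int32, error : Float64, path : String)
  return unless (epoch % every).zero?
  if error.nan?
    puts "Skipping autosave at epoch #{epoch}: error is NaN"
  else
    net.save_to_file("#{path}.epoch#{epoch}.nn")
  end
end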

Replace ES example?

Reading through the ES strategy here

Quote: Note on supervised learning. It is also important to note that supervised learning problems (e.g. image classification, speech recognition, or most other tasks in the industry), where one can compute the exact gradient of the loss function with backpropagation, are not directly impacted by these findings. For example, in our preliminary experiments we found that using ES to estimate the gradient on the MNIST digit recognition task can be as much as 1,000 times slower than using backpropagation. It is only in RL settings, where one has to estimate the gradient of the expected reward by sampling, where ES becomes competitive.

It seems that ES can be 1,000 times slower for supervised learning. The example in the readme does exactly this, and I'm wondering if we should provide a better example, or at least document that this is not a recommended use of the strategy.

Logo design for shainet

Hello, I want to make a logo contribution for you. I designed a logo. Please tell me what you think.

  • brain: learn
  • crystal

shainet

Help building Cats vs Dogs CNN network

I have been playing with the CNN network, trying to get any results, but I keep hitting roadblocks and I'm hoping to get some help.

I first converted the images to 48x48x1 greyscale to try and keep things as simple as possible.

I built a network as follows:

Dimensions
layers x width x height x channels
==================================
SHAInet::InputLayer
1 x 48 x 48 x 1
----------------------------------
SHAInet::ConvLayer
20 x 48 x 48 x 1
----------------------------------
SHAInet::MaxPoolLayer
20 x 24 x 24 x 1
----------------------------------
SHAInet::ConvLayer
20 x 24 x 24 x 20
----------------------------------
SHAInet::MaxPoolLayer
20 x 12 x 12 x 1
----------------------------------
SHAInet::FullyConnectedLayer
1 x 12 x 1 x 1
----------------------------------

I have tried many different configurations for training the model but all seem to error out:

model.train_batch(
  data: training.data_pairs,
  training_type: :sgdm,
  cost_function: :mse,
  epochs: 25,
  error_threshold: 0.0001,
  log_each: 100,
  mini_batch_size: 32)

No matter what I try, I get:

I, [2019-11-02 09:35:25 -07:00 #56315]  INFO -- : Epoch: 0, Total error: 1.0, MSE: 1.0

Here is the project: https://github.com/drujensen/cats_dogs

Any ideas?

Object Oriented - move to leverage inheritance

Looking through the code, I have a couple suggested changes to make it more OO:

NEURON_TYPES should use inheritance: the base class would be Neuron, and MemoryNeuron would inherit from it.

The learn/training functions (sgd, rprop, adam) should be extracted out of the Network class. Each should become its own class (Adam, SGD, ...) with a base class Learn or something similar.

Cost functions should also be extracted out of the Network class. The base class would be Cost, with each subclass implementing evaluate(input, expected).

Activation functions should be their own classes. ...

WDYT?
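
For illustration, a minimal sketch of what that hierarchy could look like (hypothetical names, not existing SHAInet classes):

abstract class Cost
  abstract def evaluate(input : Array(Float64), expected : Array(Float64)) : Float64
end

class QuadraticCost < Cost
  def evaluate(input : Array(Float64), expected : Array(Float64)) : Float64
    # 0.5 * sum of squared differences between actual and expected outputs
    input.zip(expected).sum { |a, e| 0.5 * (a - e)**2 }
  end
end

abstract class Neuron
end

class MemoryNeuron < Neuron
end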

Create layer class

Add a class for layer creation to allow maximum control of topology and various types of neurons in each layer

overfitting - BatchNormalization and Dropout

I have been playing around with the Titanic data and even submitted to Kaggle, but my results are not so good; I'm ranked quite low (9511).

I believe my models are overfitting. I'm using sgdm and I have lowered the learning_rate and momentum to try to avoid this, but the error and MSE start to rise and don't come back down.

Is the eraser layer similar to a Dropout layer? If so, how do I use it?

Is there a way to create a BatchNormalization layer?
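
For reference, dropout simply zeroes each activation with some probability during training and rescales the survivors; a plain-Crystal sketch of the idea (not an existing SHAInet layer):

# Inverted dropout: drop each activation with probability p, scale the rest by 1/(1 - p)
def dropout(activations : Array(Float64), p : Float64) : Array(Float64)
  keep = 1.0 - p
  activations.map { |a| rand < keep ? a / keep : 0.0 }
end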

OCR: recognising an image within an image

How do I create a captcha breaker?
For example:
aaaaa

  1. How do I create more than one output? I need 8 outputs.
  2. Maybe by finding data within data, similar to YOLO?

NaN in Total error and MSE

While using cross-entropy in batch training, the errors sometimes become NaN.
This doesn't stop the network from training, but it does prevent knowing how well the training is going.
A possible cause is a 0/0 division in the cross_entropy_cost_derivative function.
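
A common fix for that 0/0 (a sketch of the general technique, not necessarily how SHAInet implements it) is to clamp the prediction away from 0 and 1 before dividing:

EPS = 1e-12

# Binary cross-entropy derivative dC/da = (a - y) / (a * (1 - a)), with a clamped so
# that neither factor of the denominator can be zero.
def cross_entropy_derivative(expected : Float64, actual : Float64) : Float64
  a = actual.clamp(EPS, 1.0 - EPS)
  (a - expected) / (a * (1.0 - a))
end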
