
Yeezy Taught Me Text Generation



Web application for training a next-character prediction model using a Long Short-Term Memory (LSTM) network and time-series prediction. Train the model to generate random text based on patterns in a given text corpus. As Kanye West said:

Lack of visual empathy, equates the meaning of L-O-V-E.


Table of Contents

  • Motivation
  • Yeezy Taught Me Application
  • Theory: Artificial Neural Network
  • Theory: LSTM Model
  • Theory: Text Generation Model
  • Technical: Input Data for Text Generation
  • Technical: Text Parameters
  • Usage
  • Conclusion


Motivation

  • LSTM is commonly used in industry by companies like Google, Apple, Microsoft, and Amazon for:

    • Time series prediction
    • Speech recognition
    • Music/rhythm learning
    • Handwriting recognition
    • Sign language translation
  • Bloomberg Businessweek: "LSTM is arguably the most commercial AI achievement, used for everything from predicting diseases to composing music."


Yeezy Taught Me Application

Web application for artificial-intelligence model training and text generation:

Image. Screenshot of the web demo at https://lucylow.github.io/Yeezy-Taught-Me/


Theory: Artificial Neural Network

RNNs, LSTMs, and their derivatives mainly use sequential processing over time:

  • Recurrent Neural Network [RNN]:

    • Used for classifying, processing, and making predictions on time series, sequences, or anything with a temporal dimension.
    • The decision a recurrent net reaches at time step t - 1 affects the decision it reaches one moment later at time step t.
    • RNNs are computationally intensive; training on a GPU is recommended.
    • Unlike vanilla neural networks, which accept a fixed-sized vector as input (e.g. an image) and produce a fixed-sized vector as output (e.g. probabilities of different classes), RNNs allow us to operate over sequences of vectors: sequences in the input, the output, or in the most general case both.
  • Long Short-Term Memory [LSTM]:

    • A special kind of RNN, capable of learning long-term dependencies, that works somewhat better in practice than a vanilla RNN due to its more powerful update equation and backpropagation dynamics.
    • Unlike vanilla RNNs, LSTMs mitigate the vanishing gradient problem, since their units allow gradients to flow through unchanged.
      • Vanishing Gradient Problem: long-term information has to travel sequentially through all cells before reaching the present processing cell, so it can easily be corrupted by being multiplied many times by small numbers < 1.
    • The network operates on different scales of time at once, and information can be stored in, written to, or read from a cell.
    • Gates are analog, implemented with element-wise multiplication by sigmoids, which are all in the range 0-1. Refer to the diagram under "LSTM Model".

Image. Explanations of how the RNN and LSTM models work.


Theory: LSTM Model

Written down as a set of equations, LSTMs look pretty intimidating, but each equation simply describes one gate of the unit.
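For reference, here is the standard textbook formulation of the LSTM gate equations (a general sketch; the exact variant implemented by a given library may differ in minor details). Here x_t is the input at time t and h_{t-1} is the previous hidden state:

```math
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate cell value} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state / output}
\end{aligned}
```

σ is the sigmoid (the 0-1 "analog gates" described above) and ⊙ is element-wise multiplication.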

LSTM Unit Map:

  • Cell (value over time interval)
  • Input gate
  • Output gate
  • Forget gate

Image. LSTM cells where information can be stored in, written to, or read from (diagram from https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/recurrent_neural_networks.html).


Theory: Text Generation Model

The LSTM model operates at the character level. It takes a tensor of shape [numExamples, sampleLen, charSetSize] as input. The input text data is read from the "./data" file.

The input is a one-hot encoding of sequences of sampleLen characters, where the characters belong to a set of charSetSize unique characters. Given this input, the model outputs a tensor of shape [numExamples, charSetSize], which represents the model's predicted probabilities for the character that follows each input sequence.
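A minimal TensorFlow.js sketch of a model with these input/output shapes (the layer size, optimizer, and constants are illustrative assumptions, not the repo's exact configuration):

```javascript
import * as tf from '@tensorflow/tfjs';

const sampleLen = 40;    // characters per input example (assumed value)
const charSetSize = 71;  // unique characters in the corpus (assumed value)

const model = tf.sequential();
// Consumes one-hot sequences of shape [sampleLen, charSetSize].
model.add(tf.layers.lstm({units: 128, inputShape: [sampleLen, charSetSize]}));
// Emits a probability for each possible next character: shape [charSetSize].
model.add(tf.layers.dense({units: charSetSize, activation: 'softmax'}));
model.compile({
  optimizer: tf.train.rmsprop(0.01),
  loss: 'categoricalCrossentropy',
});
```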

This process is repeated to generate a character sequence of a given length, hence the "text generation" part of the project. The randomness (diversity) is controlled by a temperature parameter, as sketched below.
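A common way to implement temperature sampling with TensorFlow.js (a minimal sketch, assuming the model's softmax output has already been read back into a plain probability array; not necessarily the repo's exact code):

```javascript
// Draw the next character index from the predicted distribution.
// Higher temperature flattens the distribution => more random-looking text;
// lower temperature sharpens it => more conservative, repetitive text.
function sampleNextCharIndex(probs, temperature) {
  return tf.tidy(() => {
    // Convert probabilities back to logits and rescale by temperature.
    const logits = tf.log(tf.tensor1d(probs)).div(tf.scalar(temperature));
    // tf.multinomial draws one sample index from the logits.
    return tf.multinomial(logits, 1).dataSync()[0];
  });
}
```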

At least 20 epochs (20 full passes over the training set) are required before the generated text starts to sound coherent.
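Training itself can be a standard model.fit call; a minimal sketch (the batch size and callback are illustrative assumptions):

```javascript
// xs: [numExamples, sampleLen, charSetSize], ys: [numExamples, charSetSize]
await model.fit(xs, ys, {
  epochs: 20,     // >= 20 passes before output starts sounding coherent
  batchSize: 128, // assumed value
  callbacks: {
    onEpochEnd: (epoch, logs) => console.log(`epoch ${epoch}: loss=${logs.loss}`),
  },
});
```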


Technical: Input Data for Text Generation

Potential text datasets for testing the model: https://cs.stanford.edu/people/karpathy/char-rnn/

If Yeezy Taught Me is run on new data, make sure the corpus has at least ~100k characters; the ideal is ~1M characters.


Technical: Text Parameters

  • Name of the text dataset for the input file
  • Path to the trained next-character prediction model saved on disk
  • Length of the text to generate
  • Temperature value to use for text generation; higher values give more random-looking results
  • Whether to use a CUDA GPU for training
  • Step length: how many characters to skip between one example and the next (see the sketch below)
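A sketch of how the step length carves the corpus into training examples (the function and variable names are hypothetical, not from the repo):

```javascript
// Slice the corpus into (input window, next character) pairs.
// A larger step yields fewer, less-overlapping examples.
function makeExamples(text, sampleLen, step) {
  const inputs = [];
  const labels = [];
  for (let i = 0; i + sampleLen < text.length; i += step) {
    inputs.push(text.slice(i, i + sampleLen)); // sampleLen-character window
    labels.push(text[i + sampleLen]);          // the character to predict
  }
  return {inputs, labels};
}
```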

Usage

The web demo supports model training and text generation. Model training and inference both run in the browser, and model save/load operations use the browser's IndexedDB database.
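In TensorFlow.js, saving to and loading from IndexedDB is a one-liner each; a sketch (the 'yeezy-model' key is an assumed name):

```javascript
// Persist the trained model in the browser's IndexedDB...
await model.save('indexeddb://yeezy-model');
// ...and restore it in a later session.
const restored = await tf.loadLayersModel('indexeddb://yeezy-model');
```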

To launch the demo, run:

```sh
yarn && yarn watch
```



Conclusion

Yeezy taught me well.


