zengzzzzz / minimalist-gpt


Minimalist-GPT

Code implementation of GPT as a finite-state Markov chain

This repository contains a minimalist PyTorch implementation of a GPT (Generative Pre-trained Transformer) model, named Minimalist-GPT. The implementation is kept deliberately small for educational purposes and includes everything needed to train on a simple binary sequence.

Overview

Minimalist-GPT is designed to demonstrate the core concepts of the GPT architecture in a simplified setting. Because the vocabulary and context window are tiny, every possible context is an explicit state, so the model's next-token distributions can be drawn as a finite-state Markov chain. The implementation includes a basic GPT model, training on a binary sequence, and a visualization of the model's state transitions.
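To make the Markov-chain view concrete, here is a small illustrative sketch (not code from this repository) using the same `vocab_size = 2` and `context_length = 3` as the Usage section: the state space is simply every possible 3-token window, and emitting a token slides the window by one position.

```python
from itertools import product

vocab_size = 2       # binary tokens, as in the repository's example
context_length = 3   # block_size used in the Usage section

# Every possible context window is one state of the chain: 2**3 = 8 states.
states = ["".join(map(str, s))
          for s in product(range(vocab_size), repeat=context_length)]
print(states)  # ['000', '001', '010', '011', '100', '101', '110', '111']

# Emitting a token slides the window one position: '011' followed by 1 -> '111'.
def next_state(state, token):
    return state[1:] + str(token)

print(next_state("011", 1))  # '111'
```

The trained GPT assigns a probability to each outgoing edge of each state, which is what the `states-1.png` / `states-2.png` diagrams visualize.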

Code Structure

The code is organized into the following main components:

  • minmalist_gpt.py: Defines the GPT model (GPT class) and its configuration (GPTConfig class).

  • main.py: Contains functions for training Minimalist-GPT on a binary sequence and for visualizing the model's state transitions.

  • states-1.png: Visualization of Minimalist-GPT's initial state transitions.

  • states-2.png: Visualization of Minimalist-GPT's state transitions after training.

Requirements

torch
graphviz
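Both dependencies can be installed with pip; note that the graphviz Python package is only a wrapper, so `dot.render` also needs the Graphviz system binaries. A possible setup (exact commands depend on your platform):

```shell
# Assumed install commands; versions are unpinned, adjust to your environment.
pip install torch graphviz

# Rendering with dot.render additionally requires the system Graphviz tools,
# e.g. on Debian/Ubuntu:
# sudo apt-get install graphviz
```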

Usage

To use Minimalist-GPT, follow the example in main.py:

# Example usage
from minmalist_gpt import GPT, GPTConfig, plot_model, token_seq_to_tensor, do_training

# Set the vocabulary size, context length, and other configuration parameters
vocab_size = 2
context_length = 3
config = GPTConfig(
    block_size=context_length,
    vocab_size=vocab_size,
    n_layer=4,
    n_head=4,
    n_embd=16,
    bias=False,
)

# Create a new Minimalist-GPT model
gpt = GPT(config)

# Plot the initial state transitions
dot = plot_model()
dot.render('states-1', format='png')

# Prepare training data sequence
seq = list(map(int, "111101111011110"))
print("\nTraining data sequence: ", seq)
X, Y = token_seq_to_tensor(seq)

# Train Minimalist-GPT
do_training(X, Y)

# Plot the state transitions after training
dot = plot_model()
dot.render('states-2', format='png')
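The actual `token_seq_to_tensor` helper is defined in the repository; as a rough guide to what the training pairs look like, here is a hypothetical reimplementation of the standard sliding-window construction under the `block_size = 3` configuration above (the real helper may differ in detail):

```python
import torch

# Hypothetical sketch of the sliding-window construction; the repository's
# own token_seq_to_tensor may differ in detail.
def token_seq_to_tensor(seq, block_size=3):
    # X holds every block_size-long window of the sequence; Y holds the same
    # windows shifted one token to the right (the next-token targets).
    X = torch.tensor([seq[i:i + block_size]
                      for i in range(len(seq) - block_size)])
    Y = torch.tensor([seq[i + 1:i + 1 + block_size]
                      for i in range(len(seq) - block_size)])
    return X, Y

seq = list(map(int, "111101111011110"))  # the 15-token sequence from above
X, Y = token_seq_to_tensor(seq)
print(X.shape, Y.shape)  # 12 windows of 3 tokens each
```

Each row of X is one Markov-chain state, and the corresponding row of Y supplies the next token the model should predict from that state.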

Acknowledgments

This implementation is inspired by the GPT architecture; special thanks to the authors of the original GPT paper.

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

