Tinygrad is cool... or so I've heard. Going to mess around with it here. Will update.
Goal is to get a basic understanding of how it works, and then try to contribute somehow to the project.
Update submodules:
git submodule update --remote
- follow the basic MLP example from the tinygrad README, get it training (rough sketch after this list)
- train a model on MNIST or something
- work through all the docs
- quickstart.md
- abstractions.py
- annotate mlops.py
- I think I should go and annotate the tensor class
- line 470 -> end still TODO
- build a simple dataloader (batching sketch after this list)
- build a conv on MNIST
- here are some papers on best-performing MNIST models
- try to implement a transformer from scratch
- https://github.com/fkodom/transformer-from-scratch
- https://fkodom.substack.com/p/transformers-from-scratch-in-pytorch
- OPTIONAL - reverse engineer and annotate the transformer example
- find a new example model architecture to add to repo with PR
- port something of Lucidrains' over to tinygrad
- fft convolution (numpy sketch of the idea after this list)
- try some different backends, compare
- reverse engineer the symbolic shape library
- reverse engineer the AST linearizer (codegen)
- check out some of the issues on GitHub
- See the CONTRIBUTING.md
- try something cool like sliding window attention, flash attention, rotary (RoPE) embeddings, speculative decoding etc.
- speed up something with this linalg paper
- reverse engineer some of these operator abstractions to see how they work
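
For the MLP item above: a minimal sketch in the style of the (older) README example. I'm assuming `Tensor.uniform`, `optim.SGD`, and the usual zero_grad/backward/step calls still look like this; the API moves fast, so check it against the current README.

```python
# minimal MLP in the style of the old tinygrad README -- the exact API
# (Tensor.uniform, optim.SGD, ...) may have shifted between versions
import numpy as np
from tinygrad.tensor import Tensor
import tinygrad.nn.optim as optim

class TinyBobNet:
  def __init__(self):
    self.l1 = Tensor.uniform(784, 128)
    self.l2 = Tensor.uniform(128, 10)

  def forward(self, x):
    return x.dot(self.l1).relu().dot(self.l2).log_softmax()

model = TinyBobNet()
opt = optim.SGD([model.l1, model.l2], lr=0.001)

# fake batch standing in for MNIST: 32 flattened images + one-hot labels
x = Tensor(np.random.rand(32, 784).astype(np.float32))
y = Tensor(np.eye(10, dtype=np.float32)[np.random.randint(0, 10, size=32)])

# one training step
out = model.forward(x)
loss = (-out * y).sum(axis=1).mean()   # NLL against the log_softmax output
opt.zero_grad()
loss.backward()
opt.step()
print(loss.numpy())
```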
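
For the dataloader item: nothing tinygrad-specific, just a shuffled minibatch generator over numpy arrays, with random data standing in for MNIST.

```python
import numpy as np

def batches(X, Y, bs=64, shuffle=True):
  # yield (x, y) minibatches over a dataset held in numpy arrays
  idx = np.arange(len(X))
  if shuffle: np.random.shuffle(idx)
  for i in range(0, len(idx) - bs + 1, bs):
    j = idx[i:i + bs]
    yield X[j], Y[j]

# random stand-in for MNIST: 60000 flattened 28x28 images + integer labels
X_train = np.random.rand(60000, 784).astype(np.float32)
Y_train = np.random.randint(0, 10, size=60000)

for x, y in batches(X_train, Y_train, bs=128):
  pass  # wrap with Tensor(x) / Tensor(y) and run a training step here
```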
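
For the fft convolution item, the underlying identity checked in plain numpy: a circular convolution is a pointwise multiply in frequency space, so fft -> multiply -> ifft replaces the sliding-window sum (which is what makes it attractive for large kernels).

```python
import numpy as np

N = 256
x = np.random.rand(N)
k = np.random.rand(N)

# direct circular convolution: out[n] = sum_m x[m] * k[(n - m) mod N]
direct = np.array([sum(x[m] * k[(n - m) % N] for m in range(N)) for n in range(N)])

# convolution theorem: circular conv(x, k) == ifft(fft(x) * fft(k))
via_fft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)).real

print(np.allclose(direct, via_fft))  # True
```
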
Flow: graph -> ast -> linearizer -> renderer -> compiler -> runtime
- graph -- the full graph of the computation. note that this covers both the forward and backward passes; they aren't treated differently
- ast -- a single GPU kernel, marked by a single reduce. a schedule runs a graph as a list of ASTs
- linearizer -- turns an AST into a list of UOps (a linear representation of the program)
- renderer -- uops -> the source code of the language (like C code)
- compiler -- source code -> binary (ex: .c -> .so)
- runtime -- run binary on the hardware
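
A cheap way to watch those stages: run a tiny computation with the DEBUG env var turned up (assuming DEBUG still behaves the way I remember -- higher values print the schedule, the generated kernel source, and timings).

```python
# run a tiny computation and let tinygrad print what it builds at each stage.
# assumption: DEBUG is still read from the environment at import time
# (equivalently, run the script as `DEBUG=4 python ...`)
import os
os.environ["DEBUG"] = "4"

from tinygrad.tensor import Tensor

a = Tensor.rand(64, 64)
b = Tensor.rand(64, 64)
out = (a @ b).sum()   # a matmul plus a reduce -> a couple of kernels
print(out.numpy())    # realizing the result walks graph -> ast -> uops -> source -> binary -> run
```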