Shaden Smith's Projects
Advent of Code 2022
Rust wrapper for Apple Matrix Coprocessor (AMX) instructions
Minimalist ML framework for Rust
Exploring phase-space methods for collisionless dark matter simulations
For sorting on-disk CSV files that do not fit into memory
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
Example models using DeepSpeed
The main purpose of the FireHose Streaming Benchmarks is to enable comparison of streaming software and hardware, both quantitatively (the rate at which they can process data) and qualitatively (the effort required to implement and run the benchmarks).
Playing around with GitHub pages.
LLFI is an LLVM-based fault injection tool that injects faults into the LLVM IR of the application source code. Faults can be injected at specific program points, and their effects can be easily traced back to the source code. LLFI is typically used to map fault characteristics back to source code and thereby understand which source-level or program characteristics lead to various kinds of fault outcomes. For more details, see: Anna Thomas and Karthik Pattabiraman, "LLFI: An Intermediate Code-Level Fault Injector," Workshop on Silicon Errors in Logic, System Effects (SELSE), 2013.
Ongoing research training transformer language models at scale, including BERT and GPT-2
A functional language
Musings on debugging DeepSpeed code.
Getting started with Azure Pipelines
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Professional website.
SimTensor: Tensor data generator for evaluation of tensor factorization algorithms
Sparse multi-dimensional arrays for the PyData ecosystem
The Surprisingly ParalleL spArse Tensor Toolkit.
SPLATT source code used in our IPDPS '17 paper.
A streaming implementation of the CPD published in SDM'18.
Noodling with sparse matrix-vector multiplication in Rust
Block-sparse primitives for PyTorch