Nicol Visser's Projects
Duration-penalized dynamic programming (DPDP) autoencoding recurrent neural network (AE-RNN) in Python.
Starter template for CodeSandbox or StackBlitz
An implementation of HuBERT Base in JAX (WIP)
A tool to evaluate framewise speech features for word discrimination on LibriSpeech
Config files for my GitHub profile.
Audio player for web with spectrogram and annotations.
Code for SpeechTokenizer, presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
Library for Textless Spoken Language Processing
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Datasets, Transforms and Models specific to Computer Vision
VQ-VAE for Acoustic Unit Discovery
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.
Navigable waveform built on Web Audio and Canvas
WavLM model from knn-vc repo