Ankit Shah's Projects
A first-of-its-kind acoustic simulation platform for audio-visual embodied AI research. It supports training and evaluating multiple tasks and applications.
This code targets weakly-labeled semi-supervised sound event detection. It implements the methods we proposed for this task: specialized decision surface (SDS) and disentangled feature (DF) for weakly-supervised learning, and guided learning (GL) for semi-supervised learning. We would be glad if you use it for research purposes or DCASE participation.
Python library for downloading, loading & working with sound datasets
Simple Python script to download music from SoundCloud, using the API or HTML scraping 🎧🐍
SoundNet: Learning Sound Representations from Unlabeled Video. NIPS 2016
Starter code for SoundSpaces challenge at CVPR 21's Embodied AI workshop
Dataset for the spaceship task from "Metacontrol for Adaptive Imagination-Based Optimization"
Original implementation of the pooling method introduced in "Speaker embeddings by modeling channel-wise correlations"
Estimating the age, height, and gender of a speaker from their speech signal.
Implementation of Spectral Inference Networks, ICLR 2019
Implementation of the CVPR 2019 Paper - Speech2Face: Learning the Face Behind a Voice by MIT CSAIL
Assignment in speech and audio processing - MATLAB code
Edinburgh Speech Tools
A PyTorch-based Speech Toolkit
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
This repository contains the code to reproduce the core results from the paper "Learning Latent Representations for Speech Generation and Transformation".
Library for faster pinned CPU <-> GPU transfer in PyTorch
The ultimate vim distribution
Implementation for <SphereFace: Deep Hypersphere Embedding for Face Recognition> in CVPR'17.
Tutorials for Spring 2018
Spriteworld: a flexible, configurable Python-based reinforcement learning environment
Implementation of the "Efficient Video Compression via Content-Adaptive Super-Resolution" paper in TensorFlow.
Semi-supervised Domain Adaptation via Minimax Entropy
Code for "Mehta, S. V.*, Lee, J. Y.*, and Carbonell, J. (2018). Towards Semi-Supervised Learning for Deep Semantic Role Labeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 4958-4963)."
Official code for "Self-Supervised driven Consistency Training for Annotation Efficient Histopathology Image Analysis"
Semi-supervised learning for object detection
PyTorch implementation of SSV: Self-Supervised Viewpoint Learning from Image Collections (CVPR 2020)
Linux System Optimizer and Monitoring - https://oguzhaninan.github.io/Stacer-Web
StarNet: Gradient-Free Generative Modeling
Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation