pzhang266

followers: 26 following: 14 repos: 50 gists: 0

Name: Peng Zhang

Type: User

Company: Institute of Automation Chinese Academy of Sciences (CASIA)

Bio: Universal Audio Processing (denoising, source separation, dereverberation ...)

Location: Beijing, China

Peng Zhang's Projects

acoustic-scene-analysis-with-multihead-self-attention

This repo contains an implementation of the paper "Acoustic Scene Analysis With Multihead Self Attention" by Weimin Wang, Weiran Wang, Ming Sun, and Chao Wang from the Amazon Alexa team

aec-challenge

AEC Challenge

audiosetdl

Scripts for downloading AudioSet

av-se

Deep-Learning-Based Audio-Visual Speech Enhancement and Separation

avobjects

Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"

awesome-multimodal-large-language-models

Latest Advances on Multimodal Large Language Models

awesome-speech-enhancement

A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

coder2gwy

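The internet's first guide for programmers taking the civil service exam, jointly presented by three former big-tech programmers who have already joined the public sector.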

deepcomplexcrn

deepxi

Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.

dlib

A toolkit for making real world machine learning and data analysis applications in C++

dnn_wpe

emgfilters

Filter functions for processing EMG signals.

fast_bss_eval

A fast implementation of bss_eval metrics for blind source separation

fullsubnet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

gpurir

Python library for Room Impulse Response (RIR) simulation with GPU acceleration

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

independent_component_analysis

From scratch Python implementation of the fast ICA algorithm.

libfacedetection

An open source library for face detection in images. The face detection speed can reach 1000FPS.

lip-reading-deeplearning

Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

lipnet-pytorch

The state-of-the-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxiv.org/abs/1611.01599)

lora

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

ml-nlp

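This project covers the knowledge points and code implementations commonly tested in Machine Learning, Deep Learning, and NLP interviews, which are also the essential theoretical foundations for an algorithm engineer.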

mtadam

MTAdam: Automatic Balancing of Multiple Training Loss Terms

mtfaa-net

Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement

neural-speech-dereverberation

Machine and Deep Learning models for speech dereverberation

open_flamingo

An open-source framework for training large multimodal models.

optical-flow-guided-feature

Implementation code for the paper "Optical Flow Guided Feature", CVPR 2018

parallelwavegan

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch

pedalboard

🎛 🔊 A Python library for adding effects to audio.
