li563042811 Goto Github PK

followers: 0 following: 6 repos: 102 gists: 0

Name: Jason's Lab

Type: User

Jason's Lab's Projects

allava

Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model

asr-decoder

An ASR decoder and graph-building project

asr_theory

Speech recognition theory, papers, and slides (PPT)

asrt_speechrecognition

A Deep-Learning-Based Chinese Speech Recognition System

av_hubert

A self-supervised learning framework for audio-visual speech

avsr-conformer

AVSR with NIA

avsr-deep-speech

Google Summer of Code 2017 Project: Development of Speech Recognition Module for Red Hen Lab

avsr-tf1

Audio-Visual Speech Recognition using Sequence to Sequence Models

avsr_papers

This repository mainly collects papers on transformation between three modalities: audio, visual, and text.

awesome

💻 🎉 An awesome & curated list of best applications and tools for Windows.

awesome-multimodal-large-language-models

✨✨ Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.

baichuan-7b

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

bazel

a fast, scalable, multi-language and extensible build system

bilstm_cnn_crf_cws

BiLSTM+CNN+CRF word segmentation for the legal-document domain (contract cases), with 100 annotated samples

cif-hieradist

[INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation

colossalai

Making big AI models cheaper, easier, and more scalable

conference-acceptance-rate

Acceptance rates for the major AI conferences

d3d

The proposed method in LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild

damaihelper

Ticket-grabbing script for concerts and shows, supporting multiple platforms such as 大麦网 (Damai), 淘票票 (Taopiaopiao), and 缤玩岛

deep_avsr

A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.

deepspeech

DeepSpeech is an open source speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in PyTorch

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

discriminative-multi-modality-speech-recognition

TF code for our CVPR2020 paper "Discriminative Multi-modality Speech Recognition"

e2e_lfmmi

E2E system with LF-MMI; word N-gram for Mandarin

eesen

The official repository of the Eesen project

espnet

End-to-End Speech Processing Toolkit

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

funasr

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.
