satifi2 / voicelab

voiceLab

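A speech recognition and generation project.

Tencent Cloud - NVIDIA A10

Notes from renting the instance: choose account/password login rather than SSH keys, otherwise setup takes a lot of time. Pay for traffic by usage and max out the bandwidth. Get a larger disk: leave 100 GB for the deep learning libraries and 100 GB for the datasets.

Essential commands for quick setup [if a command is reported as not installed, just install it]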

Check the configuration:
'''
free -h       # memory
nvidia-smi    # GPU and driver
lscpu         # CPU
lsblk         # disks and partitions
glances       # live system monitor
'''
Download the dataset (AISHELL-1, OpenSLR resource 33):
'''
wget https://openslr.magicdatatech.com/resources/33/data_aishell.tgz
tar -xzvf data_aishell.tgz
'''
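After the top-level archive is unpacked, the audio in AISHELL-1 is, as far as I know, still packed as one `.tar.gz` per speaker under `data_aishell/wav`. The following is a minimal sketch for unpacking those inner archives. The path `data_aishell/wav` and the helper name `extract_nested_archives` are assumptions for illustration, not part of this repository.

```python
import tarfile
from pathlib import Path


def extract_nested_archives(wav_dir: Path) -> int:
    """Extract every *.tar.gz directly under wav_dir into wav_dir; return the count."""
    count = 0
    for archive in sorted(wav_dir.glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            # Use the safer "data" extraction filter where this Python supports it.
            if hasattr(tarfile, "data_filter"):
                tar.extractall(path=wav_dir, filter="data")
            else:
                tar.extractall(path=wav_dir)
        count += 1
    return count


if __name__ == "__main__":
    # Assumed layout after `tar -xzvf data_aishell.tgz`; adjust if yours differs.
    n = extract_nested_archives(Path("data_aishell/wav"))
    print(f"extracted {n} speaker archives")
```

If the directory does not exist or holds no archives, the function simply returns 0.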
Install the Anaconda and PyTorch environment. The Anaconda3 installer and the PyTorch version shown here may not be the latest; check the Anaconda and PyTorch websites for current versions.
'''
wget https://repo.anaconda.com/archive/Anaconda3-2023.03-Linux-x86_64.sh
bash Anaconda3-2023.03-Linux-x86_64.sh
source ~/.bashrc
conda create -n voicelab
conda activate voicelab
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
'''
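Once the environment is installed, it can be worth confirming that PyTorch actually sees the GPU before starting a long run. A small sketch, assuming it is run inside the `voicelab` environment; the function name `describe_torch_env` is made up for this example.

```python
import importlib.util


def describe_torch_env() -> dict:
    """Report whether PyTorch is importable and whether it can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    print(describe_torch_env())
```

If `cuda_available` comes back `False` on the A10 instance, a driver and `pytorch-cuda` version mismatch is a likely cause to check first.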
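Remote connection:
'''
# VS Code: install the Remote Development extension, put the username and hostname in its configuration file, then enter the password when connecting
'''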
Install the additional libraries:
'''
conda install -c conda-forge librosa
pip install jiwer
pip install fast-ctc-decode
'''

Key run screenshots

Dataset preprocessing

transformer-crossentropy

Initial test

CER (character error rate)
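For reference, CER is the character-level edit distance between the reference and the hypothesis, divided by the reference length. The setup above installs `jiwer` for this; the sketch below is a dependency-free illustration of the same metric, not the evaluation code used in this project.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One wrong character out of six.
    print(cer("今天天气很好", "今天天汽很好"))
```

Because insertions count as errors, CER can exceed 1.0 when the hypothesis is much longer than the reference.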

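Four training stages of the conv-transformer

The model has 460 million parameters.

Stage 1: random predictions

Stage 2: everything is predicted as eos. This is alarming, but don't panic: across my multiple training runs the model converged every time

Stage 3: the loss decreases and common words appear at the start of the output, while the rest is still eos

Stage 4: the model begins to converge, meaningful information starts to emerge, and the loss drops sharply

Training finished; the cross-entropy loss dropped to 0.02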
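One way to tell these stages apart without reading every sample is to track how much of each predicted sequence is eos and how many tokens come before the first eos. This is a hypothetical monitoring sketch; the token id `EOS_ID = 2` and both helper names are assumptions, not values taken from this repository.

```python
EOS_ID = 2  # assumed id of the eos token; use the id from your own vocabulary


def eos_fraction(token_ids: list, eos_id: int = EOS_ID) -> float:
    """Share of positions in a predicted sequence that are eos."""
    if not token_ids:
        return 0.0
    return sum(1 for t in token_ids if t == eos_id) / len(token_ids)


def leading_content_length(token_ids: list, eos_id: int = EOS_ID) -> int:
    """Number of tokens predicted before the first eos."""
    for i, t in enumerate(token_ids):
        if t == eos_id:
            return i
    return len(token_ids)


if __name__ == "__main__":
    stage2_like = [2, 2, 2, 2, 2, 2]   # everything is eos
    stage3_like = [7, 9, 2, 2, 2, 2]   # a few real tokens, then eos
    print(eos_fraction(stage2_like), leading_content_length(stage2_like))
    print(eos_fraction(stage3_like), leading_content_length(stage3_like))
```

An eos fraction near 1.0 corresponds to the all-eos stage; a growing leading content length suggests the model is moving into the later stages.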

transformer-ctc
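For the CTC variant, the model emits one label per frame, and decoding has to merge repeated labels and drop blanks. The simplest form is greedy collapsing, sketched below. This is an illustration only: the setup installs `fast-ctc-decode`, whose decoder may well be what the project uses, and the choice of `blank_id=0` is an assumption.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids: list, blank_id: int = 0) -> list:
    """Collapse per-frame argmax labels: merge consecutive repeats, then drop blanks."""
    return [label for label, _ in groupby(frame_ids) if label != blank_id]


if __name__ == "__main__":
    # A blank between two identical labels keeps them as two separate tokens.
    frames = [0, 1, 1, 0, 1, 2, 2, 0]
    print(ctc_greedy_collapse(frames))
```

The input is the per-frame argmax of the model output; beam search decoders generally give better results at a higher cost.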

Reference papers

Word segmentation is not recommended for Chinese

[1905.05526] Is Word Segmentation Necessary for Deep Learning of Chinese Representations? (arxiv.org)
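Following that recommendation, the transcripts can be tokenized per character instead of per word. A minimal sketch of a character-level vocabulary and encoder; the special tokens and their ids are assumptions chosen for the example, not this project's actual vocabulary.

```python
SPECIALS = ("<pad>", "<sos>", "<eos>", "<unk>")  # assumed special tokens


def build_char_vocab(transcripts: list) -> dict:
    """Map each distinct non-space character to an id, after the special tokens."""
    vocab = {tok: i for i, tok in enumerate(SPECIALS)}
    for text in transcripts:
        for ch in text:
            if not ch.isspace() and ch not in vocab:
                vocab[ch] = len(vocab)
    return vocab


def encode(text: str, vocab: dict) -> list:
    """Turn a transcript into ids, wrapped in <sos> and <eos>."""
    unk = vocab["<unk>"]
    body = [vocab.get(ch, unk) for ch in text if not ch.isspace()]
    return [vocab["<sos>"]] + body + [vocab["<eos>"]]


if __name__ == "__main__":
    vocab = build_char_vocab(["你好", "你们 好"])
    print(encode("你好吗", vocab))  # the unseen character maps to <unk>
```

Whitespace is dropped because some transcripts are stored with spaces between words, which a character-level model does not need.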

Neural machine translation

[1409.0473] Neural Machine Translation by Jointly Learning to Align and Translate (arxiv.org)

transformer
