
tubbz-alt / zhrtvc


This project forked from xingmegshuo/zhrtvc


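Chinese real-time voice cloning (VC) and Chinese text-to-speech (TTS). An easy-to-use Chinese voice cloning and speech synthesis system, comprising a speech encoder, a speech synthesizer, a vocoder, and a visualization module.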

Python 99.95% Shell 0.05%

zhrtvc's Introduction

zhrtvc


Chinese Real Time Voice Cloning

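tips: zh is the language abbreviation for Chinese.

Follow the WeChat official account 【啊啦嘻哈】 and reply with the single character 【】 to get a message from the little mascot ^v^

Version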

v1.1.5

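See the readme for usage instructions and notes.

tips: run the code from inside the project's code subdirectory 【zhrtvc】.

  • Comparison samples of original and cloned speech

Link: https://pan.baidu.com/s/1TQwgzEIxD2VBrVZKCblN1g

Extraction code: 8ucd

  • Chinese speech corpus

The zhvoice Chinese speech corpus has clearer, more natural speech. It comprises 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.

The zhvoice corpus can be used to train the base model for voice cloning.

  • Speech synthesizer model trained on a Chinese speech corpus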

name: logs-synx.zip

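Trained and shared by 智浪淘沙.

A speech synthesizer model trained on a parallel corpus of Chinese text and speech.

Link: https://pan.baidu.com/s/1ovtu1n3eF7y0JzSxstQC7w

Extraction code: f4jx

  • Speech encoder model trained on open-source Chinese speech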

name: ge2e_pretrained_iwater.pt

Trained and shared by iwater.

A speech encoder model trained on open-source Chinese speech corpora.

Link: https://pan.baidu.com/s/1-5r_YXQOg2vZnuEh1Slxaw

Extraction code: 19kh

  • toolbox


  • Synthesis samples

aliaudio-Aibao-004113.wav

aliaudio-Aimei-007261.wav

aliaudio-Aina-000819.wav

aliaudio-Aiqi-009619.wav

aliaudio-Aitong-003149.wav

aliaudio-Aiwei-009461.wav

  • Note

When running the provided models, the Griffin-Lim vocoder is recommended; MelGAN and WaveRNN are not yet fully adapted.

Directory overview

zhrtvc

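Code, including the encoder, synthesizer, vocoder, and toolbox modules, covering both model training and visualized speech synthesis.

Scripts must be run from inside the zhrtvc directory.

See the readme file in the zhrtvc directory for code-related details.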

models

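Pretrained models for the encoder, synthesizer, and vocoder.

Download the pretrained models from Baidu Netdisk, unzip them, and replace the models folder with the result.

  • Sample models

Link: https://pan.baidu.com/s/14hmJW7sY5PYYcCFAbqV0Kw

Extraction code: zl9i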
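The unzip-and-replace step described above can be sketched in Python as follows. This is a minimal sketch, not part of the project: the function name `replace_folder_from_zip` is hypothetical, and it assumes the downloaded archive is a zip file with a single top-level directory named after the folder it replaces. The real archive layout is not documented here, so check it before relying on this.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def replace_folder_from_zip(archive_path, project_root, folder_name="models"):
    """Unpack archive_path and replace project_root/folder_name with its contents.

    Assumption: the archive holds one top-level directory called folder_name.
    """
    archive_path = Path(archive_path)
    target = Path(project_root) / folder_name

    with tempfile.TemporaryDirectory() as tmp:
        # Extract to a scratch directory first so a bad archive
        # never leaves the project with a half-written folder.
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(tmp)

        extracted = Path(tmp) / folder_name
        if not extracted.is_dir():
            raise FileNotFoundError(
                f"{archive_path} has no top-level '{folder_name}' directory"
            )

        # Remove the old folder only after extraction has succeeded.
        if target.exists():
            shutil.rmtree(target)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(extracted), str(target))

    return target
```

Calling `replace_folder_from_zip("models.zip", ".")` from the project root would swap in the downloaded models; the archive file name here is only an example.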

data

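Corpus samples, including aligned speech and text, plus preprocessed sample data for training the synthesizer.

You can run synthesizer_preprocess_audio.py and synthesizer_preprocess_embeds.py directly to convert the aligned speech-text corpus in samples into the SV2TTS data used to train the synthesizer.

Download the corpus samples from Baidu Netdisk, unzip them, and replace the data folder with the result.

  • Sample data

Link: https://pan.baidu.com/s/1Q_WUrmb7MW_6zQSPqhX9Vw

Extraction code: bivr

Note: the corpus samples are only for checking that the pipeline runs end to end. The data set is far too small for the model to converge, so it will not produce a usable model. Once the test run works, convert your own data to the same format as the corpus samples and train the model on it.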
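The two preprocessing scripts named above have to be run from inside the zhrtvc code directory, audio first and embeds second. A minimal sketch of that sequence follows. The helper names `build_commands` and `run_preprocessing` are hypothetical, the scripts are assumed to sit directly in the code directory, and no command-line arguments are passed because none are documented here; check the readme in the zhrtvc directory for the real options.

```python
import subprocess
import sys
from pathlib import Path

# Script names come from the text above; their location directly inside
# the zhrtvc code directory is an assumption.
PREPROCESS_SCRIPTS = (
    "synthesizer_preprocess_audio.py",
    "synthesizer_preprocess_embeds.py",
)


def build_commands(python=sys.executable):
    """Return one command per preprocessing script, in the order they run."""
    # No arguments are added: the scripts' options are not documented here.
    return [[python, script] for script in PREPROCESS_SCRIPTS]


def run_preprocessing(code_dir, dry_run=True):
    """Print (dry_run=True) or execute the preprocessing commands in code_dir."""
    code_dir = Path(code_dir)
    commands = build_commands()
    for command in commands:
        if dry_run:
            print(f"(cd {code_dir} && {' '.join(command)})")
        else:
            # cwd mirrors "enter the zhrtvc directory before running scripts";
            # check=True stops at the first script that fails.
            subprocess.run(command, cwd=code_dir, check=True)
    return commands
```

With the default `dry_run=True`, `run_preprocessing("zhrtvc")` only prints the commands it would run, which is a safe way to confirm the paths before executing anything.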

Discussion

【AI解决方案交流群】 (AI Solutions Discussion Group) QQ group: 925294583

Click the link to join the group chat: https://jq.qq.com/?_wv=1027&k=wlQzvT0N

Real-Time Voice Cloning

This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real time. Feel free to check my thesis if you're curious or if you're looking for info I haven't documented yet (don't hesitate to open an issue for that too). Mostly I would recommend taking a quick look at the figures beyond the introduction.

SV2TTS is a three-stage deep learning framework that makes it possible to create a numerical representation of a voice from a few seconds of audio, and to use that representation to condition a text-to-speech model trained to generalize to new voices.
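The data flow between the three stages can be sketched as below. This illustrates only the structure described above and is not this repository's API: `VoiceCloningPipeline` and the three `toy_*` functions are hypothetical stand-ins that use plain Python lists where the real models use audio arrays, speaker embeddings, and mel spectrograms.

```python
from dataclasses import dataclass
from typing import Callable, List

Waveform = List[float]
Embedding = List[float]
MelSpectrogram = List[List[float]]


@dataclass
class VoiceCloningPipeline:
    """Chains the three SV2TTS stages; each stage is any callable."""
    encoder: Callable[[Waveform], Embedding]
    synthesizer: Callable[[str, Embedding], MelSpectrogram]
    vocoder: Callable[[MelSpectrogram], Waveform]

    def clone(self, reference_audio: Waveform, text: str) -> Waveform:
        # Stage 1: a few seconds of audio -> numerical voice representation.
        embedding = self.encoder(reference_audio)
        # Stage 2: text conditioned on that representation -> spectrogram.
        mel = self.synthesizer(text, embedding)
        # Stage 3: spectrogram -> audible waveform.
        return self.vocoder(mel)


# Toy stand-ins (hypothetical) that only demonstrate the data flow.
def toy_encoder(audio: Waveform) -> Embedding:
    return [sum(audio) / len(audio)]  # mean sample as a 1-d "embedding"


def toy_synthesizer(text: str, embedding: Embedding) -> MelSpectrogram:
    return [[embedding[0]] for _ in text]  # one "frame" per character


def toy_vocoder(mel: MelSpectrogram) -> Waveform:
    return [frame[0] for frame in mel]  # one sample per frame


pipeline = VoiceCloningPipeline(toy_encoder, toy_synthesizer, toy_vocoder)
cloned = pipeline.clone([0.0, 1.0], "你好")
```

Because the stages only meet at the embedding and the spectrogram, each one can be trained or swapped independently, which is why a different vocoder can be paired with the same encoder and synthesizer.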

Papers implemented

URL | Designation | Title | Implementation source
1806.04558 | SV2TTS | Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis | This repo
1802.08435 | WaveRNN (vocoder) | Efficient Neural Audio Synthesis | fatchord/WaveRNN
1712.05884 | Tacotron 2 (synthesizer) | Natural TTS Synthesis by Conditioning Wavenet on Mel Spectrogram Predictions | Rayhane-mamah/Tacotron-2
1710.10467 | GE2E (encoder) | Generalized End-To-End Loss for Speaker Verification | This repo

zhrtvc's People

Contributors

dependabot[bot], kquark, kuangdd, trojblue
