Multi-Tacotron Voice Cloning Telegram Bot

This repository is a phonemic multilingual (Russian-English) implementation based on Multi-Tacotron-Voice-Cloning. It is a Telegram bot that uses the toolbox from the original project to clone Russian and English speech and perform TTS.

Example

Requirements

You will need the following whether you plan to use the bot, the toolbox only or to retrain the models.

Python 3.6 or 3.7 (3.6 ≤ Python version ≤ 3.7).

PyTorch (>=1.0.1).

Run pip install -r requirements.txt to install the necessary packages.

If you plan to use bot.py, you will need to create a new bot with @BotFather and get your private TOKEN for it. A link to a tutorial is in the Wiki section below.

A GPU is mandatory, but you don't necessarily need a high-tier GPU if you only want to use the bot or the toolbox.
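The requirements above can be checked before launching anything. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `check_environment` are hypothetical, and the version bounds are simply the ones listed above.

```python
import sys


def parse_version(text):
    """Turn a version string such as '1.13.0+cu117' into a tuple of ints."""
    parts = []
    # Drop any local build suffix after '+', then read leading digits of each piece.
    for piece in text.split("+")[0].split("."):
        leading = ""
        for ch in piece:
            if not ch.isdigit():
                break
            leading += ch
        if not leading:
            break
        parts.append(int(leading))
    return tuple(parts)


def check_environment():
    """Return a list of problems found; an empty list means every check passed."""
    problems = []

    # Python 3.6 or 3.7, as stated in the requirements.
    if not ((3, 6) <= sys.version_info[:2] <= (3, 7)):
        found = "{}.{}".format(*sys.version_info[:2])
        problems.append("Python 3.6 or 3.7 expected, found " + found)

    # PyTorch >= 1.0.1.
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
        return problems
    if parse_version(torch.__version__) < (1, 0, 1):
        problems.append("PyTorch >= 1.0.1 expected, found " + str(torch.__version__))

    # A GPU is mandatory.
    if not torch.cuda.is_available():
        problems.append("No CUDA-capable GPU is visible to PyTorch")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script prints one warning per unmet requirement and nothing if the environment looks suitable.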

Pretrained models

Download the latest here.

Datasets

Name	Language	Link	Comments	My link	My comments
Phoneme dictionary	En, Ru	En,Ru	Phoneme dictionary	link	Combined the Russian and English phoneme dictionaries
LibriSpeech	En	link	300 speakers, 360h clean speech
VoxCeleb	En	link	7000 speakers, many hours bad speech
M-AILABS	Ru	link	3 speakers, 46h clean speech
open_tts, open_stt	Ru	open_tts, open_stt	many speakers, many hours bad speech	link	Cleaned 4 hours of speech from one speaker. Fixed the annotation and split it into segments of up to 7 seconds
Voxforge+audiobook	Ru	link	Many speakers, 25h various quality	link	Selected the good files. Split them into segments. Added audiobooks from the internet. The result is 200 speakers with a couple of minutes each
RUSLAN	Ru	link	One speaker, 40h good speech	link	Re-encoded to 16 kHz
Mozilla	Ru	link	50 speakers, 30h good speech	link	Re-encoded to 16 kHz and sorted the different users into folders
Russian Single	Ru	link	One speaker, 9h good speech	link	Re-encoded to 16 kHz
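Several of the prepared datasets above were split into short segments (up to 7 seconds for open_tts). The README does not say how the splitting was done, so the following is only a rough sketch of one naive approach: a fixed-length cut of a PCM WAV file using Python's standard `wave` module. The function name `split_wav` and the output naming scheme are hypothetical, and a real pipeline would more likely cut on pauses so that words are not split.

```python
import wave
from pathlib import Path


def split_wav(src_path, out_dir, max_seconds=7.0):
    """Split a PCM WAV file into consecutive chunks of at most max_seconds each.

    Returns the list of chunk paths in order. Cuts are fixed-length and ignore
    word boundaries.
    """
    src_path = Path(src_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []

    with wave.open(str(src_path), "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(params.framerate * max_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of file reached
            out_path = out_dir / "{}_{:04d}.wav".format(src_path.stem, index)
            with wave.open(str(out_path), "wb") as dst:
                # Keep the source format; the frame count is fixed up on close.
                dst.setnchannels(params.nchannels)
                dst.setsampwidth(params.sampwidth)
                dst.setframerate(params.framerate)
                dst.writeframes(frames)
            written.append(out_path)
            index += 1

    return written
```

For example, `split_wav("speech.wav", "segments")` would write `segments/speech_0000.wav`, `segments/speech_0001.wav`, and so on. Re-encoding to 16 kHz is not covered here, since it needs a resampling library or an external tool.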

Bot

You can then try the bot:

python bot.py
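The bot needs the private TOKEN mentioned under Requirements. How bot.py actually reads the token is not described here, so the following is only a sketch of one common approach: keep the token out of the source code and read it from an environment variable. The variable name `TELEGRAM_BOT_TOKEN`, the helper `load_token`, and the loose shape check are all assumptions made for illustration.

```python
import os
import re

# Loose shape check: digits, a colon, then letters, digits, '_' or '-'.
# This is an assumption about how tokens look; it only catches obvious
# copy-paste mistakes and does not prove that the token is valid.
TOKEN_PATTERN = re.compile(r"^\d+:[A-Za-z0-9_-]+$")


def load_token(env_var="TELEGRAM_BOT_TOKEN"):
    """Read the bot token from the environment, failing early with a clear message."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            "Set the {} environment variable to your bot token".format(env_var)
        )
    if not TOKEN_PATTERN.match(token):
        raise RuntimeError(
            "The value of {} does not look like a bot token".format(env_var)
        )
    return token
```

With this approach the token is exported in the shell before the bot is started, and it never has to be committed to the repository.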

Toolbox

You can then try the toolbox:

python demo_toolbox.py -d <datasets_root>
or
python demo_toolbox.py

Wiki

Tutorial on how to get your private TOKEN for a chat bot: https://www.siteguarding.com/en/how-to-get-telegram-bot-api-token

Pretrained models

Training (and for other languages), Russian version

Training (and for other languages)

Papers implemented

URL	Designation	Title	Implementation source
1806.04558	SV2TTS	Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis	CorentinJ
1802.08435	WaveRNN (vocoder)	Efficient Neural Audio Synthesis	fatchord/WaveRNN
1712.05884	Tacotron 2 (synthesizer)	Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions	Rayhane-mamah/Tacotron-2
1710.10467	GE2E (encoder)	Generalized End-To-End Loss for Speaker Verification	CorentinJ
