shigabeev / q-vits2-voice-cloning

This project forked from fenrlr/mb-istft-vits2

WIP: VITS 2 with quantized output of text-encoder and voice cloning

License: MIT License

Python 96.09% Jupyter Notebook 3.53% Cython 0.37%

The ultimate VITS2

The goal of this repo is to implement the most comprehensive VITS2 out there.

Changelist

Prerequisites

Python >= 3.8
CUDA
PyTorch version 1.13.1 (+cu117)
Clone this repository.
Install the Python requirements.
```
pip install -r requirements.txt
```
To work with the cleaned texts in filelists, you also need to install espeak.
```
apt-get install espeak
```
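The prerequisites above can be sanity-checked before training. The following is a minimal sketch, not part of the repository: the function name `check_environment` is hypothetical, and it only tests whether a `torch` package is installed and whether an `espeak` executable is on `PATH`, without verifying the exact PyTorch or CUDA versions.

```python
import importlib.util
import shutil
import sys


def check_environment(min_python=(3, 8)):
    """Return simple pass/fail flags for the prerequisites listed above."""
    return {
        # Interpreter version, compared as (major, minor).
        "python_ok": sys.version_info[:2] >= tuple(min_python),
        # espeak is needed when working with the cleaned texts in filelists.
        "espeak_on_path": shutil.which("espeak") is not None,
        # Only checks that a torch package can be found; it is not imported.
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Any flag reported as `MISSING` points back to the corresponding install step above.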
Prepare datasets & configuration
1. Prepare wav files (22050 Hz, mono, PCM-16).
2. Prepare text files: one for training and one for validation. Split your dataset between the two files. The validation file should contain fewer entries than the training file, and none of its entries should appear in the training file.
  - Single speaker
```
wavfile_path|transcript
```
  - Multi speaker
```
wavfile_path|speaker_id|transcript
```
3. Run preprocessing with a cleaner of your interest. You may change the symbols as well.
  - Single speaker
```
python preprocess.py --text_index 1 --filelists PATH_TO_train.txt --text_cleaners CLEANER_NAME
python preprocess.py --text_index 1 --filelists PATH_TO_val.txt --text_cleaners CLEANER_NAME
```
  - Multi speaker
```
python preprocess.py --text_index 2 --filelists PATH_TO_train.txt --text_cleaners CLEANER_NAME
python preprocess.py --text_index 2 --filelists PATH_TO_val.txt --text_cleaners CLEANER_NAME
```
  The resulting cleaned text should look like the single-speaker and multi-speaker example filelists in the repository.
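The dataset requirements above (wav format, filelist layout, disjoint train and validation sets) can be checked with the standard library alone. This is a minimal sketch under stated assumptions: the helper names `wav_matches_spec`, `parse_line`, and `split_filelist` are hypothetical and not part of the repository, and the 5% validation fraction is only an illustrative default. Note that the `--text_index` values used in the preprocessing commands (1 for single speaker, 2 for multi speaker) line up with the zero-based position of the transcript field in each format.

```python
import random
import wave


def wav_matches_spec(path, rate=22050, channels=1, sample_width_bytes=2):
    """True if the file is a PCM WAV with the expected rate, channels and bit depth."""
    try:
        with wave.open(path, "rb") as wav:
            return (
                wav.getframerate() == rate
                and wav.getnchannels() == channels
                and wav.getsampwidth() == sample_width_bytes  # 2 bytes = 16 bit
            )
    except (wave.Error, EOFError):
        # Not a readable PCM WAV file.
        return False


def parse_line(line, multi_speaker=False):
    """Split 'path|transcript' or 'path|speaker_id|transcript' into fields."""
    expected = 3 if multi_speaker else 2
    # Limit the number of splits so a '|' inside the transcript is preserved.
    parts = line.rstrip("\n").split("|", expected - 1)
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}: {line!r}")
    return parts


def split_filelist(lines, val_fraction=0.05, seed=0):
    """Return disjoint (train, val) lists with val being the smaller one."""
    unique = sorted({line.strip() for line in lines if line.strip()})
    random.Random(seed).shuffle(unique)
    n_val = max(1, int(len(unique) * val_fraction))
    return unique[n_val:], unique[:n_val]
```

Deduplicating before the split guarantees that no validation entry also appears in the training file, which is the property the instructions above ask for.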
Build Monotonic Alignment Search.

```
# Cython-version Monotonic Alignment Search
cd monotonic_align
# -p avoids an error if the directory already exists
mkdir -p monotonic_align
python setup.py build_ext --inplace
```
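For readers unfamiliar with what the compiled module computes, here is a plain-Python sketch of the general Monotonic Alignment Search dynamic program as described for Glow-TTS and VITS. It is not the repository's Cython code, the function name `maximum_path` and the list-of-lists input are assumptions for illustration, and it is far slower than the compiled version. Given a score `value[y][x]` for aligning frame `y` to text token `x`, it finds the best path that starts at token 0, ends at the last token, and at each frame either stays on the current token or advances by one.

```python
def maximum_path(value):
    """Return path[y] = token index assigned to frame y (monotonic, no skips)."""
    t_y, t_x = len(value), len(value[0])
    if t_x > t_y:
        raise ValueError("need at least as many frames as tokens")
    neg_inf = float("-inf")

    # q[y][x]: best cumulative score of any valid path ending at (y, x).
    q = [[neg_inf] * t_x for _ in range(t_y)]
    q[0][0] = value[0][0]
    for y in range(1, t_y):
        for x in range(min(t_x, y + 1)):  # token x is unreachable before frame x
            stay = q[y - 1][x]
            advance = q[y - 1][x - 1] if x > 0 else neg_inf
            q[y][x] = value[y][x] + max(stay, advance)

    # Backtrack from the last frame and last token.
    path = [0] * t_y
    x = t_x - 1
    for y in range(t_y - 1, -1, -1):
        path[y] = x
        # Step back a token when forced (x == y) or when advancing scored better.
        if x > 0 and (x == y or q[y - 1][x] < q[y - 1][x - 1]):
            x -= 1
    return path
```

The double loop over frames and tokens is what makes a compiled implementation worthwhile during training.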

Edit the configuration to match the filelists and cleaners you used.

Setting json file in configs

| Model | How to set up json file in configs | Sample of json file configuration |
| --- | --- | --- |
| iSTFT-VITS2 | `"istft_vits": true,` `"upsample_rates": [8,8],` | istft_vits2_base.json |
| MB-iSTFT-VITS2 | `"subbands": 4,` `"mb_istft_vits": true,` `"upsample_rates": [4,4],` | mb_istft_vits2_base.json |
| MS-iSTFT-VITS2 | `"subbands": 4,` `"ms_istft_vits": true,` `"upsample_rates": [4,4],` | ms_istft_vits2_base.json |
| Mini-iSTFT-VITS2 | `"istft_vits": true,` `"upsample_rates": [8,8],` `"hidden_channels": 96,` `"n_layers": 3,` | mini_istft_vits2_base.json |
| Mini-MB-iSTFT-VITS2 | `"subbands": 4,` `"mb_istft_vits": true,` `"upsample_rates": [4,4],` `"hidden_channels": 96,` `"n_layers": 3,` `"upsample_initial_channel": 256,` | mini_mb_istft_vits2_base.json |
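The settings in the table can also be applied programmatically to a loaded config. The sketch below is illustrative only: the function name `apply_variant` is hypothetical, and placing the keys under a `"model"` section is an assumption about the JSON layout, so check the sample config files before relying on it. It also does not reset flags left over from a different variant (for example an existing `"mb_istft_vits": true`); those would need to be cleared by hand.

```python
import copy

# Key/value pairs taken from the table above, one entry per model variant.
VARIANT_OVERRIDES = {
    "iSTFT-VITS2": {"istft_vits": True, "upsample_rates": [8, 8]},
    "MB-iSTFT-VITS2": {"subbands": 4, "mb_istft_vits": True, "upsample_rates": [4, 4]},
    "MS-iSTFT-VITS2": {"subbands": 4, "ms_istft_vits": True, "upsample_rates": [4, 4]},
    "Mini-iSTFT-VITS2": {
        "istft_vits": True,
        "upsample_rates": [8, 8],
        "hidden_channels": 96,
        "n_layers": 3,
    },
    "Mini-MB-iSTFT-VITS2": {
        "subbands": 4,
        "mb_istft_vits": True,
        "upsample_rates": [4, 4],
        "hidden_channels": 96,
        "n_layers": 3,
        "upsample_initial_channel": 256,
    },
}


def apply_variant(config, variant, section="model"):
    """Return a copy of config with the variant's settings merged into `section`."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    updated.setdefault(section, {}).update(copy.deepcopy(VARIANT_OVERRIDES[variant]))
    return updated
```

A typical use would be to read a base file with `json.load`, call `apply_variant(config, "MB-iSTFT-VITS2")`, and write the result back with `json.dump`.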

Training Example

```
# train_ms.py for multi speaker
# train_l.py to use Lightning
python train_ms.py -c configs/shergin_d_vector_hfg.json -m models/test
```

Contact

If you have any questions about running it, contact us on Telegram:

https://t.me/voice_stuff_chat

Credits
