GithubHelp home page GithubHelp logo

xiaoyangnihao / tacotron-2_melgan Goto Github PK

View Code? Open in Web Editor NEW

This project forked from wqt2019/tacotron-2_melgan

0.0 0.0 0.0 367 KB

tacotron-2(tensorflow) + melgan(pytorch) chinese TTS

License: MIT License

Python 99.85% Dockerfile 0.15%

tacotron-2_melgan's Introduction

tacotron-2(tensorflow) + melgan(pytorch) chinese TTS:

melgan is very faster than other vocoders and the quality is not so bad. re-implement the split_func in tacotron2 that tensorflow serving not support , re-implement the nn.ReflectionPad1d that tensorrt not support. modify the melgan's input from [-12,2] to [-4,4] that match the tacotron2's output.

python37,biaobei chinese dataset,tacotron2 support chinese pinyin or chinese phone + rhythm training(default is phone + rhythm),edit symbols.py and text.py

pinyin:
000001,ka2 er2 pu3 pei2 wai4 sun1 wan2 hua2 ti1
000002,jia2 yu3 cun1 yan2 bie2 zai4 yong1 bao4 wo3
000003,bao2 ma3 pei4 gua4 bo3 luo2 an1 diao1 chan2 yuan4 zhen3 dong3 weng1 ta4

phone + rhythm(dictionary.txt):
000001,k a2 er2 p u3 #2 p ei2 uai4 s uen1 #1 uan2 h ua2 t i1 #4 。
000002,j ia2 v3 c uen1 ian2 #2 b ie2 z ai4 #1 iong1 b ao4 uo3 #4 。
000003,b ao2 m a3 #1 p ei4 g ua4 #1 b o3 l uo2 an1 #3 , d iao1 ch an2 #1 van4 zh en3 #2 d ong3 ueng1 t a4 #4 。

tacotron2

  • Step (0): python frozen_tacotron2.py, you can freeze the model.
  • Step (1): python gen_serving_model_tacotron2.py , convert pb to savemodel.

melgan

  • Step (0): python melgan2onnx.py, use MyRefPad1d() instead of nn.ReflectionPad1d() , convert pt to onnx.
  • Step (1): python onnx2trt.py , convert onnx to trt , trt support dynamic input shape.

Training and Inference:

gta:

  • Step (0): python preprocess.py ,process the audios for t2 and melgan training .
  • Step (1): python train_tacotron.py ,while finish the t2 training, it will generate gta data .
  • Step (2): cd melgan, cp audio-xxx.npy and mel-xxx.npy(gta data) to melgan's training/validing data-path .
  • Step (3): python train_melgan.py .

real mel:

  • Step (0): python preprocess.py ,process the audios for t2 and melgan training .
  • Step (1): cp audio-xxx.npy and mel-xxx.npy(real mel) to melgan's training/validing data-path .
  • Step (2): python train_tacotron.py .
  • Step (3): python train_melgan.py ,train the melgan with the real mel data.

also ,run inference_melgan.py if you only interested in vocoder .

reference:

https://github.com/Rayhane-mamah/Tacotron-2
https://github.com/seungwonpark/melgan

tacotron-2_melgan's People

Contributors

wqt2019 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.