The vitacotron2 from toperi-nguyen

vitacotron2's Introduction

Vietnamese Speech Synthesis with End-to-End Model and Text Normalization

2020 7th NAFOSTED Conference on Information and Computer Science (NICS)

Speech synthesis systems are becoming smarter and more natural thanks to the power of deep neural networks. However, each language has different phonological and contextual characteristics. We therefore conducted experiments, gathered statistics, and applied Vietnamese phonetics to improve speech synthesis systems based on the Tacotron2 neural network. Our methods achieve 97% accuracy on the text normalization task, and the synthesized speech obtains a MOS of 3.97, approaching the 4.43 of directly recorded voices. We also provide a library for normalizing Vietnamese text, called Vinorm, and a package that converts text into a phonetic format, called Viphoneme. The phonetic output is used as input to the end-to-end neural network, making the synthesis process faster, more intelligent, and more natural than with character inputs.
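For readers unfamiliar with the MOS figures quoted above, here is a minimal sketch of how a mean opinion score is commonly computed: listeners rate each utterance on a 1 to 5 scale, and the ratings are averaged. The function name `mean_opinion_score`, the normal-approximation confidence interval, and the sample ratings are illustrative assumptions. They are not taken from the paper or from this repository.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for a list of 1-5 listener ratings.

    Hypothetical helper for illustration; not part of vitacotron2.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("each rating must lie in [1, 5]")
    mos = statistics.mean(ratings)
    if len(ratings) < 2:
        # A single rating gives no spread estimate.
        return mos, 0.0
    # Normal approximation: 1.96 times the standard error of the mean.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Made-up ratings for demonstration only, not data from the paper.
ratings = [4, 4, 5, 3, 4, 4, 5, 3]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean helps when comparing two systems, since a gap such as 3.97 versus 4.43 is easier to interpret when the spread of listener ratings is known.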

DEMO
