
ppg_tacotron's People

Contributors

ljy-m, zeng-yifei

ppg_tacotron's Issues

Starting from scratch

You have had success with this where many others have failed, as the issues in the original repo show. It might be a long shot, but would you consider putting together a very basic video showing how you get this repo working from scratch? Your instructions are clearer than those of other related repos, but they still require significant prior knowledge of these systems to follow.

Applying it to Chinese

I want to apply this project to Chinese. I'm working on a dataset in the same format as TIMIT.
What do the first two numbers on each line of a PHN file represent? Are they the timestamps of the phoneme?
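For reference, in the standard TIMIT layout each line of a .PHN file is `begin_sample end_sample phoneme`, where the two numbers are sample indices into the 16 kHz waveform rather than times in seconds. The sketch below parses that layout and converts the indices to seconds. The function names and the three example lines are illustrative assumptions, not code or data from this repo.

```python
from typing import List, NamedTuple

TIMIT_SAMPLE_RATE = 16000  # TIMIT audio is sampled at 16 kHz


class PhoneSegment(NamedTuple):
    start_sample: int  # index of the first sample of the phoneme
    end_sample: int    # index where the phoneme ends
    label: str         # phoneme label, e.g. "sh"


def parse_phn(text: str) -> List[PhoneSegment]:
    """Parse the text of a TIMIT-style .PHN file into segments."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, end, label = parts
        segments.append(PhoneSegment(int(start), int(end), label))
    return segments


def to_seconds(sample_index: int, sample_rate: int = TIMIT_SAMPLE_RATE) -> float:
    """Convert a sample index to a time in seconds."""
    return sample_index / sample_rate


# Illustrative lines in the TIMIT layout (values chosen for the example).
example = "0 3050 h#\n3050 4559 sh\n4559 5723 ix\n"
segments = parse_phn(example)
for seg in segments:
    print(seg.label, to_seconds(seg.start_sample), to_seconds(seg.end_sample))
```

A Chinese dataset that follows the same convention would need its phoneme boundaries expressed as sample indices at whatever sample rate its audio uses.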

Customized Voice Conversion

After training for enough epochs, the result of imitating CMU ARCTIC is finally becoming more and more satisfying!

But when it comes to creating a customized voice converter, I have no idea which wav properties I should preserve when making my own "CMU ARCTIC"-style dataset. Do I need to keep every property exactly the same as in the CMU dataset, such as a wav duration of 2 seconds?

Also, I saw the issue about applying this project to Mandarin. If I use pypinyin, do I need a bigger dataset than TIMIT because of the greater variety of pinyin? And which library should I use to get the duration of every word so that I can make a PHN file like TIMIT's? (This is the question that confuses me most.)

spec2wav is too slow

spec2wav is too slow.
I ran this project on a GTX 1080 Ti GPU with an Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, and this step takes a long time.
My input is a 3-second audio clip. It takes 15 seconds to process.

Is there any way to optimize it?

A Quantitative Question About Training.

I have successfully run the whole process, from training net1 and net2 through to conversion.
But after training net1 for 15000 iterations and net2 for 15000 iterations, the converted result is still inaudible.
Can you share, from your experiments, roughly how many iterations net1 and net2 need before the result becomes acceptable?

net1 training
Loss : [0.629394], Accuracy : [0.792945]
net2 training
Loss : [0.009412], Loss_spec : [0.006773], Loss_mel : [0.002639]

Opening a .wav file in audio_operation.py, line 162, raises an error

When I use the 'open' function to read a .wav file like this:

open('/content/data/dataset/arctic/bdl/arctic_a0001.wav',encoding='utf-8').read().splitlines()

an error occurs, indicating a decoding failure:

'utf-8' codec can't decode byte 0x86 in position 4: invalid start byte

Is that due to a different Python version? I'm wondering why this error happens.
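One likely cause, independent of the Python version: a .wav file is binary, so opening it in text mode with `encoding='utf-8'` fails as soon as a byte is not valid UTF-8. Bytes 4 to 7 of a RIFF/WAVE file hold the chunk size as a little-endian integer, which would explain a failure at position 4. The sketch below reproduces the error on a small generated file and reads the same file with the standard-library `wave` module instead. The helper names and the temporary file are assumptions for illustration; what line 162 of audio_operation.py actually expects as input is not shown here.

```python
import os
import tempfile
import wave


def write_example_wav(path, n_frames=1600, sample_rate=16000):
    """Write a small mono 16-bit PCM wav file used only for this demo."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x86\x00" * n_frames)


def text_read_fails(path):
    """Return True if reading the file as UTF-8 text raises a decode error."""
    try:
        with open(path, encoding="utf-8") as f:
            f.read()
    except UnicodeDecodeError:
        return True
    return False


def read_wav(path):
    """Read a wav file as binary audio: sample rate, frame count, raw PCM bytes."""
    with wave.open(path, "rb") as wf:
        n_frames = wf.getnframes()
        return wf.getframerate(), n_frames, wf.readframes(n_frames)


tmp_dir = tempfile.mkdtemp()
wav_path = os.path.join(tmp_dir, "example.wav")
write_example_wav(wav_path)

failed = text_read_fails(wav_path)       # text mode cannot decode the binary data
rate, n_frames, pcm = read_wav(wav_path)  # binary read via the wave module works
print(failed, rate, n_frames, len(pcm))
```

If the code at that line is meant to read a text file listing wav paths, the fix would instead be to pass it that list file rather than a wav file.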
