Comments (2)
It will be available by the end of this month. As for ASMR TTS, it probably needs more than the current framework of StyleTTS (or StyleTTS 2), because it is mostly unvoiced whisper (so F0 and energy do not make too much sense here). You may want to look for papers working on whisper speech synthesis and see if you can bring some ideas from there.
from styletts2.
What I infer is that duration accuracy is crucial for capturing the speech style; the model does not need to reproduce the exact whispered speech.
My experiments with fine-tuning StyleTTS (with PL-BERT) on different datasets show that, with the CE loss enabled, the duration loss drops to 0.2 (from 1.2) during second-stage training, at the cost of distorting all the other losses. I haven't yet run inference with all of these checkpoints, but my goal is to match the speech duration to the ground truth; the voice can be changed afterwards through a separate pipeline, since StyleTTS inference is so fast.
With the CE loss off, the best mel_loss I got was ~0.23 during the first stage. I vary the learning rate between 0.00005 and 0.0001 depending on the dataset size. I'm sharing all this because I want to get as close as possible to the ideal setup for the kind of speech I am trying to generate.
Also, I am keeping each fine-tuning dataset at 1000-2000 clips.
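To make the two duration terms mentioned above concrete, here is a minimal sketch in plain Python. It assumes the duration predictor outputs one logit per duration bin for each token, that the predicted duration is the sum of the sigmoids, and that the CE target sets the first `gt` bins to 1. The function name `duration_losses` and this exact formulation are assumptions for illustration, not code taken from the StyleTTS training scripts.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def duration_losses(logits, gt_durations):
    """Hypothetical helper: return (L1 duration loss, binary CE loss).

    logits:       list of per-token lists, one logit per duration bin
    gt_durations: list of ground-truth durations in frames (ints)
    """
    l1_total, ce_total, n_bins = 0.0, 0.0, 0
    for token_logits, gt in zip(logits, gt_durations):
        probs = [sigmoid(z) for z in token_logits]
        # Predicted duration: expected number of "active" bins.
        l1_total += abs(sum(probs) - gt)
        for k, p in enumerate(probs):
            target = 1.0 if k < gt else 0.0
            # Clamp to keep log() finite for saturated probabilities.
            p = min(max(p, 1e-7), 1.0 - 1e-7)
            ce_total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
            n_bins += 1
    return l1_total / len(gt_durations), ce_total / n_bins


# Confident, correct bins: both terms are close to zero.
sharp_l1, sharp_ce = duration_losses([[20.0, 20.0, -20.0, -20.0]], [2])
# Undecided bins: the summed duration is still 2, but the CE term is high.
flat_l1, flat_ce = duration_losses([[0.0, 0.0, 0.0, 0.0]], [2])
```

The second call shows why the two terms can move differently: the L1 term only looks at the total duration per token, while the CE term also penalises how that total is spread over the bins.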