I tried to train the tacotron model you have on top of the LJ pretrained checkpoint yo


Floating point exception (core dumped) about wavernn HOT 10 CLOSED

fatchord commented on July 17, 2024

Floating point exception (core dumped)

from wavernn.

Comments (10)

fatchord commented on July 17, 2024

@ZohaibAhmed Unfortunately I don't see the same error on my end - can you do me a small favor? If you have an IDE with breakpoints can you check which function is causing that in gen_tacotron.py (should be somewhere in the loop starting on line 91)?

If you don't have breakpoints, you can just add print('a', flush=True), print('b', flush=True) and so on after each function in that loop to see what's throwing the error.
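A hard crash like this can kill the interpreter before buffered output is written, so plain prints may get lost. Here is a minimal sketch of two things that might help, using only the Python standard library: `faulthandler` to dump the Python traceback on a fatal signal, and a flushed marker print. The helper names `enable_crash_traceback` and `checkpoint` are made up for illustration and are not part of the repository.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Dump the Python traceback to `stream` on fatal signals
    (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL).

    The stream needs a real file descriptor; stderr is used by default.
    """
    target = stream if stream is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()


def checkpoint(label):
    """Hypothetical helper: print a marker and flush it immediately so it
    is already written if the next line crashes the process."""
    print(f"reached {label}", flush=True)
    return label


# Possible usage inside the generation loop (step names are placeholders):
# enable_crash_traceback()
# checkpoint("a"); out = step_one(x)
# checkpoint("b"); out = step_two(out)
```

Running the script as `python -X faulthandler gen_tacotron.py` should enable the same traceback dump without touching the code.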

Thanks.


ZohaibAhmed commented on July 17, 2024

Looks like the issue is in the vocoder's generate function in fatchord_wavernn, specifically when it calls:

h1 = rnn1(x, h1)

Note that using the pretrained model out of the box seems to work. The error only occurs when I train the model further.

More details about my setup:

ubuntu16.04
pytorch=1.0.0
cuda10.0
cudnn7.4.1_1
GPU: RTX 2080 Ti


fatchord commented on July 17, 2024

@ZohaibAhmed can I get the exact steps you went through to get that error? Have you tried training a fresh model for a couple of epochs and then tried generating?

Also is there no other error message besides "Floating point exception (core dumped)"?


ZohaibAhmed commented on July 17, 2024

@fatchord - training a model from scratch seems to work.

The exact steps I did were as follows:

1. Take your pretrained models.
2. Get a different dataset and run the preprocessor on it (the dataset is structured exactly like LJ). A sample file:

   Input File     : '100.wav'
   Channels       : 1
   Sample Rate    : 22050
   Precision      : 16-bit
   Duration       : 00:00:03.42 = 75411 samples ~ 256.5 CDDA sectors
   File Size      : 151k
   Bit Rate       : 353k
   Sample Encoding: 16-bit Signed Integer PCM

3. Run train_tacotron.py for a bit.
4. Run gen_tacotron.py after the first checkpoint (I made it after 500 steps instead of the default).
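To rule out a format mismatch across the whole dataset rather than one file, a check along these lines could be run before preprocessing. This is only a sketch using the standard-library `wave` module. The function names are hypothetical, and the expected values (mono, 22050 Hz, 16-bit) are taken from the file info quoted in the steps above.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate, bits_per_sample, n_frames) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels(),
            wav.getframerate(),
            wav.getsampwidth() * 8,  # sample width is reported in bytes
            wav.getnframes(),
        )


def matches_expected_format(path, channels=1, sample_rate=22050, bits=16):
    """True if the clip has the channel count, sample rate, and precision
    quoted in the steps above."""
    ch, sr, b, _ = describe_wav(path)
    return (ch, sr, b) == (channels, sample_rate, bits)
```

Looping `matches_expected_format` over every `.wav` in the dataset folder and printing the paths that return False would flag any clip that differs.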

And that's how I get to that error. Even if I keep the pretrained WaveRNN model, it still results in the Floating point exception (core dumped). There's no other stack trace.


fatchord commented on July 17, 2024

@ZohaibAhmed can you try training LJ from scratch to see if you get the same error?


ZohaibAhmed commented on July 17, 2024

@fatchord training Tacotron from scratch makes it work. But I don't have enough data for my own dataset to effectively train the model.

Have you had any success with fine-tuning?

EDIT: the main issue seems to be that the decoder is producing all silent values

It looks like the shape of the output from the original pretrained model is different from what I get when I train on top of it:

Original:
torch.Size([1, 80, 338])

Tuned:
torch.Size([1, 80, 1])

Looks like I hit the condition that stops generation when all frames are silent:

if (mel_frames < -3.8).all() : break
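Whatever the underlying cause of the crash, a one-frame spectrogram is not useful input for the vocoder, so a guard like the one below could turn the core dump into a readable error. This is a sketch under assumptions: `check_mel_shape` is a hypothetical helper, and `min_frames=2` is an arbitrary threshold, not a value from the repository.

```python
def check_mel_shape(shape, n_mels=80, min_frames=2):
    """Raise a readable error if the spectrogram is too short to vocode.

    `shape` is (batch, n_mels, n_frames), e.g. tuple(mel.shape) for a tensor.
    `min_frames` is an assumed threshold chosen for illustration.
    """
    batch, mels, frames = shape
    if mels != n_mels:
        raise ValueError(f"expected {n_mels} mel channels, got {mels}")
    if frames < min_frames:
        raise ValueError(
            f"Tacotron produced only {frames} frame(s); "
            "generation probably stopped on the silence check"
        )
    return frames
```

With the shapes reported above, `check_mel_shape((1, 80, 338))` passes and returns the frame count, while `check_mel_shape((1, 80, 1))` raises a `ValueError` before the vocoder is ever called.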

This is what the alignment plot looks like while training tacotron:


candlewill commented on July 17, 2024

@ZohaibAhmed I ran into the same error. The reason is that the first frame of mel_frames is all silence (< -3.8), so generation stops immediately and the Tacotron output is effectively empty. You could fix that by using the following code:

if (mel_frames < -3.8).all() and i > 10 : break
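To illustrate the difference between the two stop conditions, here is a simplified pure-Python simulation. It is not the repository's decoder loop: `generate` is a hypothetical stand-in, and each frame is just a list of mel values instead of a tensor.

```python
SILENCE_THRESHOLD = -3.8  # threshold quoted in the stop condition


def generate(frames, min_steps=None):
    """Collect frames until one is entirely below the silence threshold.

    `frames` stands in for the decoder's successive outputs.
    min_steps=None mirrors the original check (stop on any silent frame);
    min_steps=10 mirrors the suggested `and i > 10` variant.
    """
    output = []
    for i, mel_frame in enumerate(frames):
        output.append(mel_frame)
        all_silent = all(value < SILENCE_THRESHOLD for value in mel_frame)
        if all_silent and (min_steps is None or i > min_steps):
            break
    return output


silent = [-4.0, -4.0, -4.0]
voiced = [0.0, 0.0, 0.0]
# A silent first frame, twelve voiced frames, a trailing silent frame, then more audio.
frames = [silent] + [voiced] * 12 + [silent] + [voiced]

print(len(generate(frames)))                # original check stops after the first frame
print(len(generate(frames, min_steps=10)))  # fixed check runs until the later silent frame
```

In this toy input the original condition keeps a single frame, which matches the [1, 80, 1] shape reported earlier, while the variant with the step guard keeps going until the silent frame after the voiced section.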


fatchord commented on July 17, 2024

@candlewill Nice catch, I'll push a fix for that later today.


ZohaibAhmed commented on July 17, 2024

@candlewill - I still largely get silence (with some static). Did you try to train your model on top of the checkpoint that @fatchord provided? Or did you just train it from scratch?


fatchord commented on July 17, 2024

Tacotron has been updated to fix the premature stopping of generation.

