diff-svc-notebooks's People

Contributors

archivoice, haru0l, mlo7ghinsan

diff-svc-notebooks's Issues

If you get this error: 'HifiGAN' object has no attribute 'device'

#27
I got this error as well. What worked for me was adding this line in spec2wav:

self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

just before device = self.device, and moving
/content/diff-svc/checkpoints/checkpoints/0109_hifigan_bigpopcs_hop128 to /content/diff-svc/checkpoints/0109_hifigan_bigpopcs_hop128
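
The directory move can also be done from a notebook cell. This is only a minimal sketch, assuming the archive was extracted one level too deep (the checkpoints/checkpoints nesting shown above). The helper name flatten_nested_dir is made up for illustration and is not part of diff-svc.

```python
import shutil
from pathlib import Path


def flatten_nested_dir(root: str, name: str) -> Path:
    """Move root/<root's own name>/name up to root/name if it was nested by mistake."""
    root_path = Path(root)
    nested = root_path / root_path.name / name  # e.g. checkpoints/checkpoints/<name>
    target = root_path / name                   # e.g. checkpoints/<name>
    # Only move when the nested copy exists and nothing is already at the target.
    if nested.is_dir() and not target.exists():
        shutil.move(str(nested), str(target))
    return target


if __name__ == "__main__":
    # Paths taken from the comment above; adjust them for your own setup.
    flatten_nested_dir("/content/diff-svc/checkpoints", "0109_hifigan_bigpopcs_hop128")
```

The function does nothing when the nested folder is absent, so it should be safe to re-run.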

Step 5: Training AttributeError: 'HifiGAN' object has no attribute 'device'

use_spk_id: False, use_split_spk_id: False, use_uv: False, use_var_enc: False, use_vec: False,
val_check_interval: 1000, valid_num: 0, valid_set_name: valid, validate: False, vocoder: network.vocoders.hifigan.HifiGAN,
vocoder_ckpt: checkpoints/0109_hifigan_bigpopcs_hop128, warmup_updates: 2000, wav2spec_eps: 1e-6, weight_decay: 0, win_size: 512,
work_dir: checkpoints/huji,
| Mel losses: {'ssim': 0.5, 'l1': 0.5}
12/14 03:50:58 AM gpu available: True, used: True
| load 'model' from '/content/diff-svc/pretrain/opencpop.ckpt'.
| model Trainable Parameters: 39.915M
Validation sanity check: 0% 0/1 [00:00<?, ?batch/s]
sample time step: 0% 0/100 [00:00<?, ?it/s]
sample time step: 4% 4/100 [00:00<00:02, 32.05it/s]
sample time step: 8% 8/100 [00:00<00:02, 31.75it/s]
sample time step: 12% 12/100 [00:00<00:02, 33.70it/s]
sample time step: 16% 16/100 [00:00<00:02, 34.73it/s]
sample time step: 20% 20/100 [00:00<00:02, 35.29it/s]
sample time step: 24% 24/100 [00:00<00:02, 35.64it/s]
sample time step: 28% 28/100 [00:00<00:02, 35.94it/s]
sample time step: 32% 32/100 [00:00<00:01, 35.98it/s]
sample time step: 36% 36/100 [00:01<00:01, 36.20it/s]
sample time step: 40% 40/100 [00:01<00:01, 36.23it/s]
sample time step: 44% 44/100 [00:01<00:01, 36.18it/s]
sample time step: 48% 48/100 [00:01<00:01, 36.21it/s]
sample time step: 52% 52/100 [00:01<00:01, 36.24it/s]
sample time step: 56% 56/100 [00:01<00:01, 36.24it/s]
sample time step: 60% 60/100 [00:01<00:01, 36.30it/s]
sample time step: 64% 64/100 [00:01<00:00, 36.21it/s]
sample time step: 68% 68/100 [00:01<00:00, 36.35it/s]
sample time step: 72% 72/100 [00:02<00:00, 36.35it/s]
sample time step: 76% 76/100 [00:02<00:00, 36.29it/s]
sample time step: 80% 80/100 [00:02<00:00, 36.30it/s]
sample time step: 84% 84/100 [00:02<00:00, 36.29it/s]
sample time step: 88% 88/100 [00:02<00:00, 36.29it/s]
sample time step: 92% 92/100 [00:02<00:00, 36.24it/s]
sample time step: 96% 96/100 [00:02<00:00, 36.26it/s]
sample time step: 100% 100/100 [00:02<00:00, 35.89it/s]
Traceback (most recent call last):
  File "run.py", line 15, in <module>
    run_task()
  File "run.py", line 11, in run_task
    task_cls.start()
  File "/content/diff-svc/training/task/base_task.py", line 234, in start
    trainer.fit(task)
  File "/content/diff-svc/utils/pl_utils.py", line 495, in fit
    self.run_pretrain_routine(model)
  File "/content/diff-svc/utils/pl_utils.py", line 571, in run_pretrain_routine
    self.evaluate(model, self.get_val_dataloaders(), self.num_sanity_val_steps, self.testing)
  File "/content/diff-svc/utils/pl_utils.py", line 1192, in evaluate
    output = self.evaluation_forward(model,
  File "/content/diff-svc/utils/pl_utils.py", line 1316, in evaluation_forward
    output = model.validation_step(*args)
  File "/content/diff-svc/training/task/SVC_task.py", line 155, in validation_step
    self.plot_wav(batch_idx, sample['mels'], model_out['mel_out'], is_mel=True, gt_f0=gt_f0, f0=pred_f0)
  File "/content/diff-svc/training/task/SVC_task.py", line 218, in plot_wav
    gt_wav = self.vocoder.spec2wav(gt_wav, f0=gt_f0)
  File "/content/diff-svc/network/vocoders/hifigan.py", line 63, in spec2wav
    device = self.device
AttributeError: 'HifiGAN' object has no attribute 'device'

How to resume training

I saved the checkpoint from step 14000 to Drive using Step 6: Package Model.

I then tried to continue training by unzipping the archive into /content/diff-svc/checkpoints, but training starts from step 0 (the global step is also 0). How do I continue from step 14000?
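
One thing worth checking is whether the unzipped checkpoint actually ended up in the folder the config uses as work_dir (for example checkpoints/huji in a log elsewhere on this page). The sketch below is a hedged diagnostic, not a confirmed fix. It assumes checkpoints are named like model_ckpt_steps_14000.ckpt (the pattern seen in the pe_ckpt entry of a log on this page) and that the trainer looks in work_dir when resuming, which is an assumption. The helper latest_checkpoint_step is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed filename pattern, e.g. model_ckpt_steps_14000.ckpt
CKPT_PATTERN = re.compile(r"model_ckpt_steps_(\d+)\.ckpt$")


def latest_checkpoint_step(work_dir: str) -> Optional[int]:
    """Return the highest step number found directly inside work_dir, or None."""
    folder = Path(work_dir)
    if not folder.is_dir():
        return None
    steps = []
    for path in folder.glob("*.ckpt"):
        match = CKPT_PATTERN.search(path.name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


if __name__ == "__main__":
    # Replace with the work_dir printed in your own Hparams dump.
    print(latest_checkpoint_step("checkpoints/huji"))
```

If this prints None, the checkpoint is probably sitting in a different (possibly nested) folder than the one training reads from.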

ModuleNotFoundError: No module named 'parselmouth'

Trying to manually install it throws:

Collecting parselmouth
  Using cached parselmouth-1.1.1.tar.gz (33 kB)
  Preparing metadata (setup.py) ... done
Collecting googleads==3.8.0 (from parselmouth)
  Using cached googleads-3.8.0.tar.gz (23 kB)
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
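
The log above shows pip resolving the name parselmouth to a package that pulls in googleads, which is unrelated to pitch extraction. As far as I know, the module imported as parselmouth for Praat is published on PyPI under the name praat-parselmouth. A small sketch of a check that prints the install command to try; the helper install_hint and the PIP_NAME table are made up for illustration.

```python
import importlib.util

# Import name -> PyPI distribution name, for cases where the two differ.
PIP_NAME = {"parselmouth": "praat-parselmouth"}


def install_hint(module_name: str) -> str:
    """Return '' if the top-level module is importable, else a pip command to try."""
    if importlib.util.find_spec(module_name) is not None:
        return ""
    return f"pip install {PIP_NAME.get(module_name, module_name)}"


if __name__ == "__main__":
    hint = install_hint("parselmouth")
    print(hint or "parselmouth is already importable")
```

If the wrong parselmouth package was already installed, uninstalling it first is probably a good idea.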

Training notebook TypeError

Today I attempted to use the training notebook, and everything was going alright until I got to the cell labeled "Step 5: Training". Upon running it, I get the following error:

Traceback (most recent call last):
  File "run.py", line 15, in <module>
    run_task()
  File "run.py", line 11, in run_task
    task_cls.start()
  File "/content/diff-svc/training/task/base_task.py", line 234, in start
    trainer.fit(task)
  File "/content/diff-svc/utils/pl_utils.py", line 489, in fit
    self.optimizers, self.lr_schedulers = self.init_optimizers(model.configure_optimizers())
  File "/content/diff-svc/training/task/base_task.py", line 174, in configure_optimizers
    optm = self.build_optimizer(self.model)
  File "/content/diff-svc/training/task/SVC_task.py", line 65, in build_optimizer
    weight_decay=hparams['weight_decay'])
  File "/usr/local/lib/python3.7/dist-packages/torch/optim/adamw.py", line 78, in __init__
    if not 0.0 <= lr:
TypeError: '<=' not supported between instances of 'float' and 'str'
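
The last frame shows AdamW comparing 0.0 <= lr while lr is a str, so the learning rate reached the optimizer as text. The config is not shown, so the cause is a guess: one common way this happens is a YAML value such as 8e-4, which PyYAML loads as a string because it has no decimal point. A minimal sketch of converting such values before the optimizer is built; coerce_float_hparams is a hypothetical helper, not part of diff-svc.

```python
from typing import Any, Dict, Iterable


def coerce_float_hparams(hparams: Dict[str, Any], keys: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of hparams with the given keys converted to float."""
    fixed = dict(hparams)
    for key in keys:
        if key in fixed:
            # float() raises ValueError if the text is not numeric at all.
            fixed[key] = float(fixed[key])
    return fixed


# Example values only; "8e-4" stands in for whatever string ended up in lr.
hparams = {"lr": "8e-4", "weight_decay": 0}
hparams = coerce_float_hparams(hparams, ["lr", "weight_decay"])
assert 0.0 <= hparams["lr"]  # the comparison that failed in the traceback now works
```

Writing the value with a decimal point in the config (0.0008 or 8.0e-4) should avoid the problem at the source.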

Can I switch to a pretrained model after 40k steps?

Hi, I trained for 40k steps without using a pretrained model. Now I want to use the nyaru pretrained model and continue training to 100k steps. Will this make the voice better, or do I have to start from 0 again? Thank you.

Error: HifiGAN model file is not found!

I'm having problems in step 5 of the training notebook.
I'm trying to train a voice from scratch, using 44.1 kHz samples.

Complete output of step 5:

/content/diff-svc
| Hparams chains: ['/content/diff-svc/training/config_nsf.yaml']
| Hparams:
K_step: 1000, accumulate_grad_batches: 1, audio_num_mel_bins: 128, audio_sample_rate: 44100, binarization_args: {'shuffle': False, 'with_align': True, 'with_f0': True, 'with_hubert': True, 'with_spk_embed': False, 'with_wav': False},
binarizer_cls: preprocessing.SVCpre.SVCBinarizer, binary_data_dir: data/binary/dross, check_val_every_n_epoch: 10, choose_test_manually: False, clip_grad_norm: 1,
config_path: training/config_nsf.yaml, content_cond_steps: [], cwt_add_f0_loss: False, cwt_hidden_size: 128, cwt_layers: 2,
cwt_loss: l1, cwt_std_scale: 0.8, datasets: ['opencpop'], debug: False, dec_ffn_kernel_size: 9,
dec_layers: 4, decay_steps: 20000, decoder_type: fft, dict_dir: , diff_decoder_type: wavenet,
diff_loss_type: l2, dilation_cycle_length: 4, dropout: 0.1, ds_workers: 4, dur_enc_hidden_stride_kernel: ['0,2,3', '0,2,3', '0,1,3'],
dur_loss: mse, dur_predictor_kernel: 3, dur_predictor_layers: 5, enc_ffn_kernel_size: 9, enc_layers: 4,
encoder_K: 8, encoder_type: fft, endless_ds: False, f0_bin: 256, f0_max: 1100.0,
f0_min: 40.0, ffn_act: gelu, ffn_padding: SAME, fft_size: 2048, fmax: 16000,
fmin: 40, fs2_ckpt: , gaussian_start: True, gen_dir_name: , gen_tgt_spk_id: -1,
hidden_size: 256, hop_size: 512, hubert_gpu: True, hubert_path: checkpoints/hubert/hubert_soft.pt, infer: False,
keep_bins: 128, lambda_commit: 0.25, lambda_energy: 0.0, lambda_f0: 1.0, lambda_ph_dur: 0.3,
lambda_sent_dur: 1.0, lambda_uv: 1.0, lambda_word_dur: 1.0, load_ckpt: , log_interval: 100,
loud_norm: False, lr: 0.0008, max_beta: 0.02, max_epochs: 3000, max_eval_sentences: 1,
max_eval_tokens: 60000, max_frames: 42000, max_input_tokens: 60000, max_sentences: 12, max_tokens: 128000,
max_updates: 1000000, mel_loss: ssim:0.5|l1:0.5, mel_vmax: 1.5, mel_vmin: -6.0, min_level_db: -120,
no_fs2: True, norm_type: gn, num_ckpt_keep: 10, num_heads: 2, num_sanity_val_steps: 1,
num_spk: 1, num_test_samples: 0, num_valid_plots: 10, optimizer_adam_beta1: 0.9, optimizer_adam_beta2: 0.98,
out_wav_norm: False, pe_ckpt: checkpoints/0102_xiaoma_pe/model_ckpt_steps_60000.ckpt, pe_enable: False, perform_enhance: True, pitch_ar: False,
pitch_enc_hidden_stride_kernel: ['0,2,5', '0,2,5', '0,2,5'], pitch_extractor: parselmouth, pitch_loss: l2, pitch_norm: log, pitch_type: frame,
pndm_speedup: 10, pre_align_args: {'allow_no_txt': False, 'denoise': False, 'forced_align': 'mfa', 'txt_processor': 'zh_g2pM', 'use_sox': True, 'use_tone': False}, pre_align_cls: data_gen.singing.pre_align.SingingPreAlign, predictor_dropout: 0.5, predictor_grad: 0.1,
predictor_hidden: -1, predictor_kernel: 5, predictor_layers: 5, prenet_dropout: 0.5, prenet_hidden_size: 256,
pretrain_fs_ckpt: , processed_data_dir: xxx, profile_infer: False, raw_data_dir: data/raw/dross, ref_norm_layer: bn,
rel_pos: True, reset_phone_dict: True, residual_channels: 384, residual_layers: 20, save_best: False,
save_ckpt: True, save_codes: ['configs', 'modules', 'src', 'utils'], save_f0: True, save_gt: False, schedule_type: linear,
;33;mseed: 1234, ;33;msort_by_len: True, ;33;mspeaker_id: dross, ;33;mspec_max: [0.2816964089870453, 0.6110045313835144, 0.7528443932533264, 0.7719852328300476, 0.7578747868537903, 0.7870495319366455, 0.928855836391449, 0.915518581867218, 0.9106525182723999, 1.052038311958313, 1.0322246551513672, 0.9403936266899109, 1.0780105590820312, 1.022165298461914, 0.98377925157547, 1.0139509439468384, 1.0601212978363037, 0.9910836219787598, 0.9987587332725525, 0.8733547925949097, 0.8284812569618225, 0.8044165968894958, 0.8117375373840332, 0.7631716132164001, 0.8004911541938782, 0.8732689023017883, 0.8700592517852783, 0.837287425994873, 0.8866966366767883, 0.8396021127700806, 0.819175660610199, 0.9263454079627991, 0.880441427230835, 0.8278772234916687, 0.8070288300514221, 0.82593834400177, 0.9075272679328918, 0.7374939918518066, 0.7339229583740234, 0.5838717222213745, 0.7390212416648865, 0.5914533138275146, 0.6568486094474792, 0.7018999457359314, 0.650595486164093, 0.7557802200317383, 0.6265286803245544, 0.6484942436218262, 0.547179639339447, 0.5296093821525574, 0.40601256489753723, 0.37959158420562744, 0.4374527037143707, 0.3697531819343567, 0.30621394515037537, 0.3554210066795349, 0.3598262369632721, 0.3712518513202667, 0.216549813747406, 0.30987581610679626, 0.3893497586250305, 0.2443387508392334, 0.24721182882785797, 0.4849996268749237, 0.4686632752418518, 0.15373729169368744, 0.189516082406044, 0.1884053349494934, 0.16127777099609375, 0.3267746567726135, 0.22321538627147675, 0.12231604009866714, 0.19100888073444366, 0.06677097827196121, 0.15172165632247925, 0.004269076976925135, 0.07318542897701263, 0.0790969505906105, 0.045008596032857895, -0.0033863128628581762, -0.07382304221391678, -0.06529872864484787, -0.06318709254264832, -0.16331058740615845, -0.2981128394603729, -0.37530261278152466, -0.4302491545677185, -0.35962632298469543, -0.06664707511663437, -0.009034375660121441, -0.07002700865268707, -0.17129261791706085, -0.13444754481315613, -0.04389636218547821, 
0.16330686211585999, -0.029020613059401512, -0.2405114322900772, -0.287506639957428, -0.23881807923316956, -0.22608397901058197, -0.3683353662490845, -0.4233062267303467, -0.40162914991378784, -0.3776197135448456, -0.39424625039100647, -0.4183795750141144, -0.599024772644043, -0.6727768182754517, -0.6512080430984497, -0.6985474824905396, -0.7823814749717712, -0.7961130738258362, -0.8495840430259705, -0.8956512212753296, -0.9007495045661926, -0.8376040458679199, -0.978445291519165, -0.9590984582901001, -1.0561996698379517, -1.038326621055603, -1.0919842720031738, -0.9782500267028809, -0.8888759016990662, -1.0536094903945923, -1.132426142692566, -1.1358226537704468, -1.2419252395629883, -1.0913069248199463], ;33;mspec_min: [-4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, 
-4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999879837036133, -4.999994277954102, -4.984480857849121, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102, -4.999994277954102],
spk_cond_steps: [], stop_token_weight: 5.0, task_cls: training.task.SVC_task.SVCTask, test_ids: [], test_input_dir: ,
test_num: 0, test_prefixes: ['test'], test_set_name: test, timesteps: 1000, train_set_name: train,
use_crepe: True, use_denoise: False, use_energy_embed: False, use_gt_dur: False, use_gt_f0: False,
use_midi: False, use_nsf: True, use_pitch_embed: True, use_pos_embed: True, use_spk_embed: False,
use_spk_id: False, use_split_spk_id: False, use_uv: False, use_var_enc: False, use_vec: False,
val_check_interval: 1000, valid_num: 0, valid_set_name: valid, validate: False, vocoder: network.vocoders.nsf_hifigan.NsfHifiGAN,
vocoder_ckpt: checkpoints/nsf_hifigan/model, warmup_updates: 2000, wav2spec_eps: 1e-6, weight_decay: 0, win_size: 2048,
work_dir: /content/drive/MyDrive/diff-svc/dross,
| Mel losses: {'ssim': 0.5, 'l1': 0.5}
Error: HifiGAN model file is not found!
12/07 12:58:18 PM gpu available: True, used: True
| model Trainable Parameters: 33.709M
Validation sanity check: 0% 0/1 [00:00<?, ?batch/s]
sample time step: 0% 0/100 [00:00<?, ?it/s]
sample time step: 6% 6/100 [00:00<00:01, 52.70it/s]
sample time step: 12% 12/100 [00:00<00:01, 50.63it/s]
sample time step: 21% 21/100 [00:00<00:01, 64.14it/s]
sample time step: 29% 29/100 [00:00<00:01, 67.10it/s]
sample time step: 37% 37/100 [00:00<00:00, 69.55it/s]
sample time step: 46% 46/100 [00:00<00:00, 73.67it/s]
sample time step: 54% 54/100 [00:00<00:00, 73.50it/s]
sample time step: 62% 62/100 [00:00<00:00, 74.58it/s]
sample time step: 71% 71/100 [00:00<00:00, 77.50it/s]
sample time step: 79% 79/100 [00:01<00:00, 75.66it/s]
sample time step: 88% 88/100 [00:01<00:00, 77.15it/s]
sample time step: 100% 100/100 [00:01<00:00, 72.19it/s]
Traceback (most recent call last):
  File "run.py", line 15, in <module>
    run_task()
  File "run.py", line 11, in run_task
    task_cls.start()
  File "/content/diff-svc/training/task/base_task.py", line 234, in start
    trainer.fit(task)
  File "/content/diff-svc/utils/pl_utils.py", line 495, in fit
    self.run_pretrain_routine(model)
  File "/content/diff-svc/utils/pl_utils.py", line 571, in run_pretrain_routine
    self.evaluate(model, self.get_val_dataloaders(), self.num_sanity_val_steps, self.testing)
  File "/content/diff-svc/utils/pl_utils.py", line 1192, in evaluate
    output = self.evaluation_forward(model,
  File "/content/diff-svc/utils/pl_utils.py", line 1316, in evaluation_forward
    output = model.validation_step(*args)
  File "/content/diff-svc/training/task/SVC_task.py", line 155, in validation_step
    self.plot_wav(batch_idx, sample['mels'], model_out['mel_out'], is_mel=True, gt_f0=gt_f0, f0=pred_f0)
  File "/content/diff-svc/training/task/SVC_task.py", line 218, in plot_wav
    gt_wav = self.vocoder.spec2wav(gt_wav, f0=gt_f0)
  File "/content/diff-svc/network/vocoders/nsf_hifigan.py", line 48, in spec2wav
    if self.h.sampling_rate != hparams['audio_sample_rate']:
AttributeError: 'NsfHifiGAN' object has no attribute 'h'

I'm using the latest notebook; I did not have this problem with the previous one.
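
The log prints "Error: HifiGAN model file is not found!" well before the traceback, so it seems likely (though not confirmed here) that the vocoder never loaded and self.h was therefore never set. A quick pre-flight check for the files named in the hparams dump can narrow this down. This is a sketch only; missing_paths is a hypothetical helper, and the two paths are copied from the vocoder_ckpt and hubert_path entries above.

```python
from pathlib import Path
from typing import List


def missing_paths(repo_root: str, relative_paths: List[str]) -> List[str]:
    """Return the entries of relative_paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in relative_paths if not (root / rel).exists()]


if __name__ == "__main__":
    # Paths copied from the Hparams dump in this issue.
    required = ["checkpoints/nsf_hifigan/model", "checkpoints/hubert/hubert_soft.pt"]
    for rel in missing_paths("/content/diff-svc", required):
        print(f"missing: {rel}")
```

If checkpoints/nsf_hifigan/model is reported missing, the vocoder download or unzip step is the first place to look.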
