mmvc_trainer's Issues

It would be great if you could share a Dockerfile

Thank you for publishing such a wonderful project!

I would like to try training with WSL2 + Docker. If you already have a Dockerfile, would you be able to share it?

Thank you in advance!

Running on WSL

I could not run it because of the following error:
torch.multiprocessing.spawn.ProcessExitedException: process 1 terminated with signal SIGSEGV
Does it still only work on Colab?

Questions about MMVC_Trainer

I have some questions about MMVC_Trainer.

(1) G_180000.pth and D_180000.pth
In the fine_model folder, there are two model files, G_180000.pth and D_180000.pth.
What is G_180000.pth for?
What is D_180000.pth for?

(2) G_latest_99999999.pth and D_latest_99999999.pth
In logs/20220306_24000, there are two model files, G_latest_99999999.pth and D_latest_99999999.pth.
What kind of training is done for G_latest_99999999.pth?
What kind of training is done for D_latest_99999999.pth?

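Error at the end of recording in 00_Rec_Voice.ipynb

After clicking "Open in Colab" in the README and running 00_Clone_Repo.ipynb, I ran 00_Rec_Voice.ipynb to record my voice, and the error below occurred when recording finished.
It seems that waveplot in librosa.display is a method that was removed in librosa 0.9. Does this mean that a newer version than intended is being installed unintentionally?

Note: replacing waveplot with waveshow seems to make it work for now, though I am not sure whether the resulting plot is what was intended.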

えっ嘘でしょ。
えっうそでしょ。
---終了---
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-12-65ce3032714f> in <cell line: 1>()
----> 1 rec(3, "emotion001", "えっ嘘でしょ。", "えっうそでしょ。")

<ipython-input-10-0d9a847134a7> in rec(sec, filename, text, hira)
     72   with open(mytext_dir + filename + ".txt", 'w') as mytext:
     73     mytext.write(hira)
---> 74   librosa.display.waveplot(speecht, sr=rate)
     75   plt.show()
     76   display(Audio(speecht, rate=rate))

AttributeError: module 'librosa.display' has no attribute 'waveplot'
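
The workaround mentioned in this report (calling waveshow instead of waveplot) could be made version-tolerant with a small lookup. This is only a sketch under assumptions: `pick_wave_plotter` is a hypothetical helper, not part of librosa or this repository, and the `SimpleNamespace` objects are stand-ins for `librosa.display` so the example runs without librosa installed.

```python
from types import SimpleNamespace


def pick_wave_plotter(display_module):
    """Return display_module.waveshow if present, otherwise waveplot.

    Hypothetical helper: it only inspects attribute names and does not
    depend on any particular librosa version.
    """
    for name in ("waveshow", "waveplot"):
        fn = getattr(display_module, name, None)
        if callable(fn):
            return fn
    raise AttributeError("neither waveshow nor waveplot is available")


# Stand-ins for librosa.display in a newer and an older installation.
newer = SimpleNamespace(waveshow=lambda y, sr: "waveshow")
older = SimpleNamespace(waveplot=lambda y, sr: "waveplot")

used_on_newer = pick_wave_plotter(newer)([0.0], sr=24000)
used_on_older = pick_wave_plotter(older)([0.0], sr=24000)
print(used_on_newer, used_on_older)
```

In the notebook, the failing line would then become something like `pick_wave_plotter(librosa.display)(speecht, sr=rate)`, assuming both functions accept the same two arguments used there.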

Characters to be studied

In this repository, which of the following characters is to be studied?
(1) source character
(2) target character
(3) both source character and target character

Questions about "03_MMVC_Interface.ipynb"

I ran 03_MMVC_Interface.ipynb and have some questions about it.

(1) SOURCE_SPEAKER_ID
SOURCE_SPEAKER_ID is preset to 107.
I'd like to use models trained on many source speakers.
How do I set the ID numbers for them?

(2) TARGET_ID
TARGET_ID is preset to 100.
I'd like to use models trained on many target speakers.
How do I set the ID numbers for them?

(3) TARGET_ID trained model
The source speaker's trained model is saved in the log folder.
Where should I put the target speaker's trained model?

Error when running train_ms.py in Train_MMVC.ipynb

When I ran the following cell of Train_MMVC.ipynb on Google Colab:

!python train_ms.py -c configs/jsontest.json -m 20220311_24000 -fg fine_model/G_232000.pth -fd fine_model/D_232000.pth

I got the following error.

[INFO] {'train': {'log_interval': 1000, 'eval_interval': 4000, 'seed': 1234, 'epochs': 10000, 'learning_rate': 0.0002, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 16, 'fp16_run': True, 'lr_decay': 0.999875, 'segment_size': 4096, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0}, 'data': {'training_files': 'filelists/jsontest_textful.txt', 'validation_files': 'filelists/jsontest_textful_val.txt', 'training_files_notext': 'filelists/jsontest_textless.txt', 'validation_files_notext': 'filelists/jsontest_val_textless.txt', 'text_cleaners': ['japanese_cleaners'], 'max_wav_value': 32768.0, 'sampling_rate': 24000, 'filter_length': 1024, 'hop_length': 256, 'win_length': 1024, 'n_mel_channels': 80, 'mel_fmin': 0.0, 'mel_fmax': None, 'add_blank': True, 'n_speakers': 104, 'cleaned_text': False}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [8, 8, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 8, 4, 4], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256}, 'fine_flag': True, 'fine_model_g': 'fine_model/G_232000.pth', 'fine_model_d': 'fine_model/D_232000.pth', 'model_dir': './logs/20220311_24000'}
0it [00:00, ?it/s]
0it [00:00, ?it/s]
[INFO] FineTuning : True
[INFO] Load model : fine_model/G_232000.pth
[INFO] Load model : fine_model/D_232000.pth
Traceback (most recent call last):
  File "train_ms.py", line 303, in <module>
    main()
  File "train_ms.py", line 53, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 119, in join
    raise Exception(msg)
Exception: 

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
    fn(i, *args)
  File "/content/drive/MyDrive/MMVC_Trainer/train_ms.py", line 108, in run
    _, _, _, epoch_str = utils.load_checkpoint(hps.fine_model_g, net_g, optim_g)
  File "/content/drive/MyDrive/MMVC_Trainer/utils.py", line 38, in load_checkpoint
    model.module.load_state_dict(new_state_dict)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1045, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for SynthesizerTrn:
	size mismatch for emb_g.weight: copying a param with shape torch.Size([106, 256]) from checkpoint, the shape in current model is torch.Size([104, 256]).

The URL of the notebook I ran is below. I hope it is of some help 🙇
https://colab.research.google.com/drive/1VWYkTNjftG3MeCSdgesiPgw5NIE9E0WN?authuser=1
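
The traceback above reports emb_g.weight with 106 rows in the checkpoint, while the logged config has 'n_speakers': 104. A pre-flight comparison along the following lines might surface that mismatch before training starts. It is a sketch under assumptions: `check_speaker_embedding` is a hypothetical helper, the idea that the first dimension of emb_g.weight corresponds to n_speakers is inferred from the log rather than confirmed by the repository, and reading the shapes out of the .pth file is left out because the checkpoint layout is not shown in the report.

```python
def check_speaker_embedding(config, checkpoint_shapes, key="emb_g.weight"):
    """Compare data.n_speakers with the row count of the speaker embedding.

    Returns None when they agree, otherwise a short message.
    Assumption: the first dimension of `key` is the number of speakers.
    """
    expected = config["data"]["n_speakers"]
    found = checkpoint_shapes[key][0]
    if expected == found:
        return None
    return (f"{key}: checkpoint has {found} rows, "
            f"config expects n_speakers={expected}")


# Values copied from the log and traceback in this report.
config = {"data": {"n_speakers": 104}}
checkpoint_shapes = {"emb_g.weight": (106, 256)}

message = check_speaker_embedding(config, checkpoint_shapes)
print(message)
```

Whether the right fix is to change n_speakers in the config or to use a different fine_model checkpoint is not something the report settles.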

ONNX standalone

Hi! Thanks for the amazing open source work!

I was looking through onnx_export.py and onnx_bench.py and I was wondering how to run it end to end in a standalone Colab notebook.

Specifically, how do we replace dummy_specs = torch.rand(1, 257, 60) with an MP3/WAV audio file (of variable length) converted to a torch Tensor, and run it through the ONNX-converted checkpoint? (Is the conversion done by an rmvpe model? I'm really new to speech model architectures, so I'm not sure.)

Thanks
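
As a rough aid for reasoning about the dummy tensor's shape: 257 frequency bins would be consistent with a one-sided spectrum from a 512-point FFT (512 / 2 + 1), and the last axis would then be the number of STFT frames, which grows with the audio length. The sketch below only does that arithmetic. The values n_fft=512 and hop_length=128, the centered-frame convention, and the helper name `expected_spec_shape` are all assumptions for illustration, not settings confirmed by onnx_export.py.

```python
def expected_spec_shape(num_samples, n_fft=512, hop_length=128):
    """Shape (batch, freq_bins, frames) of a one-sided, centered STFT.

    Assumptions: n_fft and hop_length are illustrative defaults, and the
    frame count follows the centered convention 1 + num_samples // hop_length.
    """
    freq_bins = n_fft // 2 + 1
    frames = 1 + num_samples // hop_length
    return (1, freq_bins, frames)


one_second_at_24k = expected_spec_shape(24000)
sixty_frames = expected_spec_shape(59 * 128)
print(one_second_at_24k, sixty_frames)
```

Under these assumed settings, the 60 frames in the dummy tensor would correspond to about 7,552 samples, so a real recording would simply produce a longer last axis.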

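Failed to create MMVC_Trainer configuration files

When I ran the ipynb that creates the MMVC_Trainer configuration files on Google Colab, the error below appeared at the end of the output of cell 4 (the cell that creates the config files), and no file other than baseconfig is generated.

... (omitted)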
['らーてゃん。']
WARNING: JPCommonLabel_insert_pause() in jpcommon_label.c: First mora should not be short pause.
sil-r-a-a-ty-a-N-sil
dataset/textful/00_myvoice/wav/emotion099.wav|0|sil-r-a-a-ty-a-N-sil
Errordataset/textful/01_target/wav に音声データがありません
(Translation: Error: there is no audio data in dataset/textful/01_target/wav)

Output of the check cell (cell 5):

Directory: filelists


Directory: configs
baseconfig.json
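
Since the error message points at dataset/textful/01_target/wav containing no audio, a quick count of .wav files per speaker folder before running cell 4 may help confirm the cause. This is a sketch under assumptions: `count_wavs` is a hypothetical helper, the folder names are taken from the log above, and the temporary directory only imitates that layout so the example runs anywhere.

```python
import tempfile
from pathlib import Path


def count_wavs(root, speakers=("00_myvoice", "01_target")):
    """Count *.wav files under <root>/<speaker>/wav for each speaker folder."""
    root = Path(root)
    return {s: len(list((root / s / "wav").glob("*.wav"))) for s in speakers}


# Imitate the layout from the log: one recording for 00_myvoice,
# and an empty wav folder for 01_target.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "dataset" / "textful"
    (root / "00_myvoice" / "wav").mkdir(parents=True)
    (root / "01_target" / "wav").mkdir(parents=True)
    (root / "00_myvoice" / "wav" / "emotion099.wav").touch()

    counts = count_wavs(root)
    empty = [name for name, n in counts.items() if n == 0]

print(counts)
print(empty)
```

On Colab, the same function could be pointed at the real dataset/textful folder; any speaker listed in `empty` would be a candidate for the missing audio data.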

No way to train voice for use in Colab notebook

I've followed the tutorials to set up MMVC and everything installed, but the tutorial videos never explain how to generate the .json files needed to then train with Colab. I've looked in forums with no luck. Is there something I'm missing, or is that information region-locked to Japan? I really want to get this software working; I don't want to have to buy a new computer and GPU to use the original w-okada version.
