danruta / xva-trainer

UI app for training TTS/VC machine learning models for xVASynth, with several audio pre-processing tools, and dataset creation/management.

Python 70.99% HTML 1.32% JavaScript 10.06% Batchfile 0.02% CSS 0.59% Cuda 6.54% C++ 10.43% Shell 0.06%

xva-trainer's Issues

Instructions for running from source

As the title says. I'm pretty sure this is an Electron app, but I have no idea which of these files to start with.

Training started, no error but not working

It has been stuck there for a long time, and nothing is happening.

Error While Training

Settings:

Output:

18:41:18 | New Session 
18:41:18 | No graphs.json file found. Starting anew. 
18:41:18 | Dataset: C:/Program Files (x86)/Steam/steamapps/common/xVATrainer/resources/app/datasets//rdfvd_paimon 
18:41:18 | Language: English 
18:41:18 | Checkpoint: ./resources/app/python/xvapitch/pretrained_models/xVAPitch_5820651.pt 
18:41:18 | CUDA device IDs: 0 
18:41:18 | FP16: Disabled 
18:41:18 | Batch size: 6 (Base: 6, GPUs mult: 1) | GAM: 67 -> (402) | Target: 400 
18:41:18 | Outputting model backups every 3 checkpoints 
18:41:19 | Loading model and optimizer state from ./resources/app/python/xvapitch/pretrained_models/xVAPitch_5820651.pt 
18:41:20 | New voice 
18:41:20 | Workers: 3 
18:41:38 | Fine-tune dataset files: 7 
18:45:00 | Priors datasets files: 179007 | Number of datasets: 28

Error:

Traceback (most recent call last):
  File "server.py", line 227, in handleTrainingLoop
  File "python\xvapitch\xva_train.py", line 137, in handleTrainer
  File "python\xvapitch\xva_train.py", line 557, in start
  File "python\xvapitch\xva_train.py", line 604, in iteration
  File "python\xvapitch\xva_train.py", line 391, in init
  File "C:\Program Files (x86)\Steam\steamapps\common\xVATrainer\.\resources\app\python\xvapitch\get_dataset_emb.py", line 18, in get_emb
    kmeans = KMeans(n_clusters=n_clusters, random_state=0).fit(embs)
  File "sklearn\cluster\_kmeans.py", line 1376, in fit
    self._check_params(X)
  File "sklearn\cluster\_kmeans.py", line 1307, in _check_params
    super()._check_params(X)
  File "sklearn\cluster\_kmeans.py", line 828, in _check_params
    raise ValueError(
ValueError: n_samples=7 should be >= n_clusters=10.
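
The traceback above ends with scikit-learn rejecting the `KMeans` fit because the fine-tune dataset has 7 files while 10 clusters were requested. Below is a minimal sketch of one possible guard, assuming the cluster count can be lowered without affecting the rest of the pipeline (the source does not confirm this). `safe_n_clusters` is a hypothetical helper, not part of xVATrainer.

```python
def safe_n_clusters(n_samples: int, requested: int = 10) -> int:
    """Return a cluster count that KMeans will accept for n_samples rows.

    Hypothetical helper: scikit-learn's KMeans requires
    n_samples >= n_clusters, so the requested count is capped.
    """
    if n_samples < 1:
        raise ValueError("at least one embedding is needed to cluster")
    return min(requested, n_samples)


# Assumed usage inside a function like get_emb (names taken from the traceback):
#   n_clusters = safe_n_clusters(len(embs), n_clusters)
#   kmeans = KMeans(n_clusters=n_clusters, random_state=0).fit(embs)
print(safe_n_clusters(7))    # the failing case from the log: 7 files
print(safe_n_clusters(500))  # larger datasets keep the requested count
```

Going by the error message alone, a dataset with at least 10 files would also avoid this exception without any code change.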

Allow manually saving checkpoints

Given that it can take quite some time to train 2500 steps, it would be convenient to be able to create checkpoints manually so that training can resume from them.

Speaker diarization in 1.2.0+ does not work

UI error message:

ERROR:Traceback (most recent call last):
  File "server.py", line 200, in websocket_handler
  File "python\make_srt\model.py", line 91, in make_srt
KeyError: 'diarization'

server.log of version 1.2.0

Traceback (most recent call last):
  File "python\models_manager.py", line 37, in init_model
  File "python\speaker_diarization\model.py", line 26, in __init__
  File "python\speaker_diarization\model.py", line 129, in load_model
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pyannote\audio\features\__init__.py", line 33, in <module>
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pyannote\audio\features\base.py", line 38, in <module>
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pyannote\database\__init__.py", line 37, in <module>
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pyannote\database\database.py", line 31, in <module>
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pyannote\database\util.py", line 32, in <module>
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pandas\__init__.py", line 22, in <module>
    from pandas.compat import (
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pandas\compat\__init__.py", line 15, in <module>
    from pandas.compat.numpy import (
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pandas\compat\numpy\__init__.py", line 7, in <module>
    from pandas.util.version import Version
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pandas\util\__init__.py", line 1, in <module>
    from pandas.util._decorators import (  # noqa
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pandas\util\_decorators.py", line 14, in <module>
    from pandas._libs.properties import cache_readonly  # noqa
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pandas\_libs\__init__.py", line 13, in <module>
    from pandas._libs.interval import Interval
  File "pandas\_libs\interval.pyx", line 1, in init pandas._libs.interval
  File "pandas\_libs\hashtable.pyx", line 1, in init pandas._libs.hashtable
  File "pandas\_libs\missing.pyx", line 1, in init pandas._libs.missing
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pandas\_libs\tslibs\__init__.py", line 31, in <module>
    from pandas._libs.tslibs.conversion import (
  File "pandas\_libs\tslibs\conversion.pyx", line 1, in init pandas._libs.tslibs.conversion
ModuleNotFoundError: No module named 'pandas._libs.tslibs.base'
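
In a PyInstaller-frozen build, a `ModuleNotFoundError` for a compiled submodule like this often indicates that the module was not bundled, although the log alone does not prove that. The sketch below is a small diagnostic that could be run with the app's bundled interpreter to check whether a module can be located. `missing_modules` is a hypothetical helper, and the module name is copied from the log above.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of dotted module names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is itself missing or malformed.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module name copied from the ModuleNotFoundError in the log above.
    print(missing_modules(["pandas._libs.tslibs.base"]))
```

If the module is reported missing, one option when rebuilding from source is to declare it with PyInstaller's `--hidden-import` option. Whether that applies to this project's build is an assumption.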

server.log of version 1.2.1

Traceback (most recent call last):
  File "python\models_manager.py", line 37, in init_model
  File "python\speaker_diarization\model.py", line 26, in __init__
  File "python\speaker_diarization\model.py", line 129, in load_model
ImportError: cannot import name 'Pretrained' from 'pyannote.audio.features' (C:\Program Files (x86)\Steam\steamapps\common\xVATrainer\resources\app\cpython_gpu\pyannote\audio\features\__init__.pyc)

Improve training quality / synthesis

Hi, I see that the PRIORS datasets are synthetic. I plan to replace them with more natural-sounding datasets (synthetic and/or human). Will this help produce better, more natural pronunciation and intonation in voice synthesis?

I also plan to improve the ARPAbet handling, since it has a problem with the Spanish double R ("RR") and does not read accents.
Is it possible to improve the dictionary?

I have a mod I made some time ago for Tacotron for the Spanish language, but I don't know if it can be implemented here. I leave the link to the files below. Thanks!

https://drive.google.com/file/d/19AGqgfWiMc8MYHuH_705phCbm8-GGIDa/view?usp=drive_link

Steam release 1.2.0 seems to be broken

Multiple issues. Checking data files through Steam didn't show any error. Cleaning up the dataset didn't help.

I got this stack trace after starting a new training from scratch:

Traceback (most recent call last):
  File "server.py", line 227, in handleTrainingLoop
  File "python\xvapitch\xva_train.py", line 137, in handleTrainer
  File "python\xvapitch\xva_train.py", line 554, in start
  File "python\xvapitch\xva_train.py", line 601, in iteration
  File "python\xvapitch\xva_train.py", line 377, in init
  File "python\xvapitch\xva_train.py", line 1206, in setup_dataloaders
  File "C:\Program Files (x86)\Steam\steamapps\common\xVATrainer\.\resources\app\python\xvapitch\util.py", line 410, in get_language_weighted_sampler
    return WeightedRandomSampler(dataset_samples_weight, len(dataset_samples_weight))
  File "torch\utils\data\sampler.py", line 186, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0

The line numbers there don't match up with xva_train.py, and changes made to that file for debugging are completely ignored, whereas changes to e.g. dataset.py work fine. Throwing an exception in read_datasets shows that, at least at one point, it returns the correct dataset.

The UI is still broken when adding new trainings: it seems the list of trainings must be cleared before a new training can be added.
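
The `ValueError` in the traceback above comes from `WeightedRandomSampler(dataset_samples_weight, len(dataset_samples_weight))`, so the weights list was empty at that call. Below is a minimal sketch of a guard that would fail earlier with a more descriptive message. `require_nonempty_weights` and the `dataset_name` parameter are hypothetical, and why the list ends up empty is not established by the report.

```python
def require_nonempty_weights(weights, dataset_name="dataset"):
    """Return the weights as a list, or raise a descriptive error if empty.

    Hypothetical helper: torch's WeightedRandomSampler rejects
    num_samples=0, so an empty list is reported before it gets there.
    """
    weights = list(weights)
    if not weights:
        raise ValueError(
            f"{dataset_name}: no samples were found, so a weighted "
            "sampler cannot be built; check that the dataset loaded"
        )
    return weights


# Assumed usage before building the sampler (names taken from the traceback):
#   w = require_nonempty_weights(dataset_samples_weight, "fine-tune data")
#   return WeightedRandomSampler(w, len(w))
print(require_nonempty_weights([0.5, 0.25, 0.25]))
```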

Running Whisper from GPU

Hello! Is it possible to run Whisper on the GPU? When transcribing audio, I see that it uses my CPU and takes forever. I haven't found an option to switch Whisper from CPU to GPU. I have an RTX 3060 Ti with CUDA installed. Thank you.

"Only latest" number input value is lost after restart

Just a minor thing, but the input becomes empty after restarting the app.
