daswer123 / xtts-webui

Webui for using XTTS and for finetuning it

License: MIT License

Batchfile 0.10% Python 99.71% CSS 0.06% Shell 0.13%
coqui finetuning tts xtts xttsv2

xtts-webui's Introduction

XTTS-WebUI

Portable version

The project now has a portable version, so you don't have to go to the trouble of installing all the dependencies.

Click here to download

All you need to run it is Windows and an Nvidia graphics card with 6 GB of video memory.

The Train tab is broken; if you want to train a model, use the separate fine-tuning webui.

The readme is available in the following languages:

English

Russian

Português

About the Project

XTTS-WebUI is a web interface that allows you to make the most of XTTS. The interface also integrates other neural networks that can improve your results, and you can fine-tune the model to obtain a high-quality voice.


Key Features

  • Easy to work with XTTSv2
  • Batch processing for dubbing large numbers of files
  • Translate any audio while preserving the original voice
  • Automatically improve results using neural networks and audio tools
  • Fine-tune the model and use it immediately
  • Use tools such as RVC, OpenVoice, and Resemble Enhance, together or separately
  • Customize XTTS generation: all parameters, multiple speaker samples

TODO

  • Add a status bar with progress and error information
  • Integrate training into the standard interface
  • Add streaming output so the result can be checked as it is generated
  • Add a new way to process text for voiceover
  • Add the ability to customize speakers during batch processing
  • Add an API

Installation

Use this web UI through Google Colab

Please ensure you have Python 3.10.x or 3.11, CUDA 11.8 or CUDA 12.1, Microsoft Build Tools 2019 with the C++ workload, and ffmpeg installed.

Method 1: via scripts

Windows

To get started:

  • Run the install.bat file
  • To start the web UI, run start_xtts_webui.bat
  • Open your preferred browser and go to the local address displayed in the console.

Linux

To get started:

  • Run the install.sh file
  • To start the web UI, run start_xtts_webui.sh
  • Open your preferred browser and go to the local address displayed in the console.

Method 2: manual installation

Follow these steps for installation:

  1. Ensure that CUDA is installed

  2. Clone the repository: git clone https://github.com/daswer123/xtts-webui

  3. Navigate into the directory: cd xtts-webui

  4. Create a virtual environment: python -m venv venv

  5. Activate the virtual environment:

    • On Windows use: venv\scripts\activate
    • On Linux use: source venv/bin/activate
  6. Install PyTorch and torchaudio with this pip command:

    pip install torch==2.1.1+cu118 torchaudio==2.1.1+cu118 --index-url https://download.pytorch.org/whl/cu118

  7. Install all dependencies from requirements.txt:

    pip install -r requirements.txt

Running The Application

To launch the interface please follow these steps:

Starting XTTS WebUI :

Activate your virtual environment:

venv\scripts\activate

or if you're on Linux,

source venv/bin/activate

Then start the webui for xtts by running this command:

python app.py

Here are some runtime arguments that can be used when starting the application:

| Argument | Default Value | Description |
|---|---|---|
| -hs, --host | 127.0.0.1 | The host to bind to |
| -p, --port | 8010 | The port number to listen on |
| -d, --device | cuda | Which device to use (cpu or cuda) |
| -sf, --speaker_folder | speakers/ | Directory containing TTS samples |
| -o, --output | output/ | Output directory |
| -l, --language | auto | WebUI language; the available translations are listed in the i18n/locale folder |
| -ms, --model-source | local | Model source: 'api' to use the latest version from the repository with API inference, or 'local' to use local inference with model v2.0.2 |
| -v, --version | v2.0.2 | Which version of XTTS to use. To use a custom model, put its folder in models/ and pass the folder name here |
| --lowvram | | Enable low-VRAM mode, which moves the model to RAM when not actively processing |
| --deepspeed | | Enable DeepSpeed acceleration. Works on Windows with Python 3.10 and 3.11 |
| --share | | Allow access to the interface from outside the local computer |
| --rvc | | Enable RVC post-processing; all models should be located in the rvc folder |
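As a rough illustration of how the flags above fit together, here is a hypothetical reconstruction of the CLI with argparse; the real app.py may define its arguments differently, so treat the names and defaults as read straight from the table, not from the code:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Sketch of the launcher CLI based on the argument table above.
    p = argparse.ArgumentParser(description="XTTS-WebUI launcher (sketch)")
    p.add_argument("-hs", "--host", default="127.0.0.1", help="The host to bind to")
    p.add_argument("-p", "--port", type=int, default=8010, help="The port number to listen on")
    p.add_argument("-d", "--device", default="cuda", choices=["cpu", "cuda"])
    p.add_argument("-sf", "--speaker_folder", default="speakers/")
    p.add_argument("-o", "--output", default="output/")
    p.add_argument("-l", "--language", default="auto")
    p.add_argument("-ms", "--model-source", default="local", choices=["api", "local"])
    p.add_argument("-v", "--version", default="v2.0.2")
    p.add_argument("--lowvram", action="store_true")
    p.add_argument("--deepspeed", action="store_true")
    p.add_argument("--share", action="store_true")
    p.add_argument("--rvc", action="store_true")
    return p
```

For example, `python app.py --port 9000 --rvc` would then listen on port 9000 with RVC post-processing enabled.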

TTS -> RVC

RVC module: you can enable RVC to post-process the generated audio. To do this, add the --rvc flag when running from the console, or write it into the startup file.

For a model to appear in the RVC settings, you must first upload it to the voice2voice/rvc folder. Each model must be in its own subfolder, with the model file and its index file together; the index file is optional.
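The folder convention described above can be sketched as a small discovery routine. This is an illustrative assumption about how such a scan might work, not the project's actual code; the function name `find_rvc_models` is hypothetical:

```python
from pathlib import Path

def find_rvc_models(root: str) -> dict:
    """Scan an RVC model folder laid out as described above:
    one subfolder per model, containing a .pth model file and an
    optional .index file. Returns {model_name: (pth_path, index_path_or_None)}."""
    models = {}
    for folder in Path(root).iterdir():
        if not folder.is_dir():
            continue
        pths = sorted(folder.glob("*.pth"))
        if not pths:
            continue  # no model file: not a valid model folder
        indexes = sorted(folder.glob("*.index"))
        models[folder.name] = (pths[0], indexes[0] if indexes else None)
    return models
```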

Differences between xtts-webui and the official webui

Data processing

  1. Updated faster-whisper to 0.10.0, with the option to select the large-v3 model.
  2. Changed the output folder to an output folder inside the main folder.
  3. If a dataset already exists in the output folder and you want to add new data, you can do so by simply adding new audio: the existing files will not be processed again, and the new data is added automatically.
  4. Enabled the VAD filter.
  5. After the dataset is created, a file recording the dataset's language is written. This file is read before training so that the language always matches, which is convenient when you restart the interface.
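Point 3 above (extending a dataset without reprocessing existing audio) can be sketched as follows. This is an illustrative guess at the mechanism, assuming an LJSpeech-style `path|text|...` metadata file; the real webui may track processed files differently:

```python
from pathlib import Path

def new_audio_to_process(audio_dir: str, metadata_csv: str) -> list:
    """Return audio files not yet referenced in the dataset metadata,
    so an existing dataset can be extended without reprocessing."""
    done = set()
    meta = Path(metadata_csv)
    if meta.exists():
        for line in meta.read_text(encoding="utf-8").splitlines():
            if "|" in line:  # LJSpeech-style row: audio_path|text|...
                done.add(Path(line.split("|")[0]).stem)
    # Only audio whose stem is absent from the metadata needs processing.
    return [p for p in sorted(Path(audio_dir).glob("*.wav"))
            if p.stem not in done]
```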

Fine-tuning XTTS Encoder

  1. Added the ability to select the base model for XTTS; when re-training, the model does not need to be downloaded again.
  2. Added the ability to select a custom model as the base model during training, which lets you fine-tune an already fine-tuned model.
  3. Added one-click export of an optimized version of the model (step 2.5; the optimized version is placed in the output folder).
  4. You can choose whether to delete the training folders after the model has been optimized.
  5. When the model is optimized, the example reference audio is moved to the output folder.
  6. The specified language is checked against the dataset language.

Inference

  1. Added the ability to customize inference settings while checking the model.

Other

  1. If you accidentally restart the interface during one of the steps, you can reload the data using the additional buttons.
  2. Removed the log display, as it caused problems after restarts.
  3. The finished result is copied to the ready folder; these are fully finished files that you can move anywhere and use as a standard model.
  4. Added support for Japanese.

xtts-webui's People

Contributors

daswer123, rafaelgodoyebert


xtts-webui's Issues

KeyError: 'ja' when I use a Japanese wav file to finetune

My wav file was converted to mono, 22050 Hz, 16-bit PCM beforehand. I got this error log:


Existing language matches target language
Loading Whisper Model!
Discarding ID3 tags because more suitable tags were found.
Traceback (most recent call last):
File "D:\Long\AI\Audio\xtts-webui\xtts_finetune_webui.py", line 246, in preprocess_dataset
train_meta, eval_meta, audio_total_size = format_audio_list(audio_path, whisper_model = whisper_model, target_language=language, out_path=out_path, gradio_progress=progress)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Long\AI\Audio\xtts-webui\scripts\utils\formatter.py", line 160, in format_audio_list
sentence = multilingual_cleaners(sentence, target_language)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Long\AI\Audio\xtts-webui\venv\Lib\site-packages\TTS\tts\layers\xtts\tokenizer.py", line 558, in multilingual_cleaners
text = expand_numbers_multilingual(text, lang)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Long\AI\Audio\xtts-webui\venv\Lib\site-packages\TTS\tts\layers\xtts\tokenizer.py", line 538, in expand_numbers_multilingual
text = re.sub(_ordinal_re[lang], lambda m: _expand_ordinal(m, lang), text)
~~~~~~~~~~~^^^^^^
KeyError: 'ja'


I got the same error locally and on Colab, so maybe something is wrong with the Japanese settings?
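The traceback shows that the tokenizer's number-expansion tables are dictionaries keyed by language code, and `'ja'` has no entry, so the lookup raises KeyError. A minimal sketch of the failure mode and one possible defensive workaround (the table contents and the guard function `expand_ordinals_safe` are illustrative, not the library's actual code):

```python
import re

# Illustration only: a per-language ordinal table like the tokenizer's,
# with no "ja" entry -- indexing _ordinal_re["ja"] raises KeyError.
_ordinal_re = {
    "en": re.compile(r"\b(\d+)(st|nd|rd|th)\b"),
    # ... other languages, but no "ja" key
}

def expand_ordinals_safe(text: str, lang: str) -> str:
    """Hypothetical guard: skip expansion for languages without a table
    instead of crashing, one way this bug could be worked around."""
    pattern = _ordinal_re.get(lang)
    if pattern is None:
        return text  # no table for this language: leave text unchanged
    return pattern.sub(lambda m: m.group(1), text)
```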

Conflict in requirements

When installing the requirements

ERROR: Cannot install -r requirements.txt (line 1), -r requirements.txt (line 11), -r requirements.txt (line 21), -r requirements.txt (line 25), -r requirements.txt (line 8) and numpy==1.22.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested numpy==1.22.0
    gradio 4.10.0 depends on numpy~=1.0
    resampy 0.4.2 depends on numpy>=1.17
    langid 1.1.5 depends on numpy
    pedalboard 0.5.10 depends on numpy
    tts 0.22.0 depends on numpy>=1.24.3; python_version > "3.10"

model.pth fails to download correctly

Stats show

2024-02-29 08:18:10.509 | INFO     | xtts_webui:<module>:57 - Start loading model v2.0.2
[XTTS] Downloading config.json...
  0%| 0.00/68.0 [00:00<?, ?iB/s]
100%| 68.0/68.0 [00:00<00:00, 67.9kiB/s]
[XTTS] Downloading model.pth...
  0%| 0.00/68.0 [00:00<?, ?iB/s]
100%| 68.0/68.0 [00:00<?, ?iB/s]
[XTTS] Downloading vocab.json...
  0%| 0.00/68.0 [00:00<?, ?iB/s]
100%| 68.0/68.0 [00:00<00:00, 19.8kiB/s]

But the model.pth is only 1kb.
Then the model load fails and the script quits.
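A 1 kB model.pth is almost certainly a failed download (e.g. an HTML error page saved in place of the checkpoint). One way to catch this before loading is a simple size sanity check; the function and the threshold below are illustrative assumptions, not part of the project:

```python
from pathlib import Path

def looks_complete(path: str, min_bytes: int = 1_000_000) -> bool:
    """Heuristic sanity check before loading a downloaded checkpoint:
    a real XTTS model.pth is far larger than a few bytes, so a tiny
    file indicates a failed download that should be deleted and
    retried. The threshold here is an illustrative assumption."""
    p = Path(path)
    return p.exists() and p.stat().st_size >= min_bytes
```

Deleting the truncated model.pth and re-running the download usually resolves this class of failure.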

YouTube link in the Train tab

There is a link to a YouTube tutorial video on how to train a model; sadly, the link is dead and the video is no longer available.

Custom model.

Can you tell me what needs to be changed to load a custom model? When trying to load one, incorrect weights appear; the code is probably not adapted to loading custom models.

Fine tuning broken on latest update

Just did a git pull to make sure it's all up to date, then tried to fine-tune a model (I haven't fully figured out how to use them, only load them, but I'm getting them trained and ready).
However, clicking step one results in this error:

H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\utils.py:848: UserWarning: Expected maximum 5 arguments for function <function preprocess_dataset at 0x000002E2843D4310>, received 6.
  warnings.warn(
Running on local URL:  http://localhost:8011

To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\queueing.py", line 459, in call_prediction
    output = await route_utils.call_process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1533, in process_api
    result = await self.call_function(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1151, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
TypeError: preprocess_dataset() takes from 4 to 5 positional arguments but 7 were given
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\queueing.py", line 459, in call_prediction
    output = await route_utils.call_process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1533, in process_api
    result = await self.call_function(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1151, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
TypeError: preprocess_dataset() takes from 4 to 5 positional arguments but 7 were given

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\queueing.py", line 497, in process_events
    response = await self.call_prediction(awake_events, batch)
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\queueing.py", line 468, in call_prediction
    raise Exception(str(error) if show_error else None) from error
Exception: None

Noticed everything's set up for V2V

It's in the bottom left-hand corner. In fact, I jumped out of my seat for a moment, having yet to use that feature, after spending a few days looking for a good V2V program to run locally and finding nothing viable. There's an xtts/rvc 'pipeline' out there, which is terribly broken. I think V2V is the next step.

Hopefully via another process besides RVC, which seems to be the finicky 'tortoise' to coqui's XTTS, though the second part of that analogy still holds (no widely adopted solution has been found).

Perhaps RVC can be fixed, but anyhow, since you already have the architecture set up for V2V, I'd look into it a bit if you ever get a moment in your busy life.

I've made a number of large changes to the UI by now (small conveniences, info on settings, etc.), and I may fork it, or I can just send it to you; it doesn't matter to me.

RVC Error

The main app works as intended but RVC postprocessing gives this error:

2023-12-26 16:20:11.406 | INFO     | scripts.tts_funcs:local_generation:306 - Processing time: 44.92 seconds.
['scripts/rvc/test_infer.py', '0', 'output/output_(1)_11Labs - MALE - Matthew - British, Calm, Audiobook.wav', 'D:\\XTTSAdvancedWEBUI\\LATESTwithRVC\\xtts-webui\\rvc\\11Labs - MALE - Matthew - British, Calm, Audiobook\\added_IVF173_Flat_nprobe_1_11Labs - MALE - Matthew - British, Calm, Audiobook_v2.index', 'rmvpe', 'D:\\XTTSAdvancedWEBUI\\LATESTwithRVC\\xtts-webui\\output\\11Labs - MALE - Matthew - British, Calm, Audiobook_11Labs - MALE - Matthew - British, Calm, Audiobook_1.wav', 'D:\\XTTSAdvancedWEBUI\\LATESTwithRVC\\xtts-webui\\rvc\\11Labs - MALE - Matthew - British, Calm, Audiobook\\11Labs - MALE - Matthew - British, Calm, Audiobook.pth', '0.8', 'cuda:0', 'true', '3', '0', '1', '0.33', '128', '50', '1100', 'false']
Traceback (most recent call last):
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\xtts-webui\scripts\rvc\test_infer.py", line 119, in <module>
    from infer_pack.models import (
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\xtts-webui\scripts\rvc\infer_pack\models.py", line 6, in <module>
    from lib.infer_pack import modules
ModuleNotFoundError: No module named 'lib.infer_pack'
Traceback (most recent call last):
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\queueing.py", line 459, in call_prediction
    output = await route_utils.call_process_api(
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\blocks.py", line 1542, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\blocks.py", line 1428, in postprocess_data
    outputs_cached = processing_utils.move_files_to_cache(
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\processing_utils.py", line 265, in move_files_to_cache
    return client_utils.traverse(data, _move_to_cache, client_utils.is_file_obj)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio_client\utils.py", line 917, in traverse
    return func(json_obj)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\processing_utils.py", line 257, in _move_to_cache
    temp_file_path = move_resource_to_block_cache(payload.path, block)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\processing_utils.py", line 234, in move_resource_to_block_cache
    return block.move_resource_to_block_cache(url_or_file_path)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\blocks.py", line 257, in move_resource_to_block_cache
    temp_file_path = processing_utils.save_file_to_cache(
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\processing_utils.py", line 170, in save_file_to_cache
    temp_dir = hash_file(file_path)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\processing_utils.py", line 102, in hash_file
    with open(file_path, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\XTTSAdvancedWEBUI\\LATESTwithRVC\\xtts-webui\\output\\11Labs - MALE - Matthew - British, Calm, Audiobook_11Labs - MALE - Matthew - British, Calm, Audiobook_1.wav'
Traceback (most recent call last):
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\queueing.py", line 459, in call_prediction
    output = await route_utils.call_process_api(
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\blocks.py", line 1542, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\blocks.py", line 1428, in postprocess_data
    outputs_cached = processing_utils.move_files_to_cache(
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\processing_utils.py", line 265, in move_files_to_cache
    return client_utils.traverse(data, _move_to_cache, client_utils.is_file_obj)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio_client\utils.py", line 917, in traverse
    return func(json_obj)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\processing_utils.py", line 257, in _move_to_cache
    temp_file_path = move_resource_to_block_cache(payload.path, block)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\processing_utils.py", line 234, in move_resource_to_block_cache
    return block.move_resource_to_block_cache(url_or_file_path)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\blocks.py", line 257, in move_resource_to_block_cache
    temp_file_path = processing_utils.save_file_to_cache(
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\processing_utils.py", line 170, in save_file_to_cache
    temp_dir = hash_file(file_path)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\processing_utils.py", line 102, in hash_file
    with open(file_path, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\XTTSAdvancedWEBUI\\LATESTwithRVC\\xtts-webui\\output\\11Labs - MALE - Matthew - British, Calm, Audiobook_11Labs - MALE - Matthew - British, Calm, Audiobook_1.wav'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\queueing.py", line 497, in process_events
    response = await self.call_prediction(awake_events, batch)
  File "D:\XTTSAdvancedWEBUI\LATESTwithRVC\venv\lib\site-packages\gradio\queueing.py", line 468, in call_prediction
    raise Exception(str(error) if show_error else None) from error
Exception: None

My RVC folder: (screenshots omitted)

Error installing and running on Windows 11: "unsupported Python version on Windows"

Followed the install guide, and I have a fair few Python-related apps on this machine.

Install deepspeed for windows for python 3.10.x and CUDA 11.8
2023-12-20 18:46:01.083 | INFO     | __main__:install_deepspeed_based_on_python_version:38 - Python version: 3.9
2023-12-20 18:46:01.083 | ERROR    | __main__:install_deepspeed_based_on_python_version:55 - Unsupported Python version on Windows.
Install complete.
Press any key to continue . . .

(venv) D:\xtta\xtts-webui>start_xtts_webui.bat
2023-12-20 18:46:54.035 | INFO     | scripts.modeldownloader:install_deepspeed_based_on_python_version:38 - Python version: 3.9
2023-12-20 18:46:54.035 | ERROR    | scripts.modeldownloader:install_deepspeed_based_on_python_version:55 - Unsupported Python version on Windows.
Traceback (most recent call last):
  File "D:\xtta\xtts-webui\app.py", line 33, in <module>
    from xtts_webui import demo
  File "D:\xtta\xtts-webui\xtts_webui.py", line 8, in <module>
    from scripts.funcs import save_audio_to_wav,resample_audio,move_and_rename_file,improve_and_convert_audio,improve_ref_audio,resemble_enchance_audio
  File "D:\xtta\xtts-webui\scripts\funcs.py", line 177, in <module>
    from scripts.resemble_enhance.enhancer.inference import denoise, enhance
  File "D:\xtta\xtts-webui\scripts\resemble_enhance\enhancer\inference.py", line 6, in <module>
    from ..inference import inference
  File "D:\xtta\xtts-webui\scripts\resemble_enhance\inference.py", line 12, in <module>
    from .hparams import HParams
  File "D:\xtta\xtts-webui\scripts\resemble_enhance\hparams.py", line 36, in <module>
    class HParams:
  File "D:\xtta\xtts-webui\scripts\resemble_enhance\hparams.py", line 105, in HParams
    def load(cls, run_dir, yaml: Path | None = None):
TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'

The top part is the end of the install script; the bottom part is where I ran the start script.

Error using install.bat

ERROR: Could not find a version that satisfies the requirement torch==2.1.1 (from versions: 2.2.0, 2.2.1)
ERROR: No matching distribution found for torch==2.1.1
ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'pytorch'
Install deepspeed for windows for python 3.10.x and CUDA 11.8
Traceback (most recent call last):
File "C:\Users\Jace\Desktop\xtts-webui-main\scripts\modeldownloader.py", line 4, in
import requests
ModuleNotFoundError: No module named 'requests'

Is Srt2TTS possible?

Instead of automatic transcription, use an SRT file previously edited by a human to create the sequence, similar to what Auto-Synced-Translated-Dubs does.
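The data such a feature would consume is straightforward: an SRT file is a sequence of numbered cues, each with a timestamp range and text. A minimal illustrative parser (the function `parse_srt` is a sketch of the input side of this request, not existing project code; timestamps are left as strings):

```python
import re

def parse_srt(srt_text: str) -> list:
    """Parse SRT subtitle text into (start, end, text) tuples.
    A real implementation would also convert timestamps to seconds."""
    entries = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # need index, timing line, and at least one text line
        m = re.match(r"(\S+)\s*-->\s*(\S+)", lines[1])
        if not m:
            continue
        entries.append((m.group(1), m.group(2), " ".join(lines[2:])))
    return entries
```

Each tuple could then drive one TTS generation, keeping the human-edited segmentation and timing.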

Unable to use RVC, torchaudio not found

Good day,

I've tried to set up and use your webui and have been unable to use RVC. It seems the script that installs RVC on Linux is incorrect (it should use / path separators or os.path.join to build paths). I changed this locally, and the installation then fails with this error:

ERROR: Could not find a version that satisfies the requirement torchaudio==0.13.1+cu117 (from versions: 2.0.0+cu117, 2.0.1+cu117, 2.0.2+cu117)
ERROR: No matching distribution found for torchaudio==0.13.1+cu117

How do you use the fine tuned models (not load, but USE)

I can train in the webui, but I can't use the models!
Both the API and the webui use the .wav file for speakers, not the fine-tuned model, even when it's loaded.
EDIT:
As I said, I can load the model in the webui once trained, but I have to select a speaker wav file; it won't let me use the fine-tuned model.
Same issue with the API: you can load it fully, but when you send a JSON packet you have to include the speaker wav file and thus bypass the fine-tuning.

Can you give some instructions on how to use the fine tuned model please.

Deepspeed RuntimeError: Workspace can't be allocated, no enough memory. - Win 11

This project works on my Win 10 computer but gives this error on my friend's Win 11. Running without the --deepspeed option fixes the issue and text generates successfully, but it's much slower without DeepSpeed. How can we fix this?


To create a public link, set `share=True` in `launch()`.
Using ready reference
Requested:      104873984
Free:   49217536
Total:  4294443008
Traceback (most recent call last):
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\gradio\queueing.py", line 489, in call_prediction
    output = await route_utils.call_process_api(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\gradio\blocks.py", line 1561, in process_api
    result = await self.call_function(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\gradio\blocks.py", line 1179, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "C:\Users\furka\AI\xtts\xtts-webui\modules\generation.py", line 169, in generate_audio
    output_file = XTTS.process_tts_to_file(text, lang_code, ref_speaker_wav, options, output_file_path)
  File "C:\Users\furka\AI\xtts\xtts-webui\scripts\tts_funcs.py", line 378, in process_tts_to_file
    raise e  # Propagate exceptions for endpoint handling.
  File "C:\Users\furka\AI\xtts\xtts-webui\scripts\tts_funcs.py", line 370, in process_tts_to_file
    self.local_generation(clear_text,ref_speaker_wav,speaker_wav,language,options,output_file)
  File "C:\Users\furka\AI\xtts\xtts-webui\scripts\tts_funcs.py", line 287, in local_generation
    out = self.model.inference(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\TTS\tts\models\xtts.py", line 541, in inference
    gpt_codes = self.gpt.generate(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\TTS\tts\layers\xtts\gpt.py", line 590, in generate
    gen = self.gpt_inference.generate(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\transformers\generation\utils.py", line 1764, in generate
    return self.sample(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\transformers\generation\utils.py", line 2861, in sample
    outputs = self(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\TTS\tts\layers\xtts\gpt_inference.py", line 97, in forward
    transformer_outputs = self.transformer(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\transformers\models\gpt2\modeling_gpt2.py", line 888, in forward
    outputs = block(
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\furka\AI\xtts\venv\lib\site-packages\deepspeed\model_implementations\transformers\ds_transformer.py", line 141, in forward
    self.allocate_workspace(self.config.hidden_size, self.config.heads,
RuntimeError: Workspace can't be allocated, no enough memory.

Japanese doesn't work

When trying to generate something in Japanese, I get this error
At first it said something about cutlet being missing, so I pip-installed cutlet, and then this error appeared.
This is the error

Traceback (most recent call last):
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\gradio\queueing.py", line 459, in call_prediction
output = await route_utils.call_process_api(
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
output = await app.get_blocks().process_api(
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\gradio\blocks.py", line 1533, in process_api
result = await self.call_function(
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\gradio\blocks.py", line 1151, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\anyio_backends_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\anyio_backends_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
response = f(*args, **kwargs)
File "C:\Users\lccji\Downloads\xtts-webui-main\xtts_webui.py", line 194, in generate_audio
output_file = XTTS.process_tts_to_file(text, lang_code, ref_speaker_wav, options, output_file_path)
File "C:\Users\lccji\Downloads\xtts-webui-main\scripts\tts_funcs.py", line 293, in process_tts_to_file
raise e # Propagate exceptions for endpoint handling.
File "C:\Users\lccji\Downloads\xtts-webui-main\scripts\tts_funcs.py", line 287, in process_tts_to_file
self.api_generation(clear_text,speaker_wav,language,options,output_file)
File "C:\Users\lccji\Downloads\xtts-webui-main\scripts\tts_funcs.py", line 239, in api_generation
self.model.tts_to_file(
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\TTS\api.py", line 334, in tts_to_file
wav = self.tts(
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\TTS\api.py", line 276, in tts
wav = self.synthesizer.tts(
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\TTS\utils\synthesizer.py", line 386, in tts
outputs = self.tts_model.synthesize(
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\TTS\tts\models\xtts.py", line 419, in synthesize
return self.full_inference(text, speaker_wav, language, **settings)
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\TTS\tts\models\xtts.py", line 488, in full_inference
return self.inference(
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\TTS\tts\models\xtts.py", line 534, in inference
text_tokens = torch.IntTensor(self.tokenizer.encode(sent, lang=language)).unsqueeze(0).to(self.device)
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\TTS\tts\layers\xtts\tokenizer.py", line 649, in encode
txt = self.preprocess_text(txt, lang)
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\TTS\tts\layers\xtts\tokenizer.py", line 638, in preprocess_text
txt = japanese_cleaners(txt, self.katsu)
File "C:\Users\lccji\AppData\Local\Programs\Python\Python310\lib\functools.py", line 981, in __get__
val = self.func(instance)
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\TTS\tts\layers\xtts\tokenizer.py", line 620, in katsu
return cutlet.Cutlet()
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\cutlet\cutlet.py", line 148, in __init__
self.tagger = fugashi.Tagger(mecab_args)
File "fugashi\fugashi.pyx", line 383, in fugashi.fugashi.Tagger.__init__
File "fugashi\fugashi.pyx", line 232, in fugashi.fugashi.GenericTagger.__init__
RuntimeError:
Failed initializing MeCab. Please see the README for possible solutions:

https://github.com/polm/fugashi

If you are still having trouble, please file an issue here, and include the
ERROR DETAILS below:

https://github.com/polm/fugashi/issues

issueを英語で書く必要はありません。

------------------- ERROR DETAILS ------------------------
arguments: [b'fugashi', b'-C']
param.cpp(69) [ifs] no such file or directory: c:\mecab\mecabrc

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\gradio\queueing.py", line 497, in process_events
response = await self.call_prediction(awake_events, batch)
File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\gradio\queueing.py", line 468, in call_prediction
raise Exception(str(error) if show_error else None) from error
Exception: None
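The `c:\mecab\mecabrc` error at the bottom means MeCab could not find a dictionary to load. Per the fugashi README, installing a bundled dictionary, e.g. `pip install "fugashi[unidic-lite]"`, usually resolves it. A small sketch (hypothetical helper name) for checking whether one of the expected dictionary packages is present:

```python
# Sketch: check for the MeCab dictionary packages fugashi can use.
# `japanese_support_ready` is an illustrative name, not part of the webui.
import importlib.util

def japanese_support_ready() -> bool:
    """True if a MeCab dictionary package (unidic-lite or unidic) is installed."""
    return any(
        importlib.util.find_spec(name) is not None
        for name in ("unidic_lite", "unidic")
    )

if not japanese_support_ready():
    print('Missing MeCab dictionary: try `pip install "fugashi[unidic-lite]"`')
```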

DeepSpeed isn't optional

Even though DeepSpeed is gated behind a command-line argument, you cannot run the program without it installed. Even after removing the DeepSpeed imports, the code still tries to use DeepSpeed functionality, so you can never reach the UI.

This means ROCm users are left out. I think DeepSpeed support may have been added in ROCm 6.0, but it's definitely not in ROCm 5.7 or below. The rest of the program likely works fine, since xtts-api and other projects run without issue; it's only DeepSpeed that prevents it from working.

If you could please make it so that DeepSpeed is imported only when it's needed, and only if --deepspeed is used, I would be very grateful.
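The requested behaviour can be sketched as a lazy, flag-gated import (the `--deepspeed` flag exists in the project; the `load_model` helper here is illustrative, not the project's real API):

```python
# Sketch: only import DeepSpeed when the user actually asks for it,
# so installs without the package (e.g. ROCm) still reach the UI.
import argparse

def load_model(use_deepspeed: bool) -> dict:
    """Illustrative loader: DeepSpeed is touched only when requested."""
    deepspeed_enabled = False
    if use_deepspeed:
        try:
            import deepspeed  # heavy, CUDA-oriented dependency stays optional
        except ImportError as e:
            raise SystemExit(
                "--deepspeed was requested but DeepSpeed is not installed"
            ) from e
        deepspeed_enabled = True
    return {"deepspeed": deepspeed_enabled}

parser = argparse.ArgumentParser()
parser.add_argument("--deepspeed", action="store_true")
args = parser.parse_args([])  # simulate launching without the flag
print(load_model(args.deepspeed))
```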

The new "deepspeed" requires the CUDA_HOME environment variable

My Python may be acting funny, but after troubleshooting I hit an error saying the DeepSpeed wheel wasn't working correctly, and the error output said there was no path to CUDA_HOME (as an environment variable).

Do you have that on your system? If so, you should put this link in your README.

https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64&target_version=11&target_type=exe_local

If you didn't have to install that and add it to your environment variables, then it may be a different issue altogether. Note: I haven't tried this solution yet; I wanted to check whether you knew anything about it first, since there's nothing directly concerning it in the README.

EDIT:
I looked into your changes and noticed that you bumped torch from the CUDA 11.8 build to the newer version, which my GPU isn't compatible with on its current drivers (and I've been told to be careful upgrading GPU drivers when using SD, because newer ones can really slow down inference due to an offloading-to-RAM issue).
Does this DeepSpeed build require CUDA 12.1? Or can I get away with deleting it and using the 11.8 build instead?

Or is the root of the issue what I initially thought it was? Thanks a million, loving the improvements.
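For reference, DeepSpeed's build step looks for the toolkit via CUDA_HOME. A hedged config fragment for Windows (adjust the version folder to the toolkit you actually installed; the path below is the default NVIDIA install location, not something this project sets for you):

```shell
rem Persist CUDA_HOME for future sessions (Windows cmd):
setx CUDA_HOME "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8"

rem And for the current session (setx only affects new ones):
set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
```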

Colab notebook dependency errors

Hi, the Colab notebook is giving dependency errors. Thanks for the help.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires kaleido, which is not installed.
llmx 0.0.15a0 requires cohere, which is not installed.
llmx 0.0.15a0 requires openai, which is not installed.
llmx 0.0.15a0 requires tiktoken, which is not installed.
bigframes 0.19.2 requires tabulate>=0.9, but you have tabulate 0.8.10 which is incompatible.
plotnine 0.12.4 requires numpy>=1.23.0, but you have numpy 1.22.0 which is incompatible.
pywavelets 1.5.0 requires numpy<2.0,>=1.22.4, but you have numpy 1.22.0 which is incompatible.
tensorflow 2.15.0 requires numpy<2.0.0,>=1.23.5, but you have numpy 1.22.0 which is incompatible.
tensorflow-probability 0.22.0 requires typing-extensions<4.6.0, but you have typing-extensions 4.9.0 which is incompatible.
torchdata 0.7.0 requires torch==2.1.0, but you have torch 2.1.1 which is incompatible.
torchtext 0.16.0 requires torch==2.1.0, but you have torch 2.1.1 which is incompatible.
torchvision 0.16.0+cu121 requires torch==2.1.0, but you have torch 2.1.1 which is incompatible.
...
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires kaleido, which is not installed.
tts 0.22.0 requires numpy==1.22.0; python_version <= "3.10", but you have numpy 1.26.2 which is incompatible.
tensorflow-probability 0.22.0 requires typing-extensions<4.6.0, but you have typing-extensions 4.9.0 which is incompatible.
torchtext 0.16.0 requires torch==2.1.0, but you have torch 2.1.1 which is incompatible.
torchvision 0.16.0+cu121 requires torch==2.1.0, but you have torch 2.1.1 which is incompatible.
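Most of these warnings come from packages Colab preinstalls (torchvision, torchtext, tensorflow, llmx) pinning versions that this project's requirements replace. After installation, `pip check` lists what is still inconsistent; conflicts in packages the webui never imports can often be ignored:

```shell
# List remaining dependency conflicts after the notebook's install step.
# A non-zero exit status only means some pins still disagree, so don't
# abort the cell on it.
pip check || true
```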

Finetuning in webUI broken

New thread, as it's a different issue this time.

So I load the web UI,
click on Train,
enter "commentator" as the fine-tune model name,
drop my audio sample onto the audio input section and wait for it to upload,
and click Train.

I get this error:

Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\queueing.py", line 489, in call_prediction
    output = await route_utils.call_process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1561, in process_api
    result = await self.call_function(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1179, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "H:\XTTS\xtts-webui\modules\train.py", line 239, in train_xtts_model
    shutil.copytree(str(ready_folder), str(
  File "C:\Program Files\Python310\lib\shutil.py", line 556, in copytree
    with os.scandir(src) as itr:
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'finetuned_models\\commentator\\ready'

The folder does not exist.

If I click Train again, it creates the folder and proceeds, and I get this error:

finetuned_models\commentator

>> DVAE weights restored from: H:\XTTS\xtts-webui\models\v2.0.2\dvae.pth
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\modules\train.py", line 87, in train_model
    speaker_xtts_path, config_path, original_xtts_checkpoint, vocab_file, exp_path, speaker_wav = train_gpt(
  File "H:\XTTS\xtts-webui\scripts\utils\gpt_train.py", line 175, in train_gpt
    train_samples, eval_samples = load_tts_samples(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\TTS\tts\datasets\__init__.py", line 121, in load_tts_samples
    assert len(meta_data_train) > 0, f" [!] No training samples found in {root_path}/{meta_file_train}"
AssertionError:  [!] No training samples found in finetuned_models\commentator\dataset/metadata_train.csv
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\queueing.py", line 489, in call_prediction
    output = await route_utils.call_process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1561, in process_api
    result = await self.call_function(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1179, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "H:\XTTS\xtts-webui\modules\train.py", line 245, in train_xtts_model
    shutil.copy(speaker_reference_path, reference_destination)
  File "C:\Program Files\Python310\lib\shutil.py", line 417, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "C:\Program Files\Python310\lib\shutil.py", line 254, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'finetuned_models\\commentator\\ready\\reference.wav'

If I place a wav file called reference.wav in the folder, I can proceed to the next error.

Existing language matches target language
Loading Whisper Model!
Existing training metadata found and loaded.
Existing evaluation metadata found and loaded.
Dataset Processed!
finetuned_models\commentator
>> DVAE weights restored from: H:\XTTS\xtts-webui\models\v2.0.2\dvae.pth
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\modules\train.py", line 87, in train_model
    speaker_xtts_path, config_path, original_xtts_checkpoint, vocab_file, exp_path, speaker_wav = train_gpt(
  File "H:\XTTS\xtts-webui\scripts\utils\gpt_train.py", line 175, in train_gpt
    train_samples, eval_samples = load_tts_samples(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\TTS\tts\datasets\__init__.py", line 121, in load_tts_samples
    assert len(meta_data_train) > 0, f" [!] No training samples found in {root_path}/{meta_file_train}"
AssertionError:  [!] No training samples found in finetuned_models\commentator\dataset/metadata_train.csv

No idea past this point.

metadata_train.csv contains only the header row: "audio_file|text|speaker_name"

Full log, including the git pull and installer run (to make sure requirements are installed), plus all the errors:

Microsoft Windows [Version 10.0.22621.3007]
(c) Microsoft Corporation. All rights reserved.

H:\XTTS\xtts-webui>git pull
Already up to date.

H:\XTTS\xtts-webui>install.bat
Requirement already satisfied: gradio==4.13.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 1)) (4.13.0)
Requirement already satisfied: torch==2.1.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 2)) (2.1.1+cu118)
Requirement already satisfied: torchaudio==2.1.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 3)) (2.1.1+cu118)
Requirement already satisfied: faster_whisper==0.10.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 4)) (0.10.0)
Requirement already satisfied: tts>=0.22.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 5)) (0.22.0)
Requirement already satisfied: langid in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 6)) (1.1.6)
Requirement already satisfied: noisereduce in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 7)) (3.0.0)
Requirement already satisfied: pedalboard in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 8)) (0.8.7)
Requirement already satisfied: pydub in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 9)) (0.25.1)
Requirement already satisfied: ffmpeg-python in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 10)) (0.2.0)
Requirement already satisfied: soundfile in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 11)) (0.12.1)
Requirement already satisfied: cutlet in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 12)) (0.3.0)
Requirement already satisfied: fugashi[unidic-lite] in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 13)) (1.3.0)
Requirement already satisfied: loguru in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 14)) (0.7.2)
Requirement already satisfied: omegaconf==2.3.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 15)) (2.3.0)
Requirement already satisfied: resampy==0.4.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 16)) (0.4.2)
Requirement already satisfied: tabulate==0.8.10 in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 17)) (0.8.10)
Requirement already satisfied: requests in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 18)) (2.31.0)
Requirement already satisfied: faiss-cpu in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 19)) (1.7.4)
Requirement already satisfied: pyworld in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 20)) (0.3.4)
Requirement already satisfied: torchcrepe in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 21)) (0.0.22)
Requirement already satisfied: praat-parselmouth>=0.4.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 22)) (0.4.3)
Requirement already satisfied: translators in h:\xtts\xtts-webui\venv\lib\site-packages (from -r .\requirements.txt (line 23)) (5.8.9)
Requirement already satisfied: huggingface-hub>=0.19.3 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (0.20.2)
Requirement already satisfied: pillow<11.0,>=8.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (10.2.0)
Requirement already satisfied: ffmpy in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (0.3.1)
Requirement already satisfied: importlib-resources<7.0,>=1.3 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (6.1.1)
Requirement already satisfied: aiofiles<24.0,>=22.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (23.2.1)
Requirement already satisfied: pydantic>=2.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (2.5.3)
Requirement already satisfied: jinja2<4.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (3.1.2)
Requirement already satisfied: numpy~=1.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (1.22.0)
Requirement already satisfied: fastapi in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (0.108.0)
Requirement already satisfied: tomlkit==0.12.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (0.12.0)
Requirement already satisfied: pandas<3.0,>=1.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (1.5.3)
Requirement already satisfied: httpx in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (0.26.0)
Requirement already satisfied: uvicorn>=0.14.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (0.25.0)
Requirement already satisfied: altair<6.0,>=4.2.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (5.2.0)
Requirement already satisfied: gradio-client==0.8.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (0.8.0)
Requirement already satisfied: semantic-version~=2.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (2.10.0)
Requirement already satisfied: typing-extensions~=4.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (4.9.0)
Requirement already satisfied: packaging in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (23.2)
Requirement already satisfied: matplotlib~=3.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (3.8.2)
Requirement already satisfied: python-multipart in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (0.0.6)
Requirement already satisfied: orjson~=3.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (3.9.10)
Requirement already satisfied: markupsafe~=2.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (2.1.3)
Requirement already satisfied: pyyaml<7.0,>=5.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (6.0.1)
Requirement already satisfied: typer[all]<1.0,>=0.9 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio==4.13.0->-r .\requirements.txt (line 1)) (0.9.0)
Requirement already satisfied: sympy in h:\xtts\xtts-webui\venv\lib\site-packages (from torch==2.1.1->-r .\requirements.txt (line 2)) (1.12)
Requirement already satisfied: fsspec in h:\xtts\xtts-webui\venv\lib\site-packages (from torch==2.1.1->-r .\requirements.txt (line 2)) (2023.12.2)
Requirement already satisfied: filelock in h:\xtts\xtts-webui\venv\lib\site-packages (from torch==2.1.1->-r .\requirements.txt (line 2)) (3.13.1)
Requirement already satisfied: networkx in h:\xtts\xtts-webui\venv\lib\site-packages (from torch==2.1.1->-r .\requirements.txt (line 2)) (2.8.8)
Requirement already satisfied: av==10.* in h:\xtts\xtts-webui\venv\lib\site-packages (from faster_whisper==0.10.0->-r .\requirements.txt (line 4)) (10.0.0)
Requirement already satisfied: ctranslate2<4,>=3.22 in h:\xtts\xtts-webui\venv\lib\site-packages (from faster_whisper==0.10.0->-r .\requirements.txt (line 4)) (3.23.0)
Requirement already satisfied: tokenizers<0.16,>=0.13 in h:\xtts\xtts-webui\venv\lib\site-packages (from faster_whisper==0.10.0->-r .\requirements.txt (line 4)) (0.15.0)
Requirement already satisfied: onnxruntime<2,>=1.14 in h:\xtts\xtts-webui\venv\lib\site-packages (from faster_whisper==0.10.0->-r .\requirements.txt (line 4)) (1.16.3)
Requirement already satisfied: antlr4-python3-runtime==4.9.* in h:\xtts\xtts-webui\venv\lib\site-packages (from omegaconf==2.3.0->-r .\requirements.txt (line 15)) (4.9.3)
Requirement already satisfied: numba>=0.53 in h:\xtts\xtts-webui\venv\lib\site-packages (from resampy==0.4.2->-r .\requirements.txt (line 16)) (0.58.1)
Requirement already satisfied: websockets<12.0,>=10.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gradio-client==0.8.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (11.0.3)
Requirement already satisfied: jieba in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.42.1)
Requirement already satisfied: bnunicodenormalizer in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.1.6)
Requirement already satisfied: encodec>=0.1.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.1.1)
Requirement already satisfied: bangla in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.0.2)
Requirement already satisfied: bnnumerizer in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.0.2)
Requirement already satisfied: cython>=0.29.30 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (3.0.7)
Requirement already satisfied: aiohttp>=3.8.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (3.9.1)
Requirement already satisfied: transformers>=4.33.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (4.36.2)
Requirement already satisfied: unidecode>=1.3.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (1.3.7)
Requirement already satisfied: nltk in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (3.8.1)
Requirement already satisfied: einops>=0.6.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.7.0)
Requirement already satisfied: librosa>=0.10.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.10.0)
Requirement already satisfied: trainer>=0.0.32 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.0.36)
Requirement already satisfied: jamo in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.4.1)
Requirement already satisfied: g2pkk>=0.1.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.1.2)
Requirement already satisfied: spacy[ja]>=3 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (3.7.2)
Requirement already satisfied: scipy>=1.11.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (1.11.4)
Requirement already satisfied: umap-learn>=0.5.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.5.5)
Requirement already satisfied: flask>=2.0.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (3.0.0)
Requirement already satisfied: hangul-romanize in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.1.0)
Requirement already satisfied: gruut[de,es,fr]==2.2.3 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (2.2.3)
Requirement already satisfied: tqdm>=4.64.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (4.66.1)
Requirement already satisfied: coqpit>=0.0.16 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.0.17)
Requirement already satisfied: pysbd>=0.3.4 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.3.4)
Requirement already satisfied: pypinyin in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.50.0)
Requirement already satisfied: num2words in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.5.13)
Requirement already satisfied: scikit-learn>=1.3.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (1.3.2)
Requirement already satisfied: anyascii>=0.3.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (0.3.2)
Requirement already satisfied: inflect>=5.6.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from tts>=0.22.0->-r .\requirements.txt (line 5)) (7.0.0)
Requirement already satisfied: Babel<3.0.0,>=2.8.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.14.0)
Requirement already satisfied: dateparser~=1.1.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.1.8)
Requirement already satisfied: gruut-ipa<1.0,>=0.12.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.13.0)
Requirement already satisfied: gruut_lang_en~=2.0.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.0.0)
Requirement already satisfied: jsonlines~=1.2.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.2.0)
Requirement already satisfied: python-crfsuite~=0.9.7 in h:\xtts\xtts-webui\venv\lib\site-packages (from gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.9.10)
Requirement already satisfied: gruut_lang_fr~=2.0.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.0.2)
Requirement already satisfied: gruut_lang_es~=2.0.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.0.0)
Requirement already satisfied: gruut_lang_de~=2.0.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.0.0)
Requirement already satisfied: future in h:\xtts\xtts-webui\venv\lib\site-packages (from ffmpeg-python->-r .\requirements.txt (line 10)) (0.18.3)
Requirement already satisfied: cffi>=1.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from soundfile->-r .\requirements.txt (line 11)) (1.16.0)
Requirement already satisfied: jaconv in h:\xtts\xtts-webui\venv\lib\site-packages (from cutlet->-r .\requirements.txt (line 12)) (0.3.4)
Requirement already satisfied: mojimoji in h:\xtts\xtts-webui\venv\lib\site-packages (from cutlet->-r .\requirements.txt (line 12)) (0.0.12)
Requirement already satisfied: unidic-lite in h:\xtts\xtts-webui\venv\lib\site-packages (from fugashi[unidic-lite]->-r .\requirements.txt (line 13)) (1.0.8)
Requirement already satisfied: win32-setctime>=1.0.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from loguru->-r .\requirements.txt (line 14)) (1.1.0)
Requirement already satisfied: colorama>=0.3.4 in h:\xtts\xtts-webui\venv\lib\site-packages (from loguru->-r .\requirements.txt (line 14)) (0.4.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from requests->-r .\requirements.txt (line 18)) (2.1.0)
Requirement already satisfied: idna<4,>=2.5 in h:\xtts\xtts-webui\venv\lib\site-packages (from requests->-r .\requirements.txt (line 18)) (3.6)
Requirement already satisfied: charset-normalizer<4,>=2 in h:\xtts\xtts-webui\venv\lib\site-packages (from requests->-r .\requirements.txt (line 18)) (3.3.2)
Requirement already satisfied: certifi>=2017.4.17 in h:\xtts\xtts-webui\venv\lib\site-packages (from requests->-r .\requirements.txt (line 18)) (2023.11.17)
Requirement already satisfied: pathos>=0.2.9 in h:\xtts\xtts-webui\venv\lib\site-packages (from translators->-r .\requirements.txt (line 23)) (0.3.1)
Requirement already satisfied: cryptography>=38.0.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from translators->-r .\requirements.txt (line 23)) (41.0.7)
Requirement already satisfied: PyExecJS>=1.5.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from translators->-r .\requirements.txt (line 23)) (1.5.1)
Requirement already satisfied: lxml>=4.9.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from translators->-r .\requirements.txt (line 23)) (5.1.0)
Requirement already satisfied: multidict<7.0,>=4.5 in h:\xtts\xtts-webui\venv\lib\site-packages (from aiohttp>=3.8.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (6.0.4)
Requirement already satisfied: aiosignal>=1.1.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from aiohttp>=3.8.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.3.1)
Requirement already satisfied: frozenlist>=1.1.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from aiohttp>=3.8.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.4.1)
Requirement already satisfied: async-timeout<5.0,>=4.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from aiohttp>=3.8.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (4.0.3)
Requirement already satisfied: attrs>=17.3.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from aiohttp>=3.8.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (23.2.0)
Requirement already satisfied: yarl<2.0,>=1.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from aiohttp>=3.8.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.9.4)
Requirement already satisfied: jsonschema>=3.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from altair<6.0,>=4.2.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (4.20.0)
Requirement already satisfied: toolz in h:\xtts\xtts-webui\venv\lib\site-packages (from altair<6.0,>=4.2.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (0.12.0)
Requirement already satisfied: pycparser in h:\xtts\xtts-webui\venv\lib\site-packages (from cffi>=1.0->soundfile->-r .\requirements.txt (line 11)) (2.21)
Requirement already satisfied: setuptools in h:\xtts\xtts-webui\venv\lib\site-packages (from ctranslate2<4,>=3.22->faster_whisper==0.10.0->-r .\requirements.txt (line 4)) (63.2.0)
Requirement already satisfied: Werkzeug>=3.0.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from flask>=2.0.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (3.0.1)
Requirement already satisfied: blinker>=1.6.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from flask>=2.0.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.7.0)
Requirement already satisfied: click>=8.1.3 in h:\xtts\xtts-webui\venv\lib\site-packages (from flask>=2.0.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (8.1.7)
Requirement already satisfied: itsdangerous>=2.1.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from flask>=2.0.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.1.2)
Requirement already satisfied: lazy-loader>=0.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from librosa>=0.10.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.3)
Requirement already satisfied: audioread>=2.1.9 in h:\xtts\xtts-webui\venv\lib\site-packages (from librosa>=0.10.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (3.0.1)
Requirement already satisfied: msgpack>=1.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from librosa>=0.10.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.0.7)
Requirement already satisfied: decorator>=4.3.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from librosa>=0.10.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (5.1.1)
Requirement already satisfied: soxr>=0.3.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from librosa>=0.10.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.3.7)
Requirement already satisfied: joblib>=0.14 in h:\xtts\xtts-webui\venv\lib\site-packages (from librosa>=0.10.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.3.2)
Requirement already satisfied: pooch>=1.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from librosa>=0.10.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.8.0)
Requirement already satisfied: fonttools>=4.22.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from matplotlib~=3.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (4.47.0)
Requirement already satisfied: pyparsing>=2.3.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from matplotlib~=3.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (3.1.1)
Requirement already satisfied: cycler>=0.10 in h:\xtts\xtts-webui\venv\lib\site-packages (from matplotlib~=3.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (0.12.1)
Requirement already satisfied: python-dateutil>=2.7 in h:\xtts\xtts-webui\venv\lib\site-packages (from matplotlib~=3.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (2.8.2)
Requirement already satisfied: kiwisolver>=1.3.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from matplotlib~=3.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (1.4.5)
Requirement already satisfied: contourpy>=1.0.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from matplotlib~=3.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (1.2.0)
Requirement already satisfied: docopt>=0.6.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from num2words->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.6.2)
Requirement already satisfied: llvmlite<0.42,>=0.41.0dev0 in h:\xtts\xtts-webui\venv\lib\site-packages (from numba>=0.53->resampy==0.4.2->-r .\requirements.txt (line 16)) (0.41.1)
Requirement already satisfied: flatbuffers in h:\xtts\xtts-webui\venv\lib\site-packages (from onnxruntime<2,>=1.14->faster_whisper==0.10.0->-r .\requirements.txt (line 4)) (23.5.26)
Requirement already satisfied: coloredlogs in h:\xtts\xtts-webui\venv\lib\site-packages (from onnxruntime<2,>=1.14->faster_whisper==0.10.0->-r .\requirements.txt (line 4)) (15.0.1)
Requirement already satisfied: protobuf in h:\xtts\xtts-webui\venv\lib\site-packages (from onnxruntime<2,>=1.14->faster_whisper==0.10.0->-r .\requirements.txt (line 4)) (4.23.4)
Requirement already satisfied: pytz>=2020.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from pandas<3.0,>=1.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (2023.3.post1)
Requirement already satisfied: pox>=0.3.3 in h:\xtts\xtts-webui\venv\lib\site-packages (from pathos>=0.2.9->translators->-r .\requirements.txt (line 23)) (0.3.3)
Requirement already satisfied: ppft>=1.7.6.7 in h:\xtts\xtts-webui\venv\lib\site-packages (from pathos>=0.2.9->translators->-r .\requirements.txt (line 23)) (1.7.6.7)
Requirement already satisfied: dill>=0.3.7 in h:\xtts\xtts-webui\venv\lib\site-packages (from pathos>=0.2.9->translators->-r .\requirements.txt (line 23)) (0.3.7)
Requirement already satisfied: multiprocess>=0.70.15 in h:\xtts\xtts-webui\venv\lib\site-packages (from pathos>=0.2.9->translators->-r .\requirements.txt (line 23)) (0.70.15)
Requirement already satisfied: pydantic-core==2.14.6 in h:\xtts\xtts-webui\venv\lib\site-packages (from pydantic>=2.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (2.14.6)
Requirement already satisfied: annotated-types>=0.4.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from pydantic>=2.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (0.6.0)
Requirement already satisfied: six>=1.10.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from PyExecJS>=1.5.1->translators->-r .\requirements.txt (line 23)) (1.16.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from scikit-learn>=1.3.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (3.2.0)
Requirement already satisfied: smart-open<7.0.0,>=5.2.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (6.4.0)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (3.3.0)
Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.1.2)
Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (3.0.12)
Requirement already satisfied: weasel<0.4.0,>=0.1.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.3.4)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.0.10)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.0.8)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.0.10)
Requirement already satisfied: srsly<3.0.0,>=2.4.3 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.4.8)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (3.0.9)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.0.5)
Requirement already satisfied: thinc<8.3.0,>=8.1.8 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (8.2.2)
Requirement already satisfied: sudachidict-core>=20211220 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (20230927)
Requirement already satisfied: sudachipy!=0.6.1,>=0.5.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.6.8)
Requirement already satisfied: tensorboard in h:\xtts\xtts-webui\venv\lib\site-packages (from trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.15.1)
Requirement already satisfied: psutil in h:\xtts\xtts-webui\venv\lib\site-packages (from trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (5.9.7)
Requirement already satisfied: safetensors>=0.3.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from transformers>=4.33.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.4.1)
Requirement already satisfied: regex!=2019.12.17 in h:\xtts\xtts-webui\venv\lib\site-packages (from transformers>=4.33.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (2023.12.25)
Requirement already satisfied: rich<14.0.0,>=10.11.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from typer[all]<1.0,>=0.9->gradio==4.13.0->-r .\requirements.txt (line 1)) (13.7.0)
Requirement already satisfied: shellingham<2.0.0,>=1.3.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from typer[all]<1.0,>=0.9->gradio==4.13.0->-r .\requirements.txt (line 1)) (1.5.4)
Requirement already satisfied: pynndescent>=0.5 in h:\xtts\xtts-webui\venv\lib\site-packages (from umap-learn>=0.5.1->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.5.11)
Requirement already satisfied: h11>=0.8 in h:\xtts\xtts-webui\venv\lib\site-packages (from uvicorn>=0.14.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (0.14.0)
Requirement already satisfied: starlette<0.33.0,>=0.29.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from fastapi->gradio==4.13.0->-r .\requirements.txt (line 1)) (0.32.0.post1)
Requirement already satisfied: anyio in h:\xtts\xtts-webui\venv\lib\site-packages (from httpx->gradio==4.13.0->-r .\requirements.txt (line 1)) (4.2.0)
Requirement already satisfied: httpcore==1.* in h:\xtts\xtts-webui\venv\lib\site-packages (from httpx->gradio==4.13.0->-r .\requirements.txt (line 1)) (1.0.2)
Requirement already satisfied: sniffio in h:\xtts\xtts-webui\venv\lib\site-packages (from httpx->gradio==4.13.0->-r .\requirements.txt (line 1)) (1.3.0)
Requirement already satisfied: mpmath>=0.19 in h:\xtts\xtts-webui\venv\lib\site-packages (from sympy->torch==2.1.1->-r .\requirements.txt (line 2)) (1.3.0)
Requirement already satisfied: tzlocal in h:\xtts\xtts-webui\venv\lib\site-packages (from dateparser~=1.1.0->gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (5.2)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in h:\xtts\xtts-webui\venv\lib\site-packages (from jsonschema>=3.0->altair<6.0,>=4.2.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (2023.12.1)
Requirement already satisfied: referencing>=0.28.4 in h:\xtts\xtts-webui\venv\lib\site-packages (from jsonschema>=3.0->altair<6.0,>=4.2.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (0.32.1)
Requirement already satisfied: rpds-py>=0.7.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from jsonschema>=3.0->altair<6.0,>=4.2.0->gradio==4.13.0->-r .\requirements.txt (line 1)) (0.16.2)
Requirement already satisfied: platformdirs>=2.5.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from pooch>=1.0->librosa>=0.10.0->tts>=0.22.0->-r .\requirements.txt (line 5)) (4.1.0)
Requirement already satisfied: markdown-it-py>=2.2.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from rich<14.0.0,>=10.11.0->typer[all]<1.0,>=0.9->gradio==4.13.0->-r .\requirements.txt (line 1)) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from rich<14.0.0,>=10.11.0->typer[all]<1.0,>=0.9->gradio==4.13.0->-r .\requirements.txt (line 1)) (2.17.2)
Requirement already satisfied: exceptiongroup>=1.0.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from anyio->httpx->gradio==4.13.0->-r .\requirements.txt (line 1)) (1.2.0)
Requirement already satisfied: blis<0.8.0,>=0.7.8 in h:\xtts\xtts-webui\venv\lib\site-packages (from thinc<8.3.0,>=8.1.8->spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.7.11)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from thinc<8.3.0,>=8.1.8->spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.1.4)
Requirement already satisfied: cloudpathlib<0.17.0,>=0.7.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from weasel<0.4.0,>=0.1.0->spacy[ja]>=3->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.16.0)
Requirement already satisfied: humanfriendly>=9.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from coloredlogs->onnxruntime<2,>=1.14->faster_whisper==0.10.0->-r .\requirements.txt (line 4)) (10.0)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.7.2)
Requirement already satisfied: markdown>=2.6.8 in h:\xtts\xtts-webui\venv\lib\site-packages (from tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (3.5.1)
Requirement already satisfied: google-auth<3,>=1.6.3 in h:\xtts\xtts-webui\venv\lib\site-packages (from tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.26.1)
Requirement already satisfied: absl-py>=0.4 in h:\xtts\xtts-webui\venv\lib\site-packages (from tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (2.0.0)
Requirement already satisfied: grpcio>=1.48.2 in h:\xtts\xtts-webui\venv\lib\site-packages (from tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.60.0)
Requirement already satisfied: google-auth-oauthlib<2,>=0.5 in h:\xtts\xtts-webui\venv\lib\site-packages (from tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.2.0)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from google-auth<3,>=1.6.3->tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (5.3.2)
Requirement already satisfied: pyasn1-modules>=0.2.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from google-auth<3,>=1.6.3->tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.3.0)
Requirement already satisfied: rsa<5,>=3.1.4 in h:\xtts\xtts-webui\venv\lib\site-packages (from google-auth<3,>=1.6.3->tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (4.9)
Requirement already satisfied: requests-oauthlib>=0.7.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from google-auth-oauthlib<2,>=0.5->tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (1.3.1)
Requirement already satisfied: pyreadline3 in h:\xtts\xtts-webui\venv\lib\site-packages (from humanfriendly>=9.1->coloredlogs->onnxruntime<2,>=1.14->faster_whisper==0.10.0->-r .\requirements.txt (line 4)) (3.4.1)
Requirement already satisfied: mdurl~=0.1 in h:\xtts\xtts-webui\venv\lib\site-packages (from markdown-it-py>=2.2.0->rich<14.0.0,>=10.11.0->typer[all]<1.0,>=0.9->gradio==4.13.0->-r .\requirements.txt (line 1)) (0.1.2)
Requirement already satisfied: tzdata in h:\xtts\xtts-webui\venv\lib\site-packages (from tzlocal->dateparser~=1.1.0->gruut[de,es,fr]==2.2.3->tts>=0.22.0->-r .\requirements.txt (line 5)) (2023.4)
Requirement already satisfied: pyasn1<0.6.0,>=0.4.6 in h:\xtts\xtts-webui\venv\lib\site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (0.5.1)
Requirement already satisfied: oauthlib>=3.0.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<2,>=0.5->tensorboard->trainer>=0.0.32->tts>=0.22.0->-r .\requirements.txt (line 5)) (3.2.2)

[notice] A new release of pip available: 22.2.1 -> 23.3.2
[notice] To update, run: python.exe -m pip install --upgrade pip
Looking in indexes: https://download.pytorch.org/whl/cu118
Requirement already satisfied: torch==2.1.1+cu118 in h:\xtts\xtts-webui\venv\lib\site-packages (2.1.1+cu118)
Requirement already satisfied: torchaudio==2.1.1+cu118 in h:\xtts\xtts-webui\venv\lib\site-packages (2.1.1+cu118)
Requirement already satisfied: networkx in h:\xtts\xtts-webui\venv\lib\site-packages (from torch==2.1.1+cu118) (2.8.8)
Requirement already satisfied: jinja2 in h:\xtts\xtts-webui\venv\lib\site-packages (from torch==2.1.1+cu118) (3.1.2)
Requirement already satisfied: fsspec in h:\xtts\xtts-webui\venv\lib\site-packages (from torch==2.1.1+cu118) (2023.12.2)
Requirement already satisfied: sympy in h:\xtts\xtts-webui\venv\lib\site-packages (from torch==2.1.1+cu118) (1.12)
Requirement already satisfied: typing-extensions in h:\xtts\xtts-webui\venv\lib\site-packages (from torch==2.1.1+cu118) (4.9.0)
Requirement already satisfied: filelock in h:\xtts\xtts-webui\venv\lib\site-packages (from torch==2.1.1+cu118) (3.13.1)
Requirement already satisfied: MarkupSafe>=2.0 in h:\xtts\xtts-webui\venv\lib\site-packages (from jinja2->torch==2.1.1+cu118) (2.1.3)
Requirement already satisfied: mpmath>=0.19 in h:\xtts\xtts-webui\venv\lib\site-packages (from sympy->torch==2.1.1+cu118) (1.3.0)

[notice] A new release of pip available: 22.2.1 -> 23.3.2
[notice] To update, run: python.exe -m pip install --upgrade pip
Install deepspeed for windows for python 3.10.x and CUDA 11.8
Install complete.
Press any key to continue . . .

(venv) H:\XTTS\xtts-webui>start_xtts_webui.bat
2024-01-14 21:14:53.628 | INFO     | xtts_webui:<module>:57 - Start loading model v2.0.2
2024-01-14 21:14:53.628 | INFO     | xtts_webui:<module>:60 - this dir: H:\XTTS\xtts-webui
[2024-01-14 21:15:04,963] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-14 21:15:05,143] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
[2024-01-14 21:15:05,325] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed info: version=0.11.2+unknown, git-hash=unknown, git-branch=unknown
[2024-01-14 21:15:05,326] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter replace_method is deprecated. This parameter is no longer needed, please remove from your call to DeepSpeed-inference
[2024-01-14 21:15:05,326] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[2024-01-14 21:15:05,327] [INFO] [logging.py:96:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
[2024-01-14 21:15:05,520] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed-Inference config: {'layer_id': 0, 'hidden_size': 1024, 'intermediate_size': 4096, 'heads': 16, 'num_hidden_layers': -1, 'dtype': torch.float32, 'pre_layer_norm': True, 'norm_type': <NormType.LayerNorm: 1>, 'local_rank': -1, 'stochastic_mode': False, 'epsilon': 1e-05, 'mp_size': 1, 'scale_attention': True, 'triangular_masking': True, 'local_attention': False, 'window_size': 1, 'rotary_dim': -1, 'rotate_half': False, 'rotate_every_two': True, 'return_tuple': True, 'mlp_after_attn': True, 'mlp_act_func_type': <ActivationFuncType.GELU: 1>, 'specialized_mode': False, 'training_mp_size': 1, 'bigscience_bloom': False, 'max_out_tokens': 1024, 'min_out_tokens': 1, 'scale_attn_by_inverse_layer_idx': False, 'enable_qkv_quantization': False, 'use_mup': False, 'return_single_tuple': False, 'set_empty_params': False, 'transposed_mode': False, 'use_triton': False, 'triton_autotune': False, 'num_kv': -1, 'rope_theta': 10000}
2024-01-14 21:15:05.974 | INFO     | scripts.tts_funcs:load_model:99 - Pre-create latents for all current speakers
2024-01-14 21:15:05.975 | INFO     | scripts.tts_funcs:get_or_create_latents:173 - creating latents for calm_female: speakers/calm_female.wav
H:\XTTS\xtts-webui\venv\lib\site-packages\torchaudio\functional\functional.py:147: UserWarning: Specified kernel cache directory could not be created! This disables kernel caching. Specified directory is temp/torch/kernels. This warning will appear only once per process. (Triggered internally at ..\aten\src\ATen\native\cuda\jit_utils.cpp:1444.)
  return spec_f.abs().pow(power)
2024-01-14 21:15:07.326 | INFO     | scripts.tts_funcs:get_or_create_latents:173 - creating latents for davidatt: speakers/davidatt.wav
2024-01-14 21:15:07.456 | INFO     | scripts.tts_funcs:get_or_create_latents:173 - creating latents for female: speakers/female.wav
2024-01-14 21:15:07.544 | INFO     | scripts.tts_funcs:get_or_create_latents:173 - creating latents for male: speakers/male.wav
2024-01-14 21:15:07.623 | INFO     | scripts.tts_funcs:get_or_create_latents:173 - creating latents for MorganFreeman: speakers/MorganFreeman.wav
2024-01-14 21:15:07.721 | INFO     | scripts.tts_funcs:get_or_create_latents:173 - creating latents for NarratorNew: ['speakers/NarratorNew\\reference.wav']
2024-01-14 21:15:07.790 | INFO     | scripts.tts_funcs:create_latents_for_all:187 - Latents created for all 6 speakers.
2024-01-14 21:15:07.790 | INFO     | scripts.tts_funcs:load_model:103 - Model successfully loaded
Running on local URL:  http://127.0.0.1:8010

To create a public link, set `share=True` in `launch()`.
2024-01-14 21:17:32.187 | INFO     | scripts.tts_funcs:unload_model:79 - Model unloaded
Warning, existing language does not match target language. Updated lang.txt with target language.
Loading Whisper Model!
The sum of the duration of the audios that you provided should be at least 2 minutes!
finetuned_models\commentator
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\queueing.py", line 489, in call_prediction
    output = await route_utils.call_process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1561, in process_api
    result = await self.call_function(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1179, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "H:\XTTS\xtts-webui\modules\train.py", line 239, in train_xtts_model
    shutil.copytree(str(ready_folder), str(
  File "C:\Program Files\Python310\lib\shutil.py", line 556, in copytree
    with os.scandir(src) as itr:
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'finetuned_models\\commentator\\ready'
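The FileNotFoundError above comes from calling `shutil.copytree` on a `ready` folder that was never created, presumably because training aborted earlier. A defensive guard along these lines would turn the crash into a clear message; this is only a sketch, and the function name and return convention are hypothetical, not the project's actual code:

```python
import shutil
from pathlib import Path

def copy_ready_folder(ready_folder: Path, destination: Path) -> bool:
    """Copy the fine-tune 'ready' folder if it exists; return False otherwise."""
    if not ready_folder.is_dir():
        # Training likely failed before the 'ready' folder was produced.
        print(f"Skipping copy: {ready_folder} does not exist")
        return False
    # dirs_exist_ok lets a re-run overwrite a previous partial copy.
    shutil.copytree(str(ready_folder), str(destination), dirs_exist_ok=True)
    return True
```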
2024-01-14 21:23:12.341 | INFO     | scripts.tts_funcs:unload_model:79 - Model unloaded
Existing language matches target language
Loading Whisper Model!
Existing training metadata found and loaded.
Existing evaluation metadata found and loaded.
Dataset Processed!
finetuned_models\commentator
>> DVAE weights restored from: H:\XTTS\xtts-webui\models\v2.0.2\dvae.pth
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\modules\train.py", line 87, in train_model
    speaker_xtts_path, config_path, original_xtts_checkpoint, vocab_file, exp_path, speaker_wav = train_gpt(
  File "H:\XTTS\xtts-webui\scripts\utils\gpt_train.py", line 175, in train_gpt
    train_samples, eval_samples = load_tts_samples(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\TTS\tts\datasets\__init__.py", line 121, in load_tts_samples
    assert len(meta_data_train) > 0, f" [!] No training samples found in {root_path}/{meta_file_train}"
AssertionError:  [!] No training samples found in finetuned_models\commentator\dataset/metadata_train.csv
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\queueing.py", line 489, in call_prediction
    output = await route_utils.call_process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1561, in process_api
    result = await self.call_function(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1179, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "H:\XTTS\xtts-webui\modules\train.py", line 245, in train_xtts_model
    shutil.copy(speaker_reference_path, reference_destination)
  File "C:\Program Files\Python310\lib\shutil.py", line 417, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "C:\Program Files\Python310\lib\shutil.py", line 254, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'finetuned_models\\commentator\\ready\\reference.wav'
2024-01-14 21:24:54.746 | INFO     | scripts.tts_funcs:unload_model:79 - Model unloaded
Existing language matches target language
Loading Whisper Model!
Existing training metadata found and loaded.
Existing evaluation metadata found and loaded.
Dataset Processed!
finetuned_models\commentator
>> DVAE weights restored from: H:\XTTS\xtts-webui\models\v2.0.2\dvae.pth
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\modules\train.py", line 87, in train_model
    speaker_xtts_path, config_path, original_xtts_checkpoint, vocab_file, exp_path, speaker_wav = train_gpt(
  File "H:\XTTS\xtts-webui\scripts\utils\gpt_train.py", line 175, in train_gpt
    train_samples, eval_samples = load_tts_samples(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\TTS\tts\datasets\__init__.py", line 121, in load_tts_samples
    assert len(meta_data_train) > 0, f" [!] No training samples found in {root_path}/{meta_file_train}"
AssertionError:  [!] No training samples found in finetuned_models\commentator\dataset/metadata_train.csv
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\queueing.py", line 489, in call_prediction
    output = await route_utils.call_process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1561, in process_api
    result = await self.call_function(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\blocks.py", line 1179, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\gradio\utils.py", line 678, in wrapper
    response = f(*args, **kwargs)
  File "H:\XTTS\xtts-webui\modules\train.py", line 245, in train_xtts_model
    shutil.copy(speaker_reference_path, reference_destination)
  File "C:\Program Files\Python310\lib\shutil.py", line 417, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "C:\Program Files\Python310\lib\shutil.py", line 254, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'finetuned_models\\commentator\\ready\\reference.wav'
2024-01-14 21:28:26.131 | INFO     | scripts.tts_funcs:unload_model:79 - Model unloaded
Existing language matches target language
Loading Whisper Model!
Existing training metadata found and loaded.
Existing evaluation metadata found and loaded.
Dataset Processed!
finetuned_models\commentator
>> DVAE weights restored from: H:\XTTS\xtts-webui\models\v2.0.2\dvae.pth
Traceback (most recent call last):
  File "H:\XTTS\xtts-webui\modules\train.py", line 87, in train_model
    speaker_xtts_path, config_path, original_xtts_checkpoint, vocab_file, exp_path, speaker_wav = train_gpt(
  File "H:\XTTS\xtts-webui\scripts\utils\gpt_train.py", line 175, in train_gpt
    train_samples, eval_samples = load_tts_samples(
  File "H:\XTTS\xtts-webui\venv\lib\site-packages\TTS\tts\datasets\__init__.py", line 121, in load_tts_samples
    assert len(meta_data_train) > 0, f" [!] No training samples found in {root_path}/{meta_file_train}"
AssertionError:  [!] No training samples found in finetuned_models\commentator\dataset/metadata_train.csv

Txt batch processing issues (text2voice)

I don't know if it's something on my end, but since I started using xtts-webui I have never been able to use that option. I get this error:

(screenshots of the UI error and the terminal error were attached)

Otherwise, great work with this tool. I've been following some TTS projects and it's great to see the results so far. Let me know if I can provide any extra details about my issue. Thanks.

Enhancement Suggestions

First, thank you for always working on this project, and sorry for bothering you again.

I wanted to suggest new enhancements.

The first one is for Whisper translation: could it also align, automatically syncing the newly created translated audio to the original voice segments, so it could be used as an auto-dubbing tool?

Secondly, "Add the ability to customize speakers when batch processing" is already on the to-do list. Would adding simple command prompts inside the default input text window (not batch process) be possible? Like giving speaker or advanced setting prompts before lines:

{Adam, temp:0.75} How are you?
{Daniel, temp:0.5} Fine.

So, a kind of live batch process without creating different text files. This would be a wonderful QoL upgrade. Yes, we can do it by manually splitting every paragraph into different text files, but it would be much easier to add {speaker} before the required parts.
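The suggested inline syntax could be parsed with a small pattern. Below is a purely hypothetical sketch (none of these names exist in xtts-webui) that splits an optional `{Speaker, temp:0.75}` prefix off each line, falling back to defaults when no prefix is present:

```python
import re

# Hypothetical parser for the suggested "{Speaker, temp:0.75} text" syntax.
LINE_RE = re.compile(
    r"^\{(?P<speaker>[^,}]+)(?:,\s*temp:(?P<temp>[\d.]+))?\}\s*(?P<text>.*)$"
)

def parse_line(line, default_speaker="default", default_temp=0.75):
    """Return (speaker, temperature, text) for one input line."""
    m = LINE_RE.match(line.strip())
    if not m:  # no prompt prefix: keep the line as-is with defaults
        return default_speaker, default_temp, line.strip()
    temp = float(m.group("temp")) if m.group("temp") else default_temp
    return m.group("speaker").strip(), temp, m.group("text")

print(parse_line("{Adam, temp:0.75} How are you?"))  # ('Adam', 0.75, 'How are you?')
print(parse_line("{Daniel, temp:0.5} Fine."))
print(parse_line("A plain line with no prompt."))
```

The same regex approach could be extended to the `{0.5s}` silence and `{split}` tokens suggested below, treated as control lines rather than speaker prefixes.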

It would also be great to have these:
- The ability to add silences with prompts like {0.5s},
- The ability to split the output by prompts in the input text window, like {split},
- A post-process audio edit page to merge batch parts, with settings like silence generation.

Thank you so much for your great work!

Failed to load PyTorch C extensions

I am getting this error when I start "start_xtts_webui.bat":

Traceback (most recent call last):
  File "C:\Users\jadso\xtts-webui\app.py", line 67, in <module>
    from xtts_webui import demo
  File "C:\Users\jadso\xtts-webui\xtts_webui.py", line 3, in <module>
    from scripts.tts_funcs import TTSWrapper
  File "C:\Users\jadso\xtts-webui\scripts\tts_funcs.py", line 3, in <module>
    import torch
  File "C:\Users\jadso\xtts-webui\venv\lib\site-packages\torch\__init__.py", line 451, in <module>
    raise ImportError(textwrap.dedent('''
ImportError: Failed to load PyTorch C extensions:
    It appears that PyTorch has loaded the `torch/_C` folder
    of the PyTorch repository rather than the C extensions which
    are expected in the `torch._C` namespace. This can occur when
    using the `install` workflow. e.g.
        $ python setup.py install && python -c "import torch"

    This error can generally be solved using the `develop` workflow
        $ python setup.py develop && python -c "import torch"  # This should succeed
    or by running Python from a different directory.

I tried reinstalling PyTorch, CUDA, everything, but it didn't work.
What am I doing wrong?

RVC wrong path

The code provides an incorrect path for RVC.
The file I received is named 'output/bezi_(1)_bezi.wav', but RVC expects the name 'output/bezi_bezi_1.wav'.
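Until the path handling is fixed, a rename shim could translate between the two naming schemes. This is a hypothetical sketch built only from the two filenames quoted above — the `<base>_(<n>)_<speaker>.wav` pattern is an assumption, and this helper is not part of xtts-webui:

```python
import re

# Assumed TTS output pattern: "<base>_(<n>)_<speaker>.wav"
# Assumed RVC expectation:    "<base>_<speaker>_<n>.wav"
def tts_name_to_rvc_name(filename):
    m = re.match(r"^(?P<base>.+)_\((?P<n>\d+)\)_(?P<speaker>.+)\.wav$", filename)
    if not m:
        return filename  # leave unrecognized names untouched
    return f"{m.group('base')}_{m.group('speaker')}_{m.group('n')}.wav"

print(tts_name_to_rvc_name("bezi_(1)_bezi.wav"))  # bezi_bezi_1.wav
```

The same pattern would also cover the "output_(1)_female.wav" vs "female_Scarlett_1.wav"-style mismatch reported in the RVC issue further down, assuming the naming convention holds.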
(screenshot attached: 2024-01-04 21:18:02)

Suggestion for batch input

Well, an option to batch-input lines of text instead of a bunch of txt files is necessary, and way more convenient in my opinion.
For example:
line 1 -> 1.wav
line 2 -> 2.wav
....
Oh, and if we could assign a voice name to each line/output, that would be great!
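The suggested behavior could be sketched as follows. This is hypothetical (the function name and the "voice:" prefix convention are made up, not xtts-webui code): split the input box into lines, read an optional per-line voice prefix, and map line N to N.wav:

```python
def plan_line_batch(text, default_voice="default"):
    """Map each non-empty input line to (filename, voice, text)."""
    jobs = []
    for i, line in enumerate(l for l in text.splitlines() if l.strip()):
        voice, _, rest = line.partition(":")
        if rest.strip():  # "Adam: hello" style voice assignment
            # Naive heuristic: a plain line containing ':' would also be
            # read as a voice prefix, so a real implementation would need
            # a stricter marker.
            jobs.append((f"{i + 1}.wav", voice.strip(), rest.strip()))
        else:  # plain line, default voice
            jobs.append((f"{i + 1}.wav", default_voice, line.strip()))
    return jobs

print(plan_line_batch("Adam: Hello there.\nSecond line."))
```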

Anyway, thank you so much for creating this <3

Can't run the webui

Installation went without errors, but when I try to run the webui:

Traceback (most recent call last):
  File "C:\Users\Paul\Documents\Coding\xtts\xtts-webui\app.py", line 67, in <module>
    from xtts_webui import demo
  File "C:\Users\Paul\Documents\Coding\xtts\xtts-webui\xtts_webui.py", line 3, in <module>
    from scripts.tts_funcs import TTSWrapper
  File "C:\Users\Paul\Documents\Coding\xtts\xtts-webui\scripts\tts_funcs.py", line 3, in <module>
    import torch
  File "C:\Users\Paul\Documents\Coding\xtts\xtts-webui\venv\Lib\site-packages\torch\__init__.py", line 139, in <module>
    raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\Paul\Documents\Coding\xtts\xtts-webui\venv\Lib\site-packages\torch\lib\torch_python.dll" or one of its dependencies

Different UI depending on whether I launch from the virtual environment or from the bat file

I ran into a weird issue.
Basically, there are no RVC options or functionality if I launch app.py from the virtual environment rather than from the bat file.
The app.py launched by the .bat does have RVC options/functionality.

I made sure to install all the requirements and PyTorch in the virtual environment.

This is how the UI looks launched from the virtual environment (no RVC tab):

(screenshot 1)

This is how it looks launched with the .bat file (RVC tab appears):

(screenshot 2)

What could be the issue?

Cannot load webui via Colab

Error in the last step:

shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
/bin/bash: line 1: lt.log: No such file or directory

I have tried a few times. Can anyone help?

Fine-tuning Error - FileNotFoundError: [Errno 2] No such file or directory

Firstly, thank you very much for the project; I'm learning a lot from it.

I'm trying to run the fine-tuning; however, I'm receiving the following error:

Train csv and eval csv already exists
finetuned_models\finetune_primo

>> DVAE weights restored from: C:\Users\wylken\Documents\Projetos\xtts-webui\models\v2.0.2\dvae.pth
Traceback (most recent call last):
  File "C:\Users\wylken\Documents\Projetos\xtts-webui\modules\train.py", line 87, in train_model
    speaker_xtts_path, config_path, original_xtts_checkpoint, vocab_file, exp_path, speaker_wav = train_gpt(
                                                                                                  ^^^^^^^^^^
  File "C:\Users\wylken\Documents\Projetos\xtts-webui\scripts\utils\gpt_train.py", line 175, in train_gpt
    train_samples, eval_samples = load_tts_samples(
                                  ^^^^^^^^^^^^^^^^^
  File "C:\Users\wylken\Documents\Projetos\xtts-webui\venv\Lib\site-packages\TTS\tts\datasets\__init__.py", line 120, in load_tts_samples
    meta_data_train = formatter(root_path, meta_file_train, ignored_speakers=ignored_speakers)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wylken\Documents\Projetos\xtts-webui\venv\Lib\site-packages\TTS\tts\datasets\formatters.py", line 59, in coqui
    with open(filepath, "r", encoding="utf8") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'finetuned_models\\finetune_primo\\dataset\\metadata_train.csv'
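Both this FileNotFoundError and the "No training samples found" assertion seen elsewhere in this dump come down to the dataset metadata CSV being missing or empty. A hedged preflight sketch (the helper name is made up, and it assumes coqui's pipe-separated metadata layout with a header row) could surface the problem before training starts:

```python
import csv
import os

def check_metadata(dataset_dir, name="metadata_train.csv"):
    """Report whether the dataset metadata CSV exists and has samples.

    Hypothetical helper, not part of xtts-webui. Assumes the coqui
    formatter's pipe-separated layout: "audio_file|text|speaker_name".
    """
    path = os.path.join(dataset_dir, name)
    if not os.path.isfile(path):
        return f"missing: {path}"
    with open(path, newline="", encoding="utf8") as f:
        rows = [r for r in csv.reader(f, delimiter="|") if r]
    if len(rows) <= 1:  # header only, or completely empty
        return f"empty: {path}"
    return f"ok: {len(rows) - 1} samples"
```

Running this against `finetuned_models/<name>/dataset` before launching training would distinguish "dataset was never created" from "dataset was created but Whisper produced no usable segments".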

When using the fine-tuning web-ui, I get an error when trying to process WAVs

Hello! I am trying to fine-tune XTTS-v2 on about 300 English clips, and I chose the small Whisper model. When I clicked the "Create dataset" button, I got this error:

100%|██████████████████████████████████████████████████| 484M/484M [00:58<00:00, 7.64MB/s]
Traceback (most recent call last):
  File "C:\Users\lccji\Downloads\xtts-webui-main\xtts_finetune_webui.py", line 252, in preprocess_dataset
    train_meta, eval_meta, audio_total_size = format_audio_list(audio_path, whisper_model = whisper_model, target_language=language, out_path=out_path, gradio_progress=progress)
  File "C:\Users\lccji\Downloads\xtts-webui-main\scripts\utils\formatter.py", line 111, in format_audio_list
    segments = list(segments)
  File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\faster_whisper\transcribe.py", line 941, in restore_speech_timestamps
    for segment in segments:
  File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\faster_whisper\transcribe.py", line 445, in generate_segments
    encoder_output = self.encode(segment)
  File "C:\Users\lccji\Downloads\xtts-webui-main\venv\lib\site-packages\faster_whisper\transcribe.py", line 629, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: Library cublas64_11.dll is not found or cannot be loaded

Error when using rvc

Hello,

I have an error when trying to use rvc

"FileNotFoundError: [Errno 2] No such file or directory: 'H:\xtts-webui-main\output\female_Scarlett_1.wav'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "H:\xtts-webui-main\venv\Lib\site-packages\gradio\queueing.py", line 497, in process_events
response = await self.call_prediction(awake_events, batch)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "H:\xtts-webui-main\venv\Lib\site-packages\gradio\queueing.py", line 468, in call_prediction
raise Exception(str(error) if show_error else None) from error
Exception: None"
The file name I got in the output folder when trying to generate was "output_(1)_female.wav", using the reference speaker "female".


Can't start webui

As the title says, I cannot start the webui, and honestly I'm not sure where to go from here. Here is the error:

'deepspeed' already installed.
2023-12-16 01:22:36.921 | INFO     | xtts_webui:<module>:64 - Start loading model v2.0.2
G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\pydantic\_internal\_config.py:321: UserWarning: Valid config keys have changed in V2:
* 'allow_population_by_field_name' has been renamed to 'populate_by_name'
* 'validate_all' has been renamed to 'validate_default'
  warnings.warn(message, UserWarning)
G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\pydantic\_internal\_fields.py:149: UserWarning: Field "model_persistence_threshold" has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\pydantic\_internal\_config.py:321: UserWarning: Valid config keys have changed in V2:
* 'validate_all' has been renamed to 'validate_default'
  warnings.warn(message, UserWarning)
Traceback (most recent call last):
  File "G:\More_AI\XTTS COMBO\xtts-webui\app.py", line 28, in <module>
    from xtts_webui import demo
  File "G:\More_AI\XTTS COMBO\xtts-webui\xtts_webui.py", line 66, in <module>
    XTTS.load_model(this_dir)
  File "G:\More_AI\XTTS COMBO\xtts-webui\scripts\tts_funcs.py", line 86, in load_model
    self.load_local_model(this_dir)
  File "G:\More_AI\XTTS COMBO\xtts-webui\scripts\tts_funcs.py", line 106, in load_local_model
    self.model.load_checkpoint(config,use_deepspeed=USE_DEEPSPEED, checkpoint_dir=str(checkpoint_dir))
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\TTS\tts\models\xtts.py", line 783, in load_checkpoint
    self.gpt.init_gpt_for_inference(kv_cache=self.args.kv_cache, use_deepspeed=use_deepspeed)
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\TTS\tts\layers\xtts\gpt.py", line 222, in init_gpt_for_inference
    import deepspeed
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\deepspeed\__init__.py", line 15, in <module>
    from . import module_inject
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\deepspeed\module_inject\__init__.py", line 3, in <module>
    from .replace_module import replace_transformer_layer, revert_transformer_layer, ReplaceWithTensorSlicing, GroupQuantizer, generic_injection
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\deepspeed\module_inject\replace_module.py", line 803, in <module>
    from ..pipe import PipelineModule
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\deepspeed\pipe\__init__.py", line 3, in <module>
    from ..runtime.pipe import PipelineModule, LayerSpec, TiedLayerSpec
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\deepspeed\runtime\pipe\__init__.py", line 3, in <module>
    from .module import PipelineModule, LayerSpec, TiedLayerSpec
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\deepspeed\runtime\pipe\module.py", line 16, in <module>
    from ..activation_checkpointing import checkpointing
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\deepspeed\runtime\activation_checkpointing\checkpointing.py", line 25, in <module>
    from deepspeed.runtime.config import DeepSpeedConfig
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\deepspeed\runtime\config.py", line 30, in <module>
    from ..monitor.config import get_monitor_config
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\deepspeed\monitor\config.py", line 70, in <module>
    class DeepSpeedMonitorConfig(DeepSpeedConfigModel):
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\deepspeed\monitor\config.py", line 82, in DeepSpeedMonitorConfig
    def check_enabled(cls, values):
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\pydantic\deprecated\class_validators.py", line 231, in root_validator
    return root_validator()(*__args)  # type: ignore
  File "G:\More_AI\XTTS COMBO\xtts-webui\venv\lib\site-packages\pydantic\deprecated\class_validators.py", line 237, in root_validator
    raise PydanticUserError(
pydantic.errors.PydanticUserError: If you use `@root_validator` with pre=False (the default) you MUST specify `skip_on_failure=True`. Note that `@root_validator` is deprecated and should be replaced with `@model_validator`.

For further information visit https://errors.pydantic.dev/2.5/u/root-validator-pre-skip
Press any key to continue . . .

Colab Notebook OSError

Launching the webui gives me this error:

OSError: libcudart.so.11.0: cannot open shared object file: No such file or directory

FileNotFoundError: [Errno 2] No such file or directory: '/app/xtts_api_server/models/v2.0.2/config.json'

So, I'm basically trying to deploy xtts but I get the error below:
FileNotFoundError: [Errno 2] No such file or directory: '/app/xtts_api_server/models/v2.0.2/config.json'

Here's my full log:
docker compose up

[+] Running 0/1
 ⠿ xttsapiserver Error                                                                                                                                                                   2.4s
[+] Building 751.2s (13/13) FINISHED                                                                                                                                                          
 => [internal] load build definition from Dockerfile                                                                                                                                     0.3s
 => => transferring dockerfile: 1.16kB                                                                                                                                                   0.1s
 => [internal] load .dockerignore                                                                                                                                                        0.3s
 => => transferring context: 2B                                                                                                                                                          0.1s
 => [internal] load metadata for docker.io/nvidia/cuda:12.1.0-runtime-ubuntu22.04                                                                                                        1.8s
 => [stage-0 1/8] FROM docker.io/nvidia/cuda:12.1.0-runtime-ubuntu22.04@sha256:402700b179eb764da6d60d99fe106aa16c36874f7d7fb3e122251ff6aea8b2f7                                          0.0s
 => [internal] load build context                                                                                                                                                        0.5s
 => => transferring context: 5.67MB                                                                                                                                                      0.4s
 => CACHED [stage-0 2/8] RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,rw apt-get update &&     apt-get install --no-install-recommends -y python3-dev portaudio19-dev li  0.0s
 => CACHED [stage-0 3/8] RUN --mount=type=cache,target=/root/.cache/pip,rw pip3 install virtualenv                                                                                       0.0s
 => CACHED [stage-0 4/8] RUN --mount=type=cache,target=/root/.cache/pip,rw     pip3 install --upgrade pip setuptools wheel ninja &&     pip3 install torch torchvision torchaudio xform  0.0s
 => CACHED [stage-0 5/8] RUN mkdir /app                                                                                                                                                  0.0s
 => CACHED [stage-0 6/8] WORKDIR /app                                                                                                                                                    0.0s
 => [stage-0 7/8] ADD . /app                                                                                                                                                             0.8s
 => [stage-0 8/8] RUN --mount=type=cache,target=/root/.cache/pip,rw     pip3 install /app                                                                                              741.0s
 => exporting to image                                                                                                                                                                   6.7s
 => => exporting layers                                                                                                                                                                  6.7s
 => => writing image sha256:572578d8657a5e9f6eb1232f92dd7ebbf34f3a585cddc6db08cc6f0b909f6147                                                                                             0.0s 
 => => naming to docker.io/library/xttsapiserver                                                                                                                                         0.0s 
[+] Running 2/2                                                                                                                                                                               
 ⠿ Network docker_default            Created                                                                                                                                             0.2s 
 ⠿ Container docker-xttsapiserver-1  Created                                                                                                                                             0.9s
Attaching to docker-xttsapiserver-1
docker-xttsapiserver-1  | 
docker-xttsapiserver-1  | ==========
docker-xttsapiserver-1  | == CUDA ==
docker-xttsapiserver-1  | ==========
docker-xttsapiserver-1  | 
docker-xttsapiserver-1  | CUDA Version 12.1.0
docker-xttsapiserver-1  | 
docker-xttsapiserver-1  | Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
docker-xttsapiserver-1  | 
docker-xttsapiserver-1  | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
docker-xttsapiserver-1  | By pulling and using the container, you accept the terms and conditions of this license:
docker-xttsapiserver-1  | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
docker-xttsapiserver-1  | 
docker-xttsapiserver-1  | A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
docker-xttsapiserver-1  | 
docker-xttsapiserver-1  | *************************
docker-xttsapiserver-1  | ** DEPRECATION NOTICE! **
docker-xttsapiserver-1  | *************************
docker-xttsapiserver-1  | THIS IMAGE IS DEPRECATED and is scheduled for DELETION.
docker-xttsapiserver-1  |     https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/support-policy.md
docker-xttsapiserver-1  | 
docker-xttsapiserver-1  | 2024-01-01 18:31:15.026 | INFO     | xtts_api_server.tts_funcs:create_directories:221 - Folder in the path /app/output has been created
docker-xttsapiserver-1  | 2024-01-01 18:31:15.026 | INFO     | xtts_api_server.tts_funcs:create_directories:221 - Folder in the path /app/models has been created
docker-xttsapiserver-1  | 2024-01-01 18:31:15.028 | INFO     | xtts_api_server.server:<module>:76 - Model: 'v2.0.2' starts to load,wait until it loads
docker-xttsapiserver-1  | [XTTS] Downloading config.json...
100%|██████████| 4.36k/4.36k [00:00<00:00, 4.71MiB/s]
docker-xttsapiserver-1  | [XTTS] Downloading model.pth...
100%|██████████| 1.86G/1.86G [16:11<00:00, 1.92MiB/s]   
docker-xttsapiserver-1  | [XTTS] Downloading vocab.json...
100%|██████████| 335k/335k [00:00<00:00, 1.43MiB/s]
docker-xttsapiserver-1  | Traceback (most recent call last):
docker-xttsapiserver-1  |   File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
docker-xttsapiserver-1  |     return _run_code(code, main_globals, None,
docker-xttsapiserver-1  |   File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
docker-xttsapiserver-1  |     exec(code, run_globals)
docker-xttsapiserver-1  |   File "/app/xtts_api_server/__main__.py", line 46, in <module>
docker-xttsapiserver-1  |     from xtts_api_server.server import app
docker-xttsapiserver-1  |   File "/app/xtts_api_server/server.py", line 77, in <module>
docker-xttsapiserver-1  |     XTTS.load_model()
docker-xttsapiserver-1  |   File "/app/xtts_api_server/tts_funcs.py", line 146, in load_model
docker-xttsapiserver-1  |     self.model = TTS(model_path=checkpoint_dir,config_path=config_path).to(self.device)
docker-xttsapiserver-1  |   File "/usr/local/lib/python3.10/dist-packages/TTS/api.py", line 70, in __init__
docker-xttsapiserver-1  |     self.config = load_config(config_path) if config_path else None
docker-xttsapiserver-1  |   File "/usr/local/lib/python3.10/dist-packages/TTS/config/__init__.py", line 91, in load_config
docker-xttsapiserver-1  |     with fsspec.open(config_path, "r", encoding="utf-8") as f:
docker-xttsapiserver-1  |   File "/usr/local/lib/python3.10/dist-packages/fsspec/core.py", line 103, in __enter__
docker-xttsapiserver-1  |     f = self.fs.open(self.path, mode=mode)
docker-xttsapiserver-1  |   File "/usr/local/lib/python3.10/dist-packages/fsspec/spec.py", line 1295, in open
docker-xttsapiserver-1  |     f = self._open(
docker-xttsapiserver-1  |   File "/usr/local/lib/python3.10/dist-packages/fsspec/implementations/local.py", line 180, in _open
docker-xttsapiserver-1  |     return LocalFileOpener(path, mode, fs=self, **kwargs)
docker-xttsapiserver-1  |   File "/usr/local/lib/python3.10/dist-packages/fsspec/implementations/local.py", line 302, in __init__
docker-xttsapiserver-1  |     self._open()
docker-xttsapiserver-1  |   File "/usr/local/lib/python3.10/dist-packages/fsspec/implementations/local.py", line 307, in _open
docker-xttsapiserver-1  |     self.f = open(self.path, mode=self.mode)
docker-xttsapiserver-1  | FileNotFoundError: [Errno 2] No such file or directory: '/app/xtts_api_server/models/v2.0.2/config.json'
docker-xttsapiserver-1 exited with code 1

Error when using finetune to create a dataset.

OS: Windows 10 LTSC.
GPU: Nvidia RTX 3060 (12GB).

When I try to create a dataset with start_xtts_finetune_webui.bat I get the following error:

Existing language matches target language
Loading Whisper Model!
Traceback (most recent call last):
  File "C:\Users\PC\Desktop\xtts-webui\xtts_finetune_webui.py", line 246, in preprocess_dataset
    train_meta, eval_meta, audio_total_size = format_audio_list(audio_path, whisper_model = whisper_model, target_language=language, out_path=out_path, gradio_progress=progress)
  File "C:\Users\PC\Desktop\xtts-webui\scripts\utils\formatter.py", line 77, in format_audio_list
    asr_model = WhisperModel(whisper_model, device=device, compute_type="float16")
  File "C:\Users\PC\Desktop\xtts-webui\venv\lib\site-packages\faster_whisper\transcribe.py", line 130, in __init__
    self.model = ctranslate2.models.Whisper(
ValueError: Requested float16 compute type, but the target device or backend do not support efficient float16 computation.

What could the problem be? Does my GPU not meet the criteria or something? That makes no sense; the RTX 3060 DOES support float16.
I can clone voices and run inference, but if I try a fine-tune I get this error.
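A commonly reported workaround for this ctranslate2 message is to fall back to another compute type when float16 kernels are unavailable on the selected backend. The sketch below is illustrative only — the helper name is made up and this is not a confirmed fix for the issue above:

```python
def pick_compute_type(device, fp16_ok):
    """Choose a ctranslate2 compute type with a float32/int8 fallback.

    Hypothetical helper: `fp16_ok` stands in for whatever capability
    check the caller has (e.g. whether efficient fp16 is supported).
    """
    if device == "cuda":
        return "float16" if fp16_ok else "float32"
    return "int8"  # the usual CPU choice for faster-whisper

# Usage idea (assumes faster_whisper is installed):
# model = WhisperModel("small", device="cuda",
#                      compute_type=pick_compute_type("cuda", False))
```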

Losing whole sentences when generating the wav file

We are losing whole sentences, or the ends of sentences, when we generate the audio. At first I thought it was a training issue, but if you regenerate, you can get back the sentence it missed last time, only for it to lose another one.

Here is an example of the text we are putting into it:

Ah, a most intriguing premise! Let us journey to the lively realm of Krystara, where the kingdom of Fowlmore flourished under the rule of King Cluckington the Wise. In this realm, chickens were revered as royalty, and their subjects, humans, were mere servants.

Now, in the heart of Fowlmore, there lived a most peculiar human named Sammy. She had a deep affection for chickens, so much so that she owned a magnificent chicken coop she called "Coop Castle." Sammy could communicate with her feathered friends, understanding their pecks and clucks as if they spoke English.

One day, Sammy stumbled upon a mysterious egg in Coop Castle. As she held it gently in her hands, a voice echoed within her mind, "Sammy, your destiny lies with the power of the Squawkstone." She was bewildered but excited by this revelation.

As the days passed, Sammy discovered that the egg hatched into a peculiar chick named Chirpity McChirpface. This chick possessed an uncanny ability to rally chickens from across Fowlmore. Sammy, intrigued by this newfound power, decided to train Chirpity and his growing army of loyal feathered followers.

And so, Sammy's chicken army began to grow in size and strength, with each recruit swearing allegiance to their fearless leader. They practiced their battle cries, perfecting their formations, and learning tactics from Sammy's extensive knowledge of chicken behavior.

One fateful day, Sammy received a vision from the mysterious voice within her mind. It urged her to march her chicken army to the gates of Fowlmore Castle and demand King Cluckington's surrender. Sammy, filled with determination, rallied her troops and set off on their grand quest.

As they approached the castle gates, Sammy's chicken army was met with astonishment by the royal guards. The sight of thousands of chickens marching in perfect formation left them bewildered. The guards hastily reported this unusual event to King Cluckington, who was both amused and alarmed by the prospect of a chicken rebellion.

King Cluckington summoned his wisest advisors to discuss this unforeseen threat. After much deliberation, they devised a plan to welcome Sammy and her army into the castle courtyard for peaceful negotiations. Little did they know that Sammy's true intentions were far more comedic than catastrophic.

When Sammy entered the castle courtyard, she was greeted by King Cluckington himself, who bowed low in respect before his unexpected visitors. Sammy, with a mischievous grin, presented her demands: a single grain of corn for every chicken in her army. The court erupted in laughter as King Cluckington agreed to her terms, knowing full well that Sammy's army consisted only of chickens who loved corn.

Thus, Sammy's attempt to take over the world through her chicken army turned out to be a hilarious farce. Instead of a coup, she had secured a lifetime supply of corn for her beloved feathered friends. King Cluckington declared Sammy an honorary citizen of Fowlmore and bestowed upon her the title of "Chicken Whisperer."

From that day forth, Sammy continued to live peacefully in Fowlmore, sharing her unique bond with chickens and spreading laughter and joy wherever she went.

Would you like to hear more about Sammy's adventures with her chicken army or perhaps a different tale altogether?

In the first gen it missed: "Sammy, your destiny lies with the power of the Squawkstone." She was bewildered but excited by this revelation.

In the second gen it missed: King Cluckington declared Sammy an honorary citizen of Fowlmore and bestowed upon her the title of "Chicken Whisperer."

RVC not working

I keep getting this weird error despite having my model in a folder inside the v2v RVC folder.

Traceback (most recent call last):
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio\queueing.py", line 489, in call_prediction
    output = await route_utils.call_process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio\blocks.py", line 1570, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio\blocks.py", line 1456, in postprocess_data
    outputs_cached = processing_utils.move_files_to_cache(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio\processing_utils.py", line 265, in move_files_to_cache
    return client_utils.traverse(data, _move_to_cache, client_utils.is_file_obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio_client\utils.py", line 923, in traverse
    new_obj[key] = traverse(value, func, is_root)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio_client\utils.py", line 919, in traverse
    return func(json_obj)
           ^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio\processing_utils.py", line 257, in _move_to_cache
    temp_file_path = move_resource_to_block_cache(payload.path, block)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio\processing_utils.py", line 234, in move_resource_to_block_cache
    return block.move_resource_to_block_cache(url_or_file_path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio\blocks.py", line 258, in move_resource_to_block_cache
    temp_file_path = processing_utils.save_file_to_cache(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio\processing_utils.py", line 170, in save_file_to_cache
    temp_dir = hash_file(file_path)
               ^^^^^^^^^^^^^^^^^^^^
  File "C:\aistuffsies\xttsgui\xtts-webui\venv\Lib\site-packages\gradio\processing_utils.py", line 102, in hash_file
    with open(file_path, "rb") as f:
         ^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\aistuffsies\\xttsgui\\xtts-webui\\output\\rvc_mitest_20240213_183208.wav'

More Information - Issue about file names, appearance of playable .wav in Gradio

So the image of the soundwave and the playable .wav itself don't show up, and I wasn't able to fix that. But I did make some alterations to the naming system. You can add the UUID information back in if you need it, but for my personal use I found this format better: there's an extra input field that lets you index your sound files how you want, which is really important for stitching many files together. I also added a versioning count, because often the way something is written needs to be edited again and again before the spoken version sounds right.

Part 1 edited:

#gen1
def generate_audio(text, languages, speaker_value_text, speaker_path_text, additional_text, temperature, length_penalty, repetition_penalty, top_k, top_p, speed, sentence_split):

    # Use the uploaded reference wav when "reference" is selected,
    # otherwise fall back to the named speaker sample
    if speaker_path_text and speaker_value_text == "reference":
        ref_speaker_wav = speaker_path_text
    else:
        ref_speaker_wav = speaker_value_text

    lang_code = reversed_supported_languages[languages]
    options = {
        "temperature": temperature,
        "length_penalty": length_penalty,
        "repetition_penalty": float(repetition_penalty),
        "top_k": top_k,
        "top_p": top_p,
        "speed": speed,
        "sentence_split": sentence_split,
    }

    # Check if the file already exists; if yes, bump the version count in the filename
    count = 1
    output_file_path = f"{additional_text}_({count})_{speaker_value_text}.wav"
    while os.path.exists(os.path.join("output", output_file_path)):
        count += 1
        output_file_path = f"{additional_text}_({count})_{speaker_value_text}.wav"

    # Perform TTS and save to the generated filename
    output_file = XTTS.process_tts_to_file(text, lang_code, ref_speaker_wav, options, output_file_path)

    return gr.make_waveform(audio=output_file), output_file

Part 2 edited:

    with gr.Column():
        video_gr = gr.Video(label="Waveform Visual",interactive=False)
        audio_gr = gr.Audio(label="Synthesised Audio",interactive=False, autoplay=False)
        generate_btn = gr.Button(value="Generate",size="lg")
        
        additional_text_input = gr.Textbox(label="File Name Value", value="user_input")

        generate_btn.click(
            fn=generate_audio,
            inputs=[text, languages, speaker_value_text, speaker_path_text, additional_text_input, temperature, length_penalty, repetition_penalty, top_k, top_p, speed, sentence_split],
            outputs=[video_gr, audio_gr]
        )

For those having issues with getting their fine tunes to develop/load in the xtts_finetune_webui.py

I've found a solution.
Use the Colab version to produce your fine-tuned model (search YouTube for "xtts fine tuning"; the Colab instructions are very simple).
Once you've downloaded the complete fine-tuned model from Colab via Google Drive:
Take your 2.0.2 base model files, cut them (Ctrl+X), and paste them into a safe directory within the main project folder, labeled something obvious so you don't lose track of them in case you need them again, which you probably will at some point.

Then take the three fine-tune files from Google Drive (config, model, and vocab) and use them to replace the original base model files in the base model directory, i.e. wherever your 2.0.2 (or whichever base model version you use) lives.

You can now run fine-tuned models from the original xtts_webui.py, which will quickly produce crisp, high-quality voices for a specific voice (or two, etc.).

You're welcome!
/closed

Can't train with own dataset

Using my own audio and metadata files, it just says "model done training" as soon as I click Train. I tried filling everything in manually as described in the script, but it still won't train. I feel like I must be missing something very obvious here; I would greatly appreciate some help. Thank you.

Big Files

Hello, I have hundreds of hours of audio files, and it takes a long time to upload them. Wouldn't it be better if we could point the tool at a folder instead?

Not an issue, a question on finetuning and creating models based on non-humanoid voices.

Love how you have this set up and I'm not sure where I should be asking this and if you could direct me to a better place that would be awesome.

My question is: how would you create a model based on a robotic voice? Say GLaDOS from Portal 2, or Curie from Fallout 4. There are many samples of "robotic" voices with seriously top-notch sound effects processed into them that would be fantastic to clone.
I tried one such voice, but the model essentially reverse-engineered it and stripped the speech effects off; it even kept her German accent, which was astonishing.
Is there a way to do this?
