
Comments (9)

DanRuta commented on July 24, 2024

Hey. For the first issue, please check that you have downloaded the "priors" data from nexusmods. It is currently not shipped with the Steam build.

stohrendorf commented on July 24, 2024

Thanks for the clarification; this gave me a few headaches. Consider the first issue to be a wish for a better error message now ;)

stohrendorf commented on July 24, 2024

Update: sorry, but downloading the data files and extracting them didn't solve the issue. I downloaded both data files and extracted them using 7-Zip, but the same error shows up with the same stack trace. This is the directory layout after extracting:

Update: Windows and 7-Zip struggled so much that (for some reason) they showed files that were not actually there.

DanRuta commented on July 24, 2024

Is it working OK now?

If not, the other issue might be the fine-tuning dataset formatting, i.e. the app cannot find the audio files. To clarify, there should be .wav files inside the "wavs" folder, and next to the "wavs" folder a metadata.csv file with pipe (|) delimited formatting. It should look the same as any of the priors datasets.

So in the app, in the training config, if "ar_priors_x" were hypothetically your custom dataset, its dataset path should be ...../resources/app/xvapitch/PRIORS/ar_priors_x
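As an illustration only (a hypothetical checker, not part of the app), assuming a pipe-delimited metadata.csv whose first field names the audio file, a quick layout sanity check could look like this:

```python
import csv
from pathlib import Path

def check_dataset(root: str) -> None:
    """Sanity-check a fine-tuning dataset: a 'wavs' folder of .wav files
    plus a pipe-delimited metadata.csv next to it."""
    base = Path(root)
    wavs = base / "wavs"
    meta = base / "metadata.csv"
    assert wavs.is_dir(), f"missing folder: {wavs}"
    assert meta.is_file(), f"missing file: {meta}"

    with open(meta, newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.reader(f, delimiter="|"), start=1):
            if len(row) < 2:
                print(f"line {i}: expected 'filename|transcript', got: {row}")
                continue
            # Assumption: the first field names the wav file (with or without extension).
            wav = wavs / row[0]
            if not (wav.exists() or wav.with_suffix(".wav").exists()):
                print(f"line {i}: referenced audio not found: {row[0]}")

# Hypothetical custom dataset path, mirroring the priors layout:
check_dataset("resources/app/xvapitch/PRIORS/ar_priors_x")
```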

stohrendorf commented on July 24, 2024

The dataset itself isn't the problem; it's about adding training configurations. I have already added copies of the dataset with cleaned-up WAVs from different folders, but when training tasks are in the list and I try to add another one, it simply doesn't appear in the list, no matter whether I re-use a dataset from another training task or select an unused one.

On a side note, the training stopped after a few hours with an OOM (system RAM, not VRAM), which was a bit surprising given that I have 64 GB.

DanRuta commented on July 24, 2024

There is currently a fair bit of data caching in the dataloader. I've removed it for the next update, but in the meantime you can reduce the number of workers in the training config, which should use less RAM.
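For illustration (assuming a standard PyTorch DataLoader under the hood; the actual trainer code may differ), the RAM effect of the worker count is roughly this: each worker is a separate process running the dataset's `__getitem__`, so any per-dataset caching is duplicated once per worker:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class WavDataset(Dataset):
    """Hypothetical stand-in for the trainer's audio dataset."""
    def __len__(self):
        return 1000

    def __getitem__(self, idx):
        # Each worker process runs __getitem__ independently, so any data
        # cached here is held once per worker, multiplying system RAM use.
        return torch.randn(16000)

# Fewer workers -> fewer duplicated caches in system RAM,
# at the cost of slower data loading.
loader = DataLoader(WavDataset(), batch_size=8, num_workers=2)
```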

For the training queue, could you please share the app.log file located next to xVATrainer.exe? It might contain an error stack indicating what's wrong.

stohrendorf commented on July 24, 2024

Hm. I can't reproduce it anymore, yet I'm certain it happened multiple times. The log file is fairly uninteresting except for this single line, and that's probably just an invalid training configuration: `[line: 702] onerror: Uncaught TypeError: trainingAddConfigCkptPathInput.replaceAll is not a function`.

About the data caching: I have moved every priors dataset except for the speaker's language to a different folder. Training still eats ~30 GB of RAM, but it no longer goes OOM. Are there consequences for excluding most of the "priors" datasets?

DanRuta commented on July 24, 2024

Just fixed that error. As for not including the priors: every priors folder contains synthetic data for a different language. This data is used during training to ensure that models you fine-tune on mono-language, mono-speaker-style data do not lose any knowledge of the other languages, nor their vocal range (useful for voice conversion and pitch/emotion/style manipulation). I recommend not messing with the priors datasets unless you choose to add MORE data to them (e.g. your own, higher-quality non-synthetic data).

I'm pushing an update through today, which makes the training consume less system RAM.

stohrendorf commented on July 24, 2024

Oh, okay, that may explain why the voice synthesis I did is so emotionless. Thanks for the explanation. I'm wondering whether re-training from scratch versus just adding the additional priors later on makes a huge difference, though.

As a side note, I created a new base dataset with essentially complete sentences of the reference voice, and the UI hint about the VRAM-to-batch-size ratio doesn't align with it: training went OOM with a batch size of 12 (I have 12 GB of VRAM). Using my voice samples and a batch size of 8, it uses about 10 GB of VRAM and around 30 GB of system RAM (although the latter is because I removed every other language from the priors). I'm just guessing here, but it seems that at certain points it needs an additional 1-2 GB just to save the checkpoints, which in turn leads to an OOM if the batch size is too large. In other words, the UI hint about the batch size seems a bit misleading.
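One hypothetical way to pick a batch size empirically rather than from a fixed ratio hint (this helper is not part of xVATrainer): run a few training-like steps and read PyTorch's peak-allocation counter, then leave 1-2 GB of headroom for spikes such as checkpoint saving:

```python
import torch

def peak_vram_gb(model, make_batch, batch_size: int, steps: int = 3) -> float:
    """Run a few optimizer steps and report peak VRAM in GB.
    `make_batch(batch_size)` is a hypothetical factory returning one
    input batch already on the GPU."""
    torch.cuda.reset_peak_memory_stats()
    opt = torch.optim.Adam(model.parameters())
    for _ in range(steps):
        opt.zero_grad()
        loss = model(make_batch(batch_size)).mean()
        loss.backward()
        opt.step()
    return torch.cuda.max_memory_allocated() / 1024**3

# Choose the largest batch size whose peak leaves ~1-2 GB free
# (e.g. for checkpoint saving), instead of assuming batch size == VRAM GB.
```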
