vlomme / multi-tacotron-voice-cloning
Phoneme multilingual (Russian-English) voice cloning based on
Home Page: https://github.com/CorentinJ/Real-Time-Voice-Cloning
License: Other
OSError: [WinError 127] The specified procedure could not be found. Error loading "C:\Users\admin\PycharmProjects\pythonProject\venv\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.
I have tried various methods and approaches, but nothing helps... I don't even know what to do.
P.S. tensorflow-gpu is only supported up to Python 3.7; anything newer reports that the required version could not be found.
Has anyone run into this?
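One thing worth checking first: whether the installed wheel is a CUDA build at all, since a CUDA build of torch on a machine without a matching CUDA runtime is a common cause of WinError 127 on the caffe2 DLLs. A minimal sketch (my own helper, assuming the standard `release+local` tag format used by PyTorch wheels such as `1.7.1+cu101`):

```python
def parse_torch_build(version):
    """Split a torch version string such as '1.7.1+cu101' into the
    release and the build tag. A '+cpu' tag (or no tag) means a
    CPU-only build, which avoids the caffe2_*_gpu.dll dependency."""
    release, _, local = version.partition("+")
    return release, (local or "cpu")

# After `import torch`, inspect the installed build:
# print(parse_torch_build(torch.__version__))
```

If this reports a `cuXXX` tag that doesn't match your installed CUDA runtime, reinstalling the matching wheel (or the CPU-only one) is worth trying.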
How can I add the Arabic language?
Could not find a version that satisfies the requirement PyQt5 (from -r requirements.txt (line 13)) (from versions: )
No matching distribution found for PyQt5 (from -r requirements.txt (line 13))
Thanks for the work! Help me train an encoder. How can I add new custom voices to the training datasets, or are only the fixed ones (like LibriSpeech: train-other-500, VoxCeleb1...) available through the command interface:
python encoder_preprocess.py <datasets_root>
and
python encoder_train.py my_run <datasets_root>/SV2TTS/encoder
If this is possible, how should I organize the files: in the root data directory or in subfolders, and in what format? I tried to add my voice to a subfolder but got an error like:
"Python encoder_preprocess.py data
Arguments:
datasets_root: data
out_dir: data/SV2TTS/encoder
datasets: ['preprocess_voxforge']
skip_existing: False
Preprocessing preprocess_voxforge
Couldn't find data/book, skipping this dataset"
I looked at the source and found fixed functions that preprocess different formats of training data (like preprocess22, preprocess44...). What do they mean? Maybe I should use one of them?
Thank you.
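Since `encoder_preprocess.py` only looks for the fixed dataset layouts, one workaround (an assumption on my part, not a documented interface) is to stage your own recordings in a LibriSpeech-like `speaker/book` folder tree under `<datasets_root>`, so the stock LibriSpeech preprocessor can pick them up. A hypothetical helper — `stage_custom_voices` and the `book-001` name are my own invention:

```python
from pathlib import Path
import shutil

def stage_custom_voices(src, datasets_root, speaker="my_speaker"):
    """Copy .wav files from `src` into a LibriSpeech-style
    <datasets_root>/LibriSpeech/train-other-500/<speaker>/<book>/ tree,
    the layout the stock encoder preprocessor expects to walk."""
    dst = Path(datasets_root) / "LibriSpeech" / "train-other-500" / speaker / "book-001"
    dst.mkdir(parents=True, exist_ok=True)
    for wav in Path(src).glob("*.wav"):
        shutil.copy(wav, dst / wav.name)
    return dst
```

After staging, `python encoder_preprocess.py <datasets_root>` should treat the folder like any other LibriSpeech speaker, assuming the audio is in a sample rate the preprocessor accepts.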
Hello,
I tried using the demo with the pre-trained network.
See the attached example. The output is just pure noise.
Is there anything wrong with the voice sample I provided as input?
I tried both male and female voices.
Thanks,
orig_16.zip
Thanks a lot for the project. I'm looking for a quick solution for voice style transfer (mimicking a recorded voice from a sample). Can your synthesizer be applied to this task?
Is it possible to use WaveGlow?
And to use a pretrained model instead of training my own?
❯ python3 -m pip install -r requirements.txt
ERROR: Could not find a version that satisfies the requirement tensorflow-gpu<=1.14.0 (from -r requirements.txt (line 1)) (from versions: 2.2.0rc1, 2.2.0rc2, 2.2.0rc3, 2.2.0rc4, 2.2.0, 2.2.1, 2.3.0rc0, 2.3.0rc1, 2.3.0rc2, 2.3.0, 2.3.1)
ERROR: No matching distribution found for tensorflow-gpu<=1.14.0 (from -r requirements.txt (line 1))
❯ python --version
Python 3.8.5
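The resolution failure above is expected on Python 3.8: TensorFlow 1.x wheels were only published for CPython 3.5–3.7, so pip finds nothing matching `tensorflow-gpu<=1.14.0` and offers only 2.x versions. A small guard sketching the version check (my own helper, not part of the repo):

```python
import sys

def tf1_wheel_available(version_info=sys.version_info):
    """tensorflow-gpu 1.x published wheels only for CPython 3.5-3.7,
    so the pin in requirements.txt cannot resolve on 3.8 or newer."""
    return (3, 5) <= tuple(version_info[:2]) <= (3, 7)
```

The practical fix is to create the virtualenv with a Python 3.7 interpreter before running `pip install -r requirements.txt`.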
First of all, thank you for open-sourcing Multi-Tacotron-Voice-Cloning. I have also just started learning about natural language processing and Python programming.
- I put the software in the directory D:\SV2TTS
- I put the dataset in the directory D:\Datasets; I have D:\Datasets\book and D:\Datasets\LibriSpeech
When using the code you provided, I had some training issues:
My question: How can I fix this problem?
Thanks again for your sharing!!!
I have a problem inserting additional text to be spoken into the toolbox. The additional lines cause the vocoder to crash with an out-of-memory error. Trying both the original code from CorentinJ and the code from here, I found that activating g2p in toolbox/__init__ causes this error.
Apparently g2p holds on to resources that the other neural networks need and does not release them when it has finished its task.
Can you fix that somehow?
Thank you
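I can't confirm the internals, but if g2p really does keep its model resident on the GPU, one possible workaround (purely an assumption; `release_after_g2p` is a hypothetical helper, not part of the repo) is to drop the last reference and ask PyTorch to return its cached memory once the phoneme conversion is done:

```python
import gc

def release_after_g2p(model):
    """Drop the reference to the g2p model, collect garbage, and release
    PyTorch's cached GPU memory so the vocoder can allocate afterwards."""
    del model
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        # torch not installed; nothing cached on a GPU to release.
        pass
```

Note that `torch.cuda.empty_cache()` only frees memory the allocator has cached, so the `del` and `gc.collect()` must come first.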
Hi. Thank you.
Is it possible not to train my own Tacotron2 and instead use a pretrained model for Russian?
This for example https://github.com/alphacep/tn2-wg
Tried to run the Google Colab notebook, but there is an error. Please help.
How can I run it on a local computer?
Suppose I need to get good-sounding output for just one voice (Russian). What exactly do I do (a step-by-step guide for beginners)?
I'm interested in your assessment; maybe you could post some samples to give an idea?
I tried https://github.com/CorentinJ/Real-Time-Voice-Cloning and wasn't impressed with the result: lots of noise, and the speech comes out a couple of tones higher.
Hello, I experimented with your model, but only some recordings give a good result. On Habr you wrote that you also tried training the model for Russian only and it worked better. Do you still have that trained model? If so, could you share it?
Hi @vlomme,
Great work here, and thanks for open-sourcing it. I'm trying to understand how this works so that I can replicate it. I've gone through the code and don't see any language embedding, which I thought would be how you separate the speaker from the language.
Can you please explain how language-speaker independence is achieved?
Hi, I am doing work similar to yours; my dataset is English + Chinese.
I have tried the pretrained model offered by CorentinJ and also fine-tuned on it, but I have not achieved good results so far. I am still training the encoder model. I wonder if you have any good results to share?
Good day!
How do I start training the network FROM SCRATCH (i.e., without a pretrained model)?
First, thanks for such a complete pipeline.
Second, would you integrate, e.g., this repo for native Russian support?
Hello,
Amazing work.
I am running inference using your models on a 2080 GPU. Your example works perfectly, but when I give it a new audio clip (in English) and make it say the same Russian sentence, the output audio isn't good: there is a lot of noise, and the cloning quality is poor.
My question is:
Thanks,
S
File "...\g2p\train.py", line 4, in
from distance import levenshtein
ModuleNotFoundError: No module named 'distance'
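The `distance` module here appears to be the third-party `Distance` package on PyPI (`pip install Distance`), which the requirements evidently don't install. If installing it is not an option, a pure-Python drop-in for the one function `train.py` imports could look like this (a sketch of the standard edit-distance algorithm, not that package's implementation):

```python
def levenshtein(a, b):
    """Edit distance between two sequences, compatible with
    distance.levenshtein for plain strings: the minimum number of
    insertions, deletions, and substitutions turning `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]
```

Dropping this into a module named `distance` on the import path would satisfy `from distance import levenshtein` without the external dependency.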
Hi, it doesn't see my Russian dataset. It sees LibriSpeech, but not the Russian dataset. What should I do? Thanks.
Hello. My graphics card is a GeForce GT 630M. I really need to run the program, but PyTorch requires a minimum compute capability of 3.0, and my card has 2.1.
I specifically need the GUI to study the graphs; Colab only offers a command line.
If I install the CPU-only version, it says it is not supported.
Is there a way to disable the GPU and fall back to the CPU?
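For the CPU-fallback question: hiding the CUDA devices before torch is first imported makes `torch.cuda.is_available()` return False, and checkpoints can then be loaded with `map_location="cpu"`. A minimal sketch — whether this toolbox honors the setting everywhere is an assumption:

```python
import os

def force_cpu():
    """Hide all CUDA devices. Must run before `import torch` so that
    torch.cuda.is_available() reports False and all tensors stay on CPU."""
    os.environ["CUDA_VISIBLE_DEVICES"] = ""
    return os.environ["CUDA_VISIBLE_DEVICES"]

force_cpu()
# import torch
# state = torch.load("pretrained.pt", map_location="cpu")
```

Any code in the project that does an explicit `.cuda()` call would still fail, so those call sites may also need a device check.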
Hello, I am looking for a way to make a Japanese speaker speak English. Is it possible?
Hi,
I'm slightly confused about the G2P model. Suppose I need to train a model that goes specifically from Russian to English (only). Do I still need to add a dictionary or train the G2P model?
Also, I don't see the significance of the G2P model here. We have the synthesizer, which already seems to do the same work.
Thanks!
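For context on what G2P contributes: it maps spelling (graphemes) in either language onto one shared phoneme alphabet before the text reaches the synthesizer, which is how Russian and English inputs end up in the same symbol space. A toy illustration only — the rule table below is invented for the example and unrelated to the repo's trained model:

```python
# Invented demo rules: a few Russian letters mapped to ARPAbet-like symbols.
TOY_RULES = {"ш": "SH", "а": "AA", "р": "R"}

def toy_g2p(word):
    """Look each grapheme up in the rule table. A real G2P model learns
    these mappings (including context effects) instead of using a fixed
    one-character table like this."""
    return [TOY_RULES.get(ch, ch.upper()) for ch in word.lower()]
```

The synthesizer could in principle learn spelling directly, but a shared phoneme inventory lets both languages reuse the same acoustic units.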