Hello and thank you so much for this great repo! Unfortunately, I ca

Generating speech in Russian with C# returns nonsense

Onkitova commented on July 19, 2024

Generating speech in Russian with C# returns nonsense

Comments (6)

csukuangfj commented on July 19, 2024

Sorry, I cannot access your link.

I just tested it locally on my macOS and it works perfectly.

dotnet run \
  --vits-model=./vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx \
  --vits-tokens=./vits-piper-ru_RU-irina-medium/tokens.txt \
  --vits-data-dir=./vits-piper-ru_RU-irina-medium/espeak-ng-data \
  --debug=1 \
  --output-filename=./hi.wav \
  --text="Как твои дела?"

It produces
hi.wav.txt

(Please rename it to hi.wav.)

Please make sure you use utf-8 encoding for your computer.

Please see the doc
https://k2-fsa.github.io/sherpa/onnx/tts/faq.html#how-to-enable-utf-8-on-windows

Sorry that this specific part is in Chinese. Many Chinese users have issues using the Chinese TTS models before making the changes to their computers to use UTF-8 encoding.

csukuangfj commented on July 19, 2024

The C++ code expects strings in UTF-8 encoding.

The code works fine on my macOS without any system changes. I am unsure why it causes issues on your and some other users' systems.

C# uses UTF-16 encoded strings.

The following line

sherpa-onnx/scripts/dotnet/offline.cs

Line 235 in 4f758e6

 private static extern IntPtr SherpaOnnxOfflineTtsGenerate(IntPtr handle, [MarshalAs(UnmanagedType.LPStr)] string text, int sid, float speed); 

does the conversion automagically from UTF-16 to UTF-8.

but without touching Windows settings

Sorry, I've no idea about how to fix it by changing the code (if it is indeed caused by code).

You can try reading the text from a utf-8 encoded file and see if it works.

csukuangfj commented on July 19, 2024

Could you describe which model you are using?

Onkitova commented on July 19, 2024

Could you describe which model you are using?

Of course! Let's make it clear: I am able to reproduce it with each of the 4 Russian-language models. However, for the sake of analysis, let's say it is:

vits-piper-ru_RU-irina-medium.tar.bz2

Here is how it sounds with sherpa-onnx-non-streaming-tts-x64-v1.9.23.exe.

And here is what I got with C# (I tried running it both ways: from a .bat file and directly from code).

It is also absurdly long: about 6 times longer than the correct version.

One more thing I can tell you is that the vocal output is still not totally random. I can definitely hear each of the 4 models speak the same thing, just not in Russian. It looks as if the C# sample lacks some pointer to the specific language (needed to use the related espeak-ng-data), while sherpa-onnx-non-streaming-tts-x64-v1.9.23.exe somehow manages to get it right by itself.

Onkitova commented on July 19, 2024

Sorry, I cannot access your link.

I just tested it locally on my macOS and it works perfectly.
dotnet run \
  --vits-model=./vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx \
  --vits-tokens=./vits-piper-ru_RU-irina-medium/tokens.txt \
  --vits-data-dir=./vits-piper-ru_RU-irina-medium/espeak-ng-data \
  --debug=1 \
  --output-filename=./hi.wav \
  --text="Как твои дела?"
It produces hi.wav.txt

(Please rename it to hi.wav.)

Please make sure you use utf-8 encoding for your computer.

Please see the doc https://k2-fsa.github.io/sherpa/onnx/tts/faq.html#how-to-enable-utf-8-on-windows

Sorry that this specific part is in Chinese. Many Chinese users have issues using the Chinese TTS models before making the changes to their computers to use UTF-8 encoding.

Wow, enforcing utf-8 actually helped! Thank you!

Here is the final question then: do you have any idea whether there is a way to do the same, but without touching Windows settings? I mean, somehow enforcing UTF-8 encoding from code when passing the options.text param to sherpa? Or maybe I could pass not a string but a path to a text file containing the text to be spoken, while ensuring that this specific file is UTF-8 encoded?
I am asking because sherpa-onnx-non-streaming-tts-x64-v1.9.23.exe managed to get it right without any manipulation of Windows settings.

Onkitova commented on July 19, 2024

The code works fine on my macOS without any system changes. I am unsure why it causes issues on your and some other users' systems.

I can also suppress this issue for myself if I enforce UTF-8 (codepage 65001) system-wide (following the instructions from the link you provided). Thanks for that, once again. But I also want to embed sherpa-onnx into my software to be shared with other people, and it would be incredibly awkward to ask every potential user to "go here and click that, then reboot" or to tell them "I played a little bit with your registry, so now you need to reboot before the application can finally work as intended". That's why I keep looking for a solution.

C# uses UTF-16 encoded strings.
The following line

sherpa-onnx/scripts/dotnet/offline.cs

Line 235 in 4f758e6

private static extern IntPtr SherpaOnnxOfflineTtsGenerate(IntPtr handle, [MarshalAs(UnmanagedType.LPStr)] string text, int sid, float speed);

does the conversion automagically from UTF-16 to UTF-8.

As I understand it, the problem is not in the C++ code but on the C# side, which, due to the nature of UTF-16 strings, incorrectly translates Cyrillic characters (and possibly Chinese characters too) when the system is not set to force everything to work in UTF-8 (codepage 65001).

You can try reading the text from a utf-8 encoded file and see if it works.

I tried a lot: files, encoding tricks, and so on. Unfortunately, I found no remedy here.

Could you please consider adding another variant of SherpaOnnxOfflineTtsGenerate, for example SherpaOnnxOfflineTtsGenerateFromFile, which, instead of the literal text to be voiced, would take a string argument holding the path to a text file, from which the C++ code would read the text to be voiced?
That way, I think we could get around this problem by simply using a text file with fixed UTF-8 encoding as a proxy.
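
On Windows, UnmanagedType.LPStr marshals a string through the system ANSI code page rather than UTF-8, which would be consistent with the behaviour described in this thread. A possible workaround that needs neither the system-wide setting nor a proxy file is to do the UTF-16 to UTF-8 conversion explicitly in C# and pass the resulting bytes to the native function. The following is only a minimal sketch, not the project's actual code: the library name "sherpa-onnx-c-api", the byte[] parameter in the re-declared import, and the names Utf8Marshal, ToUtf8Z and Generate are assumptions made for illustration.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class Utf8Marshal
{
    // Hypothetical re-declaration of the native entry point. The library name
    // and the byte[] parameter are assumptions made for this sketch; the
    // declaration quoted in this thread uses UnmanagedType.LPStr instead.
    [DllImport("sherpa-onnx-c-api", CallingConvention = CallingConvention.Cdecl)]
    private static extern IntPtr SherpaOnnxOfflineTtsGenerate(
        IntPtr handle, byte[] utf8Text, int sid, float speed);

    // Convert a .NET (UTF-16) string into a NUL-terminated UTF-8 buffer.
    // The result does not depend on the active Windows code page.
    public static byte[] ToUtf8Z(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        byte[] buffer = new byte[utf8.Length + 1]; // final byte stays 0
        Array.Copy(utf8, buffer, utf8.Length);
        return buffer;
    }

    // Hypothetical wrapper: encode explicitly, then call the native library.
    public static IntPtr Generate(IntPtr handle, string text, int sid, float speed)
    {
        return SherpaOnnxOfflineTtsGenerate(handle, ToUtf8Z(text), sid, speed);
    }
}
```

With this approach the bytes handed to the C++ side are the same whatever code page is active, so the user's Windows configuration should not need to change. Only ToUtf8Z can be checked without the native library; the call through Generate is untested here.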
