it sounds great but it is unbearable to listen to as it pauses for a very long time at

TTS model sweetbbak-amy en_GB has very long 5+ second pauses after every sentence. (sherpa-onnx issue, 19 comments, closed)

sweetbbak commented on June 19, 2024

TTS model sweetbbak-amy en_GB has very long 5+ second pauses after every sentence.

from sherpa-onnx.

Comments (19)

sweetbbak commented on June 19, 2024

Is there any way that I can personally fix this?

csukuangfj commented on June 19, 2024

Could you tell us how we can reproduce it?

I have tested it at
https://huggingface.co/spaces/k2-fsa/text-to-speech

Everything is working fine.

sweetbbak commented on June 19, 2024

It only happens with the Android APK version. To reproduce, install the Android APK that I linked above, set it as the default text-to-speech engine, and then synthesize speech on the Android device. I'll post a video of it.

csukuangfj commented on June 19, 2024

By the way, what is the CPU on your Android phone?

sweetbbak commented on June 19, 2024

Pixel 8 Pro Tensor G3 processor

csukuangfj commented on June 19, 2024

Does only this model on your phone have such pauses? Do other models work well on your phone?

sweetbbak commented on June 19, 2024

video of my model:
https://github.com/k2-fsa/sherpa-onnx/assets/75046310/7d35659d-106d-4520-a1ff-c79d058525dd

screen-20240512-180601.2.mp4

sweetbbak commented on June 19, 2024

And here it does not happen with the GLaDOS model and other models:

screen-20240512-181649.mp4

csukuangfj commented on June 19, 2024

The sweetbbak-amy en_GB model is twice as large in file size as the other models.

In other words, this model is so large that it takes a lot of time to synthesize a sentence on your phone.
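One common way to put a number on this is the real-time factor (RTF): the time spent synthesizing divided by the duration of the audio produced. The sketch below is only an illustration. The function name and the figures are assumptions, not measurements from this thread. An RTF above 1 means synthesis is slower than playback, which could show up as a pause before each sentence.

```python
def real_time_factor(synthesis_seconds, num_samples, sample_rate):
    """Return synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


# Hypothetical numbers, for illustration only (not taken from this issue):
# a sentence that yields 3 seconds of audio at 22050 Hz and takes
# 6 seconds to synthesize on the device.
rtf = real_time_factor(6.0, 3 * 22050, 22050)
print(f"RTF = {rtf:.2f}")  # prints "RTF = 2.00"
```

Timing a few sentences with each model on the same phone and comparing the resulting RTF values would be one way to check whether model size explains the pauses.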

sweetbbak commented on June 19, 2024

I see, thanks for your help. I'm training a new one right now that's going to be the smallest size; hopefully I can use that instead. Maybe I can submit that instead or potentially get some advice on building the APKs? I read the docs not that long ago and I am beyond lost; it's a little over my head.

csukuangfj commented on June 19, 2024

"Maybe I can submit that instead or potentially get some advice on building the APKs?"

Both are fine to me.

If you want to build an APK by yourself, you can follow our doc at
https://k2-fsa.github.io/sherpa/onnx/android/build-sherpa-onnx.html

Or you can open-source your onnx model and we can build the APK and make it public.

sweetbbak commented on June 19, 2024

Thank you! I'm going to open-source it for sure, but I will try to build it myself, or I'll @ you in this thread or the thread you have in piper tts if that is okay.

sweetbbak commented on June 19, 2024

Actually, I have one question. I'm on the last step of building an APK out of a Piper model, but I'm lost at this step. What type of model is piper-tts supposed to be in this context?

OnlineRecognizer.kt

        14 -> {
            val modelDir = "vits-piper-en_GB-sweetbbak-amy"
            return OnlineModelConfig(
                neMoCtc = OnlineNeMoCtcModelConfig(
                    model = "$modelDir/en_GB-sweetbbak-amy.onnx",
                    // model = "$modelDir/model.onnx",
                ),
                tokens = "$modelDir/tokens.txt",
            )
        }

and the hint in the wiki is:

If you select a different pre-trained model, make sure that you also change the corresponding code listed in the following screen shot:

but I can't find any information on what the piper models are supposed to be in the pre-trained models list

csukuangfj commented on June 19, 2024

"What type of model is piper-tts supposed to be in this context?"

You have selected the wrong file.

You should use
https://github.com/k2-fsa/sherpa-onnx/blob/master/android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/Tts.kt

Please see
https://github.com/k2-fsa/sherpa-onnx/blob/master/android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/TtsEngine.kt

sherpa-onnx/android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/TtsEngine.kt
Lines 72 to 78 in 384f96c:

        // Example 2:
        // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models
        // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2
        // modelDir = "vits-piper-en_US-amy-low"
        // modelName = "en_US-amy-low.onnx"
        // dataDir = "vits-piper-en_US-amy-low/espeak-ng-data"
        // lang = "eng"

The examples listed above should be straightforward to follow.

csukuangfj commented on June 19, 2024

Make sure you are using the latest master branch.

csukuangfj commented on June 19, 2024

By the way, are the models in
https://github.com/sweetbbak/Neural-Amy-TTS/tree/main/models
trained with piper?

If yes, I think they should be usable directly in sherpa-onnx.

sweetbbak commented on June 19, 2024

Yeah, they are all trained with Piper. I just finished one up and I wanted to test them to see what parameters give a good mix of quality and speed on Android. I'm on commit #872 (939fdd9).

So, I can build SherpaOnnxTtsEngine, but I can't get the TTS engine to switch to Kaldi on my phone.

I built the arm64 libs and copied them into jniLibs in the SherpaOnnxTtsEngine Android Studio project. I inserted my model (which I named the same as the other model for convenience), downloaded the espeak-ng-data, and copied in the tokens.txt file. I also installed onnxruntime with Python, ran the Python script in the model directory, and added my model to TtsEngine.kt like this:

        modelDir = "vits-piper-en_GB-sweetbbak-amy"
        modelName = "en_GB-sweetbbak-amy.onnx"
        dataDir = "vits-piper-en_GB-sweetbbak-amy/espeak-ng-data"
        lang = "eng"
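Before building, a quick check that the files named by these values are actually in place can rule out a packaging mistake. This is a minimal sketch, not part of sherpa-onnx: the helper name missing_assets is made up, and it assumes the model directory sits under the app's assets folder (app/src/main/assets is an assumption here).

```python
import tempfile
from pathlib import Path


def missing_assets(assets_root, model_dir, model_name, data_dir):
    """Return the expected paths (relative to assets_root) that do not exist."""
    root = Path(assets_root)
    expected = [
        root / model_dir / model_name,    # the .onnx model file
        root / model_dir / "tokens.txt",  # token table generated from the .json config
        root / data_dir,                  # the espeak-ng-data directory
    ]
    return [p.relative_to(root).as_posix() for p in expected if not p.exists()]


if __name__ == "__main__":
    # Demo on a throwaway directory that deliberately lacks tokens.txt.
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = "vits-piper-en_GB-sweetbbak-amy"
        data_dir = f"{model_dir}/espeak-ng-data"
        (Path(tmp) / data_dir).mkdir(parents=True)
        (Path(tmp) / model_dir / "en_GB-sweetbbak-amy.onnx").write_bytes(b"")
        print(missing_assets(tmp, model_dir, "en_GB-sweetbbak-amy.onnx", data_dir))
```

In a real project the first argument would point at the assets folder of the Android Studio project rather than a temporary directory.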

sweetbbak commented on June 19, 2024

Never mind, it randomly started working after trying a few times. Must be some weird bug on my phone's end. I appreciate the help. Also, last question: where does the tokens.txt come from, and is it generally safe to just re-use it for every English model, or is there some process for converting *.onnx.json into tokens.txt?

csukuangfj commented on June 19, 2024

Please see https://k2-fsa.github.io/sherpa/onnx/tts/piper.html
for how tokens.txt is generated.

def generate_tokens(config):
    id_map = config["phoneme_id_map"]
    with open("tokens.txt", "w", encoding="utf-8") as f:
        for s, i in id_map.items():
            f.write(f"{s} {i[0]}\n")
    print("Generated tokens.txt")

"is it generally safe to just re-use it for every english model"

Please always regenerate it with your .json file.
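For illustration, here is a self-contained variant of the generate_tokens function above that reads the model's .onnx.json config from disk. It is a sketch under assumptions: the helper names and the tiny phoneme_id_map in the demo are made up, and a real Piper config contains many more symbols.

```python
import json
import tempfile
from pathlib import Path


def token_lines(config):
    """Build one "symbol id" line per phoneme_id_map entry, using the first id."""
    return [f"{symbol} {ids[0]}" for symbol, ids in config["phoneme_id_map"].items()]


def write_tokens(config_path, tokens_path):
    """Read a Piper-style JSON config and write tokens.txt; return the token count."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    lines = token_lines(config)
    Path(tokens_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


if __name__ == "__main__":
    # Made-up miniature config, only to show the file format.
    demo = {"phoneme_id_map": {"_": [0], "^": [1], "$": [2], "a": [14]}}
    with tempfile.TemporaryDirectory() as tmp:
        config_path = Path(tmp) / "demo.onnx.json"
        config_path.write_text(json.dumps(demo), encoding="utf-8")
        tokens_path = Path(tmp) / "tokens.txt"
        count = write_tokens(config_path, tokens_path)
        print(count, "tokens written")
        print(tokens_path.read_text(encoding="utf-8"), end="")
```

Because the ids come straight from each model's own phoneme_id_map, two models trained with different symbol tables would produce different tokens.txt files, which is why regenerating it per model is the safer habit.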
