Comments (19)

sweetbbak commented on June 19, 2024

Is there any way that I can personally fix this?

from sherpa-onnx.

csukuangfj commented on June 19, 2024

Could you tell us how we can reproduce it?

I have tested it at
https://huggingface.co/spaces/k2-fsa/text-to-speech

Everything is working fine.

sweetbbak commented on June 19, 2024

It only happens with the Android APK version. To reproduce, install the Android APK that I linked above, set it as the default text to speech engine and then synthesize speech on the Android device. I'll post a video of it.

csukuangfj commented on June 19, 2024

By the way, what is the CPU on your Android phone?

sweetbbak commented on June 19, 2024

A Pixel 8 Pro, with the Tensor G3 processor.

csukuangfj commented on June 19, 2024

Does only this model on your phone have such pauses? Do other models work well on your phone?

sweetbbak commented on June 19, 2024

video of my model:
https://github.com/k2-fsa/sherpa-onnx/assets/75046310/7d35659d-106d-4520-a1ff-c79d058525dd

screen-20240512-180601.2.mp4

sweetbbak commented on June 19, 2024

And here is one showing that it does not happen with Glados and other models:

screen-20240512-181649.mp4

csukuangfj commented on June 19, 2024

The sweetbbak-amy en_GB model is twice as large in file size as the other models.

In other words, this model is so large that it takes a lot of time to synthesize a sentence on your phone.
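
One way to put a number on this is the real-time factor (RTF): the time spent synthesizing divided by the duration of the audio produced. The sketch below is illustrative only; `real_time_factor`, `measure_rtf`, and the `synthesize` callable are hypothetical names, not part of sherpa-onnx. An RTF above 1 means audio is generated more slowly than it plays back, which would be consistent with audible pauses.

```python
import time
from typing import Callable, Sequence, Tuple


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return elapsed_seconds / audio_seconds


def measure_rtf(
    synthesize: Callable[[str], Tuple[Sequence[float], int]], text: str
) -> float:
    """Time one call to a synthesis function and return its RTF.

    `synthesize` is a hypothetical callable (an assumption for this sketch)
    that takes text and returns (samples, sample_rate).
    """
    start = time.perf_counter()
    samples, sample_rate = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples), sample_rate)
```

Running `measure_rtf` with the same sentence on a small and a large model would show whether the larger one falls behind playback on a given device.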

sweetbbak commented on June 19, 2024

I see, thanks for your help. I'm training a new one right now that's going to be the smallest size; hopefully I can use that instead. Maybe I can submit that instead or potentially get some advice on building the APKs? I read the docs not that long ago and I am beyond lost; it's a little over my head.

csukuangfj commented on June 19, 2024

> Maybe I can submit that instead or potentially get some advice on building the APKs?

Both are fine to me.

If you want to build an APK by yourself, you can follow our doc at
https://k2-fsa.github.io/sherpa/onnx/android/build-sherpa-onnx.html

Or you can open-source your onnx model and we can build the APK and make it public.

sweetbbak commented on June 19, 2024

Thank you! I'm going to open-source it for sure, but I will try to build it myself first. If I get stuck, I'll @ you in this thread or the thread you have in piper tts, if that is okay.

sweetbbak commented on June 19, 2024

Actually, I have one question. I'm on the last step of building an APK out of a Piper model, but I'm lost at this step. What type of model is piper-tts supposed to be in this context?

OnlineRecognizer.kt

        14 -> {
            val modelDir = "vits-piper-en_GB-sweetbbak-amy"
            return OnlineModelConfig(
                neMoCtc = OnlineNeMoCtcModelConfig(
                    model = "$modelDir/en_GB-sweetbbak-amy.onnx",
                    // model = "$modelDir/model.onnx",
                ),
                tokens = "$modelDir/tokens.txt",
            )
        }

and the hint in the wiki is:

If you select a different pre-trained model, make sure that you also change the corresponding code listed in the following screen shot:

But I can't find any information on what the Piper models are supposed to be in the pre-trained models list.

csukuangfj commented on June 19, 2024

> What type of model is piper-tts supposed to be in this context?

You have selected the wrong file.

You should use
https://github.com/k2-fsa/sherpa-onnx/blob/master/android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/Tts.kt

Please see
https://github.com/k2-fsa/sherpa-onnx/blob/master/android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/TtsEngine.kt

// Example 2:
// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models
// https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2
// modelDir = "vits-piper-en_US-amy-low"
// modelName = "en_US-amy-low.onnx"
// dataDir = "vits-piper-en_US-amy-low/espeak-ng-data"
// lang = "eng"

The examples listed above should be straightforward to follow.

csukuangfj commented on June 19, 2024

Make sure you are using the latest master branch.

csukuangfj commented on June 19, 2024

By the way, are the models in
https://github.com/sweetbbak/Neural-Amy-TTS/tree/main/models
trained with piper?

If yes, I think they should be usable directly in sherpa-onnx.

sweetbbak commented on June 19, 2024

Yeah, they are all trained with Piper. I just finished one, and I wanted to test them to see what parameters give a good mix of quality and speed on Android. I'm on commit #872 939fdd9.

So, I can build SherpaOnnxTtsEngine, but I can't get the TTS engine to switch to Kaldi on my phone.

I built the arm64 libs and copied them into jniLibs in the SherpaOnnxTtsEngine Android Studio project, inserted my model (which I just named the same as the other model for convenience), downloaded the espeak-ng-data, copied in the tokens.txt file, installed onnxruntime with Python, ran the Python script in the model directory, and added my model to TtsEngine.kt like this:

        modelDir = "vits-piper-en_GB-sweetbbak-amy"
        modelName = "en_GB-sweetbbak-amy.onnx"
        dataDir = "vits-piper-en_GB-sweetbbak-amy/espeak-ng-data"
        lang = "eng"
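
Before building, a quick check that the files named by these settings are actually present can rule out a packaging mistake. The following is a minimal sketch, not part of sherpa-onnx: `missing_tts_assets` is a hypothetical helper, and it assumes that `tokens.txt` sits inside `modelDir` and that all paths are relative to the app's assets directory.

```python
from pathlib import Path
from typing import List


def missing_tts_assets(
    assets_root: str, model_dir: str, model_name: str, data_dir: str
) -> List[str]:
    """Return the expected asset paths (relative to assets_root) that do not exist.

    Assumption for this sketch: the engine needs the .onnx model and
    tokens.txt inside model_dir, plus the espeak-ng-data directory.
    """
    root = Path(assets_root)
    expected = [
        root / model_dir / model_name,    # the Piper .onnx model
        root / model_dir / "tokens.txt",  # symbol-to-ID table
        root / data_dir,                  # espeak-ng-data directory
    ]
    return [p.relative_to(root).as_posix() for p in expected if not p.exists()]
```

An empty list means every expected path exists; anything else names what still has to be copied in.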

sweetbbak commented on June 19, 2024

Never mind, it randomly started working after a few tries. It must be some weird bug on my phone's end. I appreciate the help. Also, one last question: where does tokens.txt come from, and is it generally safe to just re-use it for every English model, or is there some process for converting *.onnx.json into tokens.txt?

csukuangfj commented on June 19, 2024

Please see https://k2-fsa.github.io/sherpa/onnx/tts/piper.html
for how tokens.txt is generated.

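def generate_tokens(config):
    # config is the parsed content of the model's .json file
    id_map = config["phoneme_id_map"]
    with open("tokens.txt", "w", encoding="utf-8") as f:
        # Write one "symbol id" pair per line, using the first ID of each symbol
        for s, i in id_map.items():
            f.write(f"{s} {i[0]}\n")
    print("Generated tokens.txt")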

> is it generally safe to just re-use it for every english model

Please always regenerate it with your .json file.
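
As a rough end-to-end sketch of that regeneration step, the function above can be wrapped so that it reads the model's JSON config from disk first. The name `generate_tokens_from_file` and its parameters are hypothetical, and the only assumption about the file is that it contains a `phoneme_id_map` object mapping each symbol to a list of IDs, as in the snippet above.

```python
import json


def generate_tokens_from_file(config_path: str, output_path: str = "tokens.txt") -> int:
    """Read a Piper model config and write one "symbol id" line per symbol.

    Returns the number of symbols written.
    """
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)

    id_map = config["phoneme_id_map"]
    with open(output_path, "w", encoding="utf-8") as f:
        for symbol, ids in id_map.items():
            # Each symbol maps to a list of IDs; the first one is used.
            f.write(f"{symbol} {ids[0]}\n")
    return len(id_map)


# Example call (the file name is an assumption, following the *.onnx.json pattern):
# generate_tokens_from_file("en_GB-sweetbbak-amy.onnx.json")
```

Because the table is read from each model's own config, two models trained with different symbol sets end up with different tokens.txt files, which is why re-using one across models is risky.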
