
vosk-android-service's Introduction

vosk android service

This is a service module for Android that lets other applications call Vosk to perform speech-to-text.

vosk-android-service's People

Contributors

doomsdayrs, drew-sinha, nshmyrev, punnales

vosk-android-service's Issues

[Branding] New Icon

As stated in #22, I do not think anyone enjoys the current icon.

I made an icon with a white star and the brand color from the website, trying to match the website's favicon.

installation fails on Android 12: INSTALL_PARSE_FAILED_MANIFEST_MALFORMED

$ adb install -r vosk-android-service-0.3.42.apk
Performing Streamed Install
adb: failed to install vosk-android-service-0.3.42.apk: Failure [INSTALL_PARSE_FAILED_MANIFEST_MALFORMED: Failed parse during installPackageLI: /data/app/vmdl910797472.tmp/base.apk (at Binary XML file line #106): org.vosk.service.VoskRecognitionService: Targeting S+ (version 31 and above) requires that an explicit value for android:exported be defined when intent filters are present]
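
The rule behind this failure (a component that declares an intent filter must also set android:exported when targeting API 31 or above) can be checked against the source manifest before building. The following is a minimal Python sketch, not part of the project. It assumes a plain-text AndroidManifest.xml rather than the binary XML packed inside the APK, and the sample manifest, including `.MainActivity`, is hypothetical apart from the service name taken from the error message.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"
EXPORTED = f"{{{ANDROID_NS}}}exported"
NAME = f"{{{ANDROID_NS}}}name"
# Component types that may carry intent filters.
COMPONENT_TAGS = ("activity", "activity-alias", "service", "receiver")


def components_missing_exported(manifest_xml):
    """Return names of components that have an <intent-filter>
    but no explicit android:exported attribute."""
    root = ET.fromstring(manifest_xml)
    app = root.find("application")
    if app is None:
        return []
    missing = []
    for tag in COMPONENT_TAGS:
        for comp in app.findall(tag):
            has_filter = comp.find("intent-filter") is not None
            if has_filter and EXPORTED not in comp.attrib:
                missing.append(comp.get(NAME, "<unnamed>"))
    return missing


# Hypothetical manifest for illustration only.
SAMPLE = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="org.vosk.demo">
  <application>
    <service android:name="org.vosk.service.VoskRecognitionService">
      <intent-filter>
        <action android:name="android.speech.RecognitionService" />
      </intent-filter>
    </service>
    <activity android:name=".MainActivity" android:exported="true">
      <intent-filter>
        <action android:name="android.intent.action.MAIN" />
      </intent-filter>
    </activity>
  </application>
</manifest>"""

print(components_missing_exported(SAMPLE))
# prints ['org.vosk.service.VoskRecognitionService']
```

Any component the sketch reports would need an explicit android:exported value in the manifest. Whether that value should be true or false depends on whether other apps are meant to bind to the component.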

v-a-s does not activate (with ASK) until app started

Behavior to Reproduce:

  1. Restart/turn on the device and go to the home screen.
  2. Open an app that uses the keyboard app (AnySoftKeyboard).
  3. Tap the microphone button to attempt voice recognition.

Expected Behavior:
Voice recognition starts.

Actual Behavior:
ASK's download activity notifies the user that the voice recognition app is not installed and prompts them to download it from the Play Store.

**Of note, if the vosk app is started before attempting voice recognition through the keyboard app, the expected behavior occurs. The expected behavior also occurs if the vosk app is force-stopped through system utilities (vosk app verified as stopped in the background through developer settings).

Posting as a v-a-s issue because the behavior above points to an app/service state problem that more likely originates in v-a-s than in ASK.

*configuration
Source: [5e02806], current as of initial issue post
Build Configuration: Gradle Toolkit command-line, debug w/universal apk (compilesdk 33)
Gradle toolkit version: 7.6 (defaults despite the build kts depending on 7.2.2; no api level spec'd); builds against OpenJDK-14
Device: Pixel 3
Device OS: Android 12
Additional Device Apps: AnySoftKeyboard (v1.11.7137/F-droid; UTD)

Better model list

a) Not clear which models are downloaded
b) No way to remove the model data. 
c) No display of the model language while our json has that information
d) No display of the model size
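
The four points above could be covered by joining the model catalog with the contents of the local model directory. The following Python sketch only illustrates that idea and is not the app's implementation. The field names (`name`, `lang_text`, `size_text`), the example catalog entries, and the one-directory-per-model layout are all assumptions.

```python
import json
import shutil
import tempfile
from pathlib import Path


def describe_models(catalog, models_dir):
    """Build one display row per catalog entry: name, language,
    size, and whether a directory for it exists locally."""
    rows = []
    for entry in catalog:
        path = Path(models_dir) / entry["name"]
        rows.append({
            "name": entry["name"],
            "language": entry.get("lang_text", entry.get("lang", "unknown")),
            "size": entry.get("size_text", "unknown"),
            "downloaded": path.is_dir(),
        })
    return rows


def remove_model(name, models_dir):
    """Delete a downloaded model; return True if something was removed."""
    path = Path(models_dir) / name
    if path.is_dir():
        shutil.rmtree(path)
        return True
    return False


# Hypothetical catalog entries; real field names may differ.
CATALOG = json.loads("""[
  {"name": "example-model-de", "lang": "de", "lang_text": "German", "size_text": "45M"},
  {"name": "example-model-en", "lang": "en", "lang_text": "English", "size_text": "40M"}
]""")

with tempfile.TemporaryDirectory() as models_dir:
    # Pretend the German model has already been downloaded.
    (Path(models_dir) / "example-model-de").mkdir()
    for row in describe_models(CATALOG, models_dir):
        mark = "downloaded" if row["downloaded"] else "not downloaded"
        print(f'{row["name"]}  {row["language"]}  {row["size"]}  [{mark}]')
    remove_model("example-model-de", models_dir)
```

In an Android UI the same rows would feed a list adapter, with the remove action attached to each downloaded entry.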

Swiftkey init slow

Recognition somewhat works in SwiftKey, but it initializes slowly and doesn't insert the recognized text into the field, which is not a great user experience.

[FR] in-app test [button]

It seems that recognition doesn't work reliably with every keyboard.
A test [button] that does exactly what a mic button on a keyboard does would make testing easier and more straightforward.

Incompatibility with various Android keyboards; wrap vosk-android in IME service (especially for standalone use/accessibility)

As of 5e02806*, the vosk service is fully functional/compatible with AnySoftKeyboard, but incompatible with OpenBoard and FlorisBoard**. Both of the latter use the InputMethodManager framework rather than interacting with the speech recognition service, and do not identify the vosk service as an input method.

Given that there is no open-source stt alternative to Google, etc. at time of posting (3/2023), relying on the SpeechRecognitionService is appropriate***. The vosk service works totally fine as a (de-facto) plugin for AnySoft. However, switching to/stuffing an IME service on top of vosk-android as a standalone service would be nice for accessibility (i.e. for those with disabilities due to which it would make sense to use voice as the IME). This isn't unreasonable given that Google already does this with speech services.

Without significant experience with IMM/IMEs, I think that this should be pretty straightforward: add an intermediate level activity on top of the vosk-recognition-service that can be forked off in the manifest as its own service. Then, the given keyboard can decide which service to latch onto for STT.

*additional configuration:
Build Configuration: Gradle Toolkit command-line, debug w/universal apk (compilesdk 33)
Gradle toolkit version: 7.6 (defaults despite the build kts depending on 7.2.2; no api level spec'd); builds against OpenJDK-14
Device: Pixel 3
Device OS: Android 12
Additional Device Apps: AnySoftKeyboard (v1.11.7137/F-droid; UTD)

**nothing special per se about these two keyboards. I chose them as the major open alternatives I've seen on reddit and f-droid. Of note, I haven't tested konele, but would be surprised if it wasn't compatible given @ccoreilly's efforts with localstt.

***I am hesitant to say that IMM/IME is better than the direct speech recognition service, or to comment on any IME designer's preference for either. Per the above, I don't think the two are incompatible per se; they can be construed as having separate use cases. Any thoughts would be appreciated. Tagging some people who may have some useful input: @patrickgold, @dslul, @ewheelerinc, @ildar, @Kaljurand, @Felicis. CC: @Stypox

Edit: misspelled kaljurand, added stypox.

Empty screen on start

Hi, I would like to say thank you for your hard work. Unfortunately, the apk you have provided doesn't work as it should. When I start the app, nothing happens. It doesn't ask if I want to download new models or anything. It just sits on the main screen. Tried on Oneplus 7 Pro A10, Huawei P10 A9, Xiaomi Note 10 A11. When I choose it as the voice typing service in Android settings, it pops up but doesn't seem to work; nothing happens.
Do you mind checking what could be wrong? Thanks a lot in advance.

Does v-a-s need to handle the assist intent?

@nshmyrev, v-a-s handling the assist intent leads to it being prompted as a possible assistant (which it doesn't do successfully on Android). Does including an ability to handle this intent satisfy a mission-critical feature on the Wear/CM, or can this be removed?

(This led to some confusion in #33, and is a consistent failure case for me: if I accidentally set it as the assistant, I have to uninstall and re-install the app on my device.)

"Error Loading Recognizer" / Unhandled Page Fault

I installed the latest released apk, 0.3.42, set it as my default voice assistant, opened Fennec (F-Droid Firefox build), tapped the icon by the URL bar, and I get a toast saying "Error Loading Recognizer". Firing up adb I see this:

$ adb -s "wa6d93236" logcat | grep vosk
03-25 12:58:31.719   763   779 I ActivityTaskManager: START u0 {act=android.speech.action.RECOGNIZE_SPEECH cmp=org.vosk.demo/org.vosk.service.ui.SpeechRecognizerActivity (has extras)} from uid 10250
03-25 12:58:27.728     0     0 D org.vosk.demo: unhandled page fault (11) at 0x00000000, code 0x005
03-25 12:58:27.728     0     0 W Pid     : 30776, comm:        org.vosk.demo

SM-G900V running Android 11 (Lineage)

A couple of other questions:

(1) The AOSP keyboard has a microphone button built in, but it doesn't seem to bring this up, no error, no message in ADB. Should that work? Or I guess rather, does anyone have any theories of why that isn't working, but Fennec's assistant-mic button does?
(2) Does this have a model packed in with it? I'm assuming so since there are no directions to download one.

[Meta] Matrix chat room

A Matrix chat room would be nice. Matrix is an open source platform for communication and is already used by organizations such as Fedora, GNOME, KDE, AsteroidOS, etc.

https://matrix.org/

[Feature Request] README

Please provide a README explaining the purpose of this service, how it works and how to test it.

Speech recognizer won't start

Hi,

When started, the SpeechRecognizerActivity gets stuck immediately.
Logcat says "SpeechRecognizer ..... org.vosk.service ..... E .. bind to recognition service failed" when calling speechRecognizer.startListening(speechRecognizerIntent).
This error happens with both AnySoftKeyboard and SwiftKey (I didn't try other keyboards).

Note: I have the same error when using LocalSTT

Android 9 (api 28)
vosk-android-service 0.3.42

Migration to Kotlin

Android is using Kotlin more and more, so it would be wise to start coding in Kotlin too.

I suggest starting by converting the Gradle scripts to Kotlin scripts; the rest can follow.

No UI / way to set a language?

I just installed the APK, and when opening it, there is no UI and there is generally no way to set a language or anything.

I even downloaded the German language model manually. But where do I put it?

(My ultimate goal is to create my own personalized voice assistant, specifically just by using an offline TTS engine, a shell script (in Termux) and an adapter to search engines (which adds a bit of semantic information). But having offline voice recognition on Android in general is of course the best way to start.)

Better intent UI

The speech UI has an input field, which leads to recursion. It would be better to display a microphone button instead (Google style).
