Comments (9)

xenotropic commented on September 27, 2024

I think you need to download model first by starting the app and selecting the language

Okay, so I needed that fact; I thought it was just a freestanding service. When I tried to open it, though, I seemed to hit #11, which is presumably related to the adb log messages above. That is with the release in this repo, 0.3.42.

I then tried building from the current state of the repo (same blank-screen behavior, did not check adb log); finally I tried @sogaiu's builds mentioned in #11, and those did work: I get a list of models to download. I shall carry on using that build, but FWIW.

from vosk-android-service.

nshmyrev commented on September 27, 2024

The AOSP keyboard has a microphone button built in, but it doesn't seem to bring this up, no error, no message in ADB. Should that work? Or I guess rather, does anyone have any theories of why that isn't working, but Fennec's assistant-mic button does?

Not sure about the AOSP keyboard; one can look in the sources. We have recent fixes after which it was reported to work with AnyKeyboard:

#32

(2) Does this have a model packed in with it? I'm assuming so since there are no directions to download one.

I think you need to download model first by starting the app and selecting the language


drew-sinha commented on September 27, 2024

The AOSP keyboard has a microphone button built in, but it doesn't seem to bring this up, no error, no message in ADB. Should that work? Or I guess rather, does anyone have any theories of why that isn't working, but Fennec's assistant-mic button does?

Yeah... see #32. I messed around with AOSP + vosk-a-serv. My working theory is that it's due to AOSP looking for an IME service, not a speech recog service that handles the recognition Intent (case in point, if one opens up the openboard app and then goes to the preferences, voice typing is greyed out/displays nothing that handles voice input).

For this #11 issue, weird. If you don't set it as your voice assistant, does it work ok? I hadn't actually been using it this way, but when tinkering set it to voice assistant and then it doesn't work quite right (but still wasn't getting the blank screen). I would logcat with the new build from the current state of the repo, just to make sure whatever is breaking isn't something new.


xenotropic commented on September 27, 2024

Current page fault.
Logcat from my own build of the current state of the repo looks the same, unhandled page fault on app open; blank screen.

03-27 08:51:21.022   763  2403 I ActivityTaskManager: START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x10200000 cmp=org.vosk.service/.ui.selector.ModelListActivity bnds=[36,90][288,385]} from uid 10147
03-27 08:51:16.683     0     0 D rg.vosk.service: unhandled page fault (11) at 0x00000000, code 0x005
03-27 08:51:16.683     0     0 W Pid     : 14727, comm:      rg.vosk.service
03-27 08:51:21.242   763   790 I ActivityTaskManager: Displayed org.vosk.service/.ui.selector.ModelListActivity: +180ms
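
For anyone comparing builds, a small filter can pull just the fault lines out of a long logcat dump. This is a minimal sketch, assuming the kernel lines keep the format shown above; `find_page_faults` and its regex are illustrative helpers, not part of vosk-android-service. A fault address of 0x00000000 often points at a null-pointer access, but the log alone does not confirm that.

```python
import re

# Matches kernel lines shaped like the ones in the log above, e.g.
#   "... D rg.vosk.service: unhandled page fault (11) at 0x00000000, code 0x005"
# The exact format is an assumption based on this thread's logs.
FAULT_RE = re.compile(
    r"^(?P<ts>\d\d-\d\d \d\d:\d\d:\d\d\.\d+)\s+.*?"
    r"(?P<comm>\S+): unhandled page fault \((?P<sig>\d+)\) "
    r"at (?P<addr>0x[0-9a-fA-F]+), code (?P<code>0x[0-9a-fA-F]+)"
)


def find_page_faults(log_text):
    """Return one dict per 'unhandled page fault' line in a logcat dump."""
    faults = []
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            faults.append(match.groupdict())
    return faults


sample = """\
03-27 08:51:16.683     0     0 D rg.vosk.service: unhandled page fault (11) at 0x00000000, code 0x005
03-27 08:51:16.683     0     0 W Pid     : 14727, comm:      rg.vosk.service
03-27 08:51:21.242   763   790 I ActivityTaskManager: Displayed org.vosk.service/.ui.selector.ModelListActivity: +180ms
"""

for fault in find_page_faults(sample):
    print(fault["ts"], fault["comm"], fault["addr"], fault["code"])
```

Running the same filter over the logcat of each build (0.3.42, the sogaiu build, a local gradle build) would make it easier to see which ones fault on app open.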

This is an older device (Galaxy S5) so I'm using the armeabi-v7a in case that makes a difference. So the sogaiu build is the only one that does not pagefault for me on app open, and the only one that gives a list of models to download (as among 0.3.42, sogaiu build, my gradle build of current state of repo). Seems like a bug existed, was fixed, and then was reintroduced? Hard to say for sure but superficially that's what it looks like.

Sogaiu build pagefault / Fennec. From within Fennec (Firefox F-Droid), calling Vosk as an assistant, even with the sogaiu build with EN-US model downloaded, I get no "Error Loading Recognizer" but do see a pagefault in logcat. Behavior is that the assistant pop-up window shows up with a mic icon with "VOSK" text over it and "try saying something" under it. The mic icon stays black; no speech is recognized; it stays that way until I hit "back". Logcat:

03-27 09:03:21.430 17967 17967 I ResolverListAdapter: Add DisplayResolveInfo component: ComponentInfo{org.vosk.service/org.vosk.service.ui.SpeechRecognizerActivity}, intent component: ComponentInfo{org.vosk.service/org.vosk.service.ui.SpeechRecognizerActivity}
03-27 09:03:23.582   763  1778 I ActivityTaskManager: START u0 {act=android.speech.action.RECOGNIZE_SPEECH flg=0x3000000 cmp=org.vosk.service/.ui.SpeechRecognizerActivity (has extras)} from uid 10250
03-27 09:03:19.295     0     0 D rg.vosk.service: unhandled page fault (11) at 0x00000000, code 0x005
03-27 09:03:19.296     0     0 W Pid     : 16584, comm:      rg.vosk.service
03-27 09:03:24.013  7059  7059 V GrantPermissionsActivity: Permission grant result requestId=-1354698035177607463 callingUid=10268 callingPackage=org.vosk.service permission=android.permission.RECORD_AUDIO isImplicit=false result=5
03-27 09:03:24.014  7059  7059 V GrantPermissionsActivity: Permission grant result requestId=-1354698035177607463 callingUid=10268 callingPackage=org.vosk.service permission=android.permission.WRITE_EXTERNAL_STORAGE isImplicit=false result=1
03-27 09:03:24.014  7059  7059 V GrantPermissionsActivity: Permission grant result requestId=-1354698035177607463 callingUid=10268 callingPackage=org.vosk.service permission=android.permission.READ_EXTERNAL_STORAGE isImplicit=true result=1

I'm not totally sure what the permissions messages are there, but it does ask for mic permissions and I grant them. In app info, mic is then shown as "always allowed" and no other permissions were requested (i.e., none shown as "denied").

What does work is to use the sogaiu build as a service for Kõnele, either using Kõnele's digital-assistant mode or its speak & swipe keyboard. It's slow -- it takes two seconds to load the model -- the accuracy is quite low (compared to the large vosk model using nerd-dictation that I'm used to on my laptop; this is a small model on a slower device), but it works. Logcat for that, while I'm here (not sure what these minor/major faults are?)

03-27 09:10:55.213 16584 16584 D VoskRecognitionService: /data/user/0/org.vosk.service/files/models/vosk-model-small-en-us-0.15/vosk-model-small-en-us-0.15
03-27 09:10:55.579   763 23922 E ActivityManager:   2.7% 16584/org.vosk.service: 0.4% user + 2.2% kernel / faults: 3331 minor 156 major
03-27 09:10:59.695 16584 22087 I VoskAPI : ReadDataFiles():model.cc:248) Loading i-vector extractor from /data/user/0/org.vosk.service/files/models/vosk-model-small-en-us-0.15/vosk-model-small-en-us-0.15/ivector/final.ie
03-27 09:11:01.692 16584 22087 I VoskAPI : ReadDataFiles():model.cc:281) Loading HCL and G from /data/user/0/org.vosk.service/files/models/vosk-model-small-en-us-0.15/vosk-model-small-en-us-0.15/graph/HCLr.fst /data/user/0/org.vosk.service/files/models/vosk-model-small-en-us-0.15/vosk-model-small-en-us-0.15/graph/Gr.fst
03-27 09:11:10.860 16584 22087 I VoskAPI : ReadDataFiles():model.cc:302) Loading winfo /data/user/0/org.vosk.service/files/models/vosk-model-small-en-us-0.15/vosk-model-small-en-us-0.15/graph/phones/word_boundary.int
03-27 09:11:30.592   763 24909 I ActivityManager:   vis    BFGS  607774: org.vosk.service (pid 16584) service
03-27 09:11:30.592   763 24909 I ActivityManager:                        org.vosk.service/.VoskRecognitionService<=Proc{20249:ee.ioc.phon.android.speak/u0a266}
03-27 09:11:30.592   763 24909 I ActivityManager:                        ee.ioc.phon.android.speak/.service.WebSocketRecognitionService<=Proc{16584:org.vosk.service/u0a268}
03-27 09:11:46.427   763 25083 E ActivityManager:   0% 16584/org.vosk.service: 0% user + 0% kernel / faults: 692 minor 15 major
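
To see where the load time goes, the gaps between the timestamped lines above can be computed directly. A minimal sketch, assuming the standard `MM-DD HH:MM:SS.mmm` logcat prefix; the messages in `lines` are shortened by hand, and the year is an arbitrary placeholder because logcat does not print one.

```python
from datetime import datetime

# logcat omits the year, so an arbitrary placeholder year is prepended (assumption).
PLACEHOLDER_YEAR = "2000-"


def parse_ts(line):
    """Parse the leading 'MM-DD HH:MM:SS.mmm' timestamp of a logcat line."""
    return datetime.strptime(PLACEHOLDER_YEAR + line[:18], "%Y-%m-%d %H:%M:%S.%f")


def stage_durations(lines):
    """Seconds elapsed between each consecutive pair of log lines."""
    stamps = [parse_ts(line) for line in lines]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


# Timestamps copied from the log above; message text shortened for readability.
lines = [
    "03-27 09:10:55.213 16584 16584 D VoskRecognitionService: model path logged",
    "03-27 09:10:59.695 16584 22087 I VoskAPI : Loading i-vector extractor",
    "03-27 09:11:01.692 16584 22087 I VoskAPI : Loading HCL and G",
    "03-27 09:11:10.860 16584 22087 I VoskAPI : Loading winfo",
]

durations = stage_durations(lines)
print(durations, sum(durations))
```

On these four lines the gaps come to roughly 4.5 s, 2.0 s and 9.2 s, about 15.6 s from the first line to the last. The timestamps only show when each message was logged, not what the process was doing in between.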


drew-sinha commented on September 27, 2024

I'm worried this is a bug that hasn't been addressed. It seems like @sogaiu's build isn't completely immune to the aforementioned issues. And, since @sogaiu's build, there have been few substantive changes to master (just issues with how the locale is detected, and the gradle version was up'd to 7.6).

Do you have a larger logcat for these latter use cases (I'm worried I'm missing something else in the background... which I unfortunately do a lot)?

Good catch with the v7a apk. Does this persist with the universal (what I've been using)?

After re-reading the 2nd error case, @xenotropic: when you open Fennec, it sounds like you're invoking it as the system assistant. Can you provide more information on that? Are you still invoking through a keyboard (AOSP or otherwise)? Or do you have the service installed as the only method of input?

Finally, if you don't use as a system assistant and try anysoftkeyboard, is this improved? (I have only been using anysoft. From your "success" in the 3rd case with k6nele, I'm wondering if going through keyboards fixes the larger issue.)

Edits.... clarified my stupidity.
(Also, sorry for the 17 questions about "oh does this other stupid thing fix your error." Since starting to futz with apps, I've been highly surprised at how many failure points these things can have.)


xenotropic commented on September 27, 2024

Okay, so using universal from building latest does give a dialog and allows for a download, and does not page fault. Good suggestion. But it still doesn't work (except via Konele).

Anysoftkeyboard seems to have changed at some point to only support Google voice-to-text. The mic icon gives a popup saying "Voice is not installed" with an "install" button that takes me to the Play store for "com.google.android.voicesearch". I don't see any settings in it to choose a voice to text service. See also AnySoftKeyboard/AnySoftKeyboard#3230

Yes, the Fennec case is invoking as system assistant. The browser has a "mic" button by the URL bar that seemingly invokes the default digital assistant for STT purposes. It works with Google voice typing, works with Konele (using vosk-android-service as the service); it does not work with vosk invoked directly. Vosk-android-service shows up as a possibility as a handler for it though: I get a "complete action using" dialog with Vosk as an option, it brings up a UI element (see "Behavior is that the assistant pop-up..." in my last comment), but then I still get the "Error Loading Recognizer".

I also tried SwiftKey (Microsoft's keyboard), which has similar behavior to Fennec: it brings up a black mic with "Vosk" / "Try Saying Something" and a toast "Error Loading Recognizer", and no text recognition.

The full noisy log, invoking from SwiftKey, is here. The main thing that jumps out at me is

03-27 13:48:59.771 28130 28130 E RecognitionService: call for recognition service without RECORD_AUDIO permissions

It did ask for mic permissions when I first launched it, though. Going and looking in Vosk's app permissions it has "Microphone" as "Allow all the time" and "No permissions denied".


drew-sinha commented on September 27, 2024

See also AnySoftKeyboard/AnySoftKeyboard#3230

Oooh interesting. Haven't dug far into the source code for k6nele, but it looks like it should implement the IME service framework based on the manifest. Confusing enough that it also does the speech recognition service (see #32 as well), but something is afoot.

Anysoftkeyboard seems to have changed at some point to only support Google voice-to-text. The mic icon gives a popup saying "Voice is not installed" with an "install" button that takes me to the Play store for "com.google.android.voicesearch". I don't see any settings in it to choose a voice to text service.

.... This will be the most asinine suggestion. If you don't make vosk-android a device assistant and open up the app before you start using the keyboard in e.g. texting, does it work?

(Nasty surprise on my end: it's only working if I activate the app itself prior to using any other app with it. I.e., otherwise it won't identify v-a-s and will display the install/Play download prompt.

Must have always needed to open it up first for whatever debugging I was doing, and then just never turned my phone off after.)

Yes, the Fennec case is invoking as system assistant. The browser has a "mic" button by the URL bar that seemingly invokes the default digital assistant for STT purposes. It works with Google voice typing, works with Konele (using vosk-android-service as the service); it does not work with vosk invoked directly. Vosk-android-service shows up as a possibility as a handler for it though: I get a "complete action using" dialog with Vosk as an option, it brings up a UI element (see "Behavior is that the assistant pop-up..." in my last comment), but then I still get the "Error Loading Recognizer".

Interesting too. This is starting to make sense a little bit. TBF, from the fenix source, looks like it is doing everything through the intent framework (which should be compatible with v-a-s). Consequently though, that should mean that you shouldn't have to use v-a-s as a device assistant.

... I'm wondering if this is somewhat derivative of my comment above too (ie. is v-a-s not getting identified by default for some reason, and then the natural work flow - either naive or fenix-driven - is to make it the system manager which doesn't work.)

For the record, on my side, making it the device assistant means v-a-s still works/is callable. But keeps the mic on. Another issue I have to post.

The full noisy log, invoking from SwiftKey, is here.
It did ask for mic permissions when I first launched it, though. Going and looking in Vosk's app permissions it has "Microphone" as "Allow all the time" and "No permissions denied".

Thanks for the log. The other thing that's now sticking out is the first line of these three:

03-27 13:48:59.530 24222 24222 V GrantPermissionsActivity: Permission grant result requestId=107861796779005561 callingUid=10271 callingPackage=org.vosk.service permission=android.permission.RECORD_AUDIO isImplicit=false result=5
03-27 13:48:59.531 24222 24222 V GrantPermissionsActivity: Permission grant result requestId=107861796779005561 callingUid=10271 callingPackage=org.vosk.service permission=android.permission.WRITE_EXTERNAL_STORAGE isImplicit=false result=1
03-27 13:48:59.531 24222 24222 V GrantPermissionsActivity: Permission grant result requestId=107861796779005561 callingUid=10271 callingPackage=org.vosk.service permission=android.permission.READ_EXTERNAL_STORAGE isImplicit=true result=1

Will try to look into what the different result for RECORD_AUDIO means in the next couple of days.
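
For spotting this kind of mismatch in a longer log, here is a minimal sketch that collects the numeric result per permission from the `GrantPermissionsActivity` lines and flags any that differ from the most common value. `grant_results` and `outliers` are hypothetical helper names, and the sketch makes no claim about what each result code means.

```python
import re

# Field layout is taken from the three log lines quoted above (an assumption
# about the general format, not a documented one).
GRANT_RE = re.compile(
    r"GrantPermissionsActivity: Permission grant result .*?"
    r"callingPackage=(?P<package>\S+) "
    r"permission=(?P<permission>\S+) "
    r"isImplicit=(?P<implicit>true|false) "
    r"result=(?P<result>-?\d+)"
)


def grant_results(log_text):
    """Map each permission name to the numeric result logged for it."""
    results = {}
    for match in GRANT_RE.finditer(log_text):
        results[match["permission"]] = int(match["result"])
    return results


def outliers(results):
    """Return the permissions whose result differs from the most common one."""
    values = list(results.values())
    if not values:
        return {}
    common = max(set(values), key=values.count)
    return {perm: res for perm, res in results.items() if res != common}


log = """\
03-27 13:48:59.530 24222 24222 V GrantPermissionsActivity: Permission grant result requestId=107861796779005561 callingUid=10271 callingPackage=org.vosk.service permission=android.permission.RECORD_AUDIO isImplicit=false result=5
03-27 13:48:59.531 24222 24222 V GrantPermissionsActivity: Permission grant result requestId=107861796779005561 callingUid=10271 callingPackage=org.vosk.service permission=android.permission.WRITE_EXTERNAL_STORAGE isImplicit=false result=1
03-27 13:48:59.531 24222 24222 V GrantPermissionsActivity: Permission grant result requestId=107861796779005561 callingUid=10271 callingPackage=org.vosk.service permission=android.permission.READ_EXTERNAL_STORAGE isImplicit=true result=1
"""

print(outliers(grant_results(log)))
```

On the three lines above this flags only `android.permission.RECORD_AUDIO`, with result 5 against 1 for the two storage permissions.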

Edited: clarity


drew-sinha commented on September 27, 2024

Update @xenotropic: still working on your particular bug (sorry, my day work has kept me quite busy the last couple of weeks). I have opened a couple of issues on here that should hopefully make the UI/user workflow a little easier and help avoid some of the pitfalls you and I discussed.

Of note regarding the issue that you brought up with AnySoft somehow only being compatible with Google STT... , I figured out that AnySoftKeyboard (as of AnySoftKeyboard/AnySoftKeyboard@498dab3) merely prioritizes Google STT when searching for the right voice recognition handler. But it's hardcoded in a way that is a little hard to find. I ... lol... put in a PR at ASK to just switch the priorities to make Google STT a fallback, and vosk easier to use. See AnySoftKeyboard/AnySoftKeyboard#3690
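
To illustrate what switching the priorities changes, here is a toy sketch of priority-ordered handler selection. It is not AnySoftKeyboard's actual code; `pick_handler` is a hypothetical helper, and only the two package names come from this thread.

```python
GOOGLE = "com.google.android.voicesearch"
VOSK = "org.vosk.service"


def pick_handler(installed, priority):
    """Return the first package in `priority` that is installed, else None."""
    for package in priority:
        if package in installed:
            return package
    return None


installed = {GOOGLE, VOSK}

# Google first: Vosk is only reached when Google STT is absent.
print(pick_handler(installed, [GOOGLE, VOSK]))

# Priorities switched: Google STT becomes the fallback.
print(pick_handler(installed, [VOSK, GOOGLE]))
```

With both packages installed, the first call returns the Google package and the second returns the Vosk one; with only one installed, both orders return it.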

Edit: I accidentally obliterated the old PR at ASK; updated the PR link above (sorry, @nshmyrev's comment below now points to a non-starter).


nshmyrev commented on September 27, 2024

Update @xenotropic: still working on your particular bug (sorry, my day work has kept me quite busy the last couple of weeks). I have opened a couple of issues on here that should hopefully make the UI/user workflow a little easier and help avoid some of the pitfalls you and I discussed.

Of note regarding the issue that you brought up with AnySoft somehow only being compatible with Google STT... , I figured out that AnySoftKeyboard (as of AnySoftKeyboard/AnySoftKeyboard@498dab3) merely prioritizes Google STT when searching for the right voice recognition handler. But it's hardcoded in a way that is a little hard to find. I ... lol... put in a PR at ASK to just switch the priorities to make Google STT a fallback, and vosk easier to use. See AnySoftKeyboard/AnySoftKeyboard#3689

Hey, this is nice. Thanks for figuring this out!

