dwks / silvius

Kaldi-based speech recognition system + grammar

Home Page: http://voxhub.io/silvius

License: BSD 2-Clause "Simplified" License

Python 96.79% Shell 2.69% Batchfile 0.52%

silvius's Issues

How to scratch more than 9 characters?

I can say scratch <any digit> to repeatedly hit backspace (with the exception of number 2, see #20). However, I can't say scratch three zero or scratch number three number zero to repeat 30 times.
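One way this could work is to fold consecutive digit words into a single count before building the key command. The sketch below is only an illustration: `DIGITS`, `repeat_count`, and `scratch_command` are hypothetical names, not part of the Silvius grammar.

```python
# Hypothetical helpers, not part of Silvius: combine spoken digits into a repeat count.
DIGITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
}


def repeat_count(words):
    """Fold digit words into one integer, e.g. ['three', 'zero'] -> 30."""
    if not words:
        raise ValueError("expected at least one digit word")
    count = 0
    for word in words:
        if word not in DIGITS:
            raise ValueError("not a digit word: %r" % word)
        count = count * 10 + DIGITS[word]
    return count


def scratch_command(words):
    """Build an xdotool argument list that presses BackSpace `count` times."""
    return ["xdotool", "key"] + ["BackSpace"] * repeat_count(words)
```

With this, `scratch_command(["three", "zero"])` would yield a command containing thirty `BackSpace` keys.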

Mozilla open sourced its voice recognition model and data set.

Mozilla released a speech recognition model and voice data set that could be integrated. I'm curious to hear what people think.

https://blog.mozilla.org/blog/2017/11/29/announcing-the-initial-release-of-mozillas-open-source-speech-recognition-model-and-voice-dataset/

Adding wake/sleep word

I think the sleep word can simply be "sleep". Maybe the wake word can be "Hey wiretap"
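A wake/sleep word could be handled by a small gate in front of the parser that drops everything while asleep. This is a minimal sketch under that assumption; the `WakeSleepGate` class is hypothetical and the default words are simply the ones suggested above.

```python
class WakeSleepGate:
    """Hypothetical filter: passes utterances through only while awake."""

    def __init__(self, wake=("hey", "wiretap"), sleep=("sleep",)):
        self.wake = tuple(wake)
        self.sleep = tuple(sleep)
        self.awake = True

    def filter(self, words):
        """Return the words to parse, or an empty list if they are swallowed."""
        words = tuple(words)
        if self.awake:
            if words == self.sleep:
                self.awake = False  # go to sleep, do not parse the sleep word
                return []
            return list(words)
        if words == self.wake:
            self.awake = True  # wake up, do not parse the wake word
        return []
```

Matching the whole utterance, rather than a word anywhere in it, keeps "sleep" usable inside longer dictated text.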

Integrate with X11

As per #16 (comment) I've started a fork of screenkey (at https://github.com/crypdick/silvius-screenkey), and want to have it display the buffer as you speak.

Keywords forbidden inside "sentence" and "phrase" commands

to: Saying phrase or sentence followed by any digit was added with #12, with the exception of two (which is usually recognized as "to").

Also forbidden are number, late, rate, left, right, etc.

Expected functionality: phrase and sentence mode should input the raw words without parsing, and if you want to insert a symbol you wait for the phrase to be parsed and insert it outside of phrase mode.
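The expected behaviour could be sketched as a tokenizer that stops recognizing keywords once it has seen phrase or sentence. The token names below mirror the `ANY`/`END` names in the Silvius logs, but the function and the `KEYWORDS` set are assumptions for illustration, not the project's scanner.

```python
# Hypothetical keyword set; the real grammar has many more entries.
KEYWORDS = {"phrase", "sentence", "number", "late", "rate",
            "left", "right", "one", "two", "to"}


def tokenize(words):
    """Return (kind, value) tokens; words after 'phrase'/'sentence' stay raw."""
    tokens = []
    raw_mode = False
    for word in words:
        if raw_mode:
            tokens.append(("ANY", word))  # no keyword lookup in raw mode
        elif word in ("phrase", "sentence"):
            tokens.append((word, word))
            raw_mode = True
        elif word in KEYWORDS:
            tokens.append((word, word))
        else:
            tokens.append(("ANY", word))
    tokens.append(("END", None))
    return tokens
```

Under this scheme "phrase turn left one" produces only `ANY` tokens after the leading `phrase`, so digits and direction words no longer trigger an unexpected-token error.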

Implement shift and super as modifiers

I will submit a PR for this soon as discussed

How to make macros?

How do you insert a chain of characters? I want to make a few macros. For example, I'd like go word left to input Ctrl+left, select word left to input Ctrl+Shift+left, and delete word left to input Ctrl+Shift+left delete.
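One possible shape for such macros is a lookup table from spoken phrases to xdotool key sequences. The sketch below assumes the client emits keys via `xdotool key`, as the logs elsewhere in these issues show; the `MACROS` table and `macro_command` function are hypothetical.

```python
# Hypothetical macro table: spoken phrase -> sequence of xdotool key specs.
MACROS = {
    ("go", "word", "left"): ["ctrl+Left"],
    ("select", "word", "left"): ["ctrl+shift+Left"],
    ("delete", "word", "left"): ["ctrl+shift+Left", "Delete"],
}


def macro_command(words):
    """Return an xdotool argument list for a known macro, else None."""
    keys = MACROS.get(tuple(words))
    if keys is None:
        return None
    return ["xdotool", "key"] + keys
```

The resulting list can be passed to `subprocess.call`; keeping the table separate from the grammar would make it easy to add macros without touching the parser.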

adding grammar - new characters/commands

Hello,

First of all, a wonderful project and presentation (found it on YouTube). I have trouble writing due to a neck injury, and I find this project very helpful.

Can you please post an example how I can extend the grammar?

* e.g. I can see single/double quotes are not working, can you help me fix that?
* I noticed one cannot include numbers ("number three" is not working; it would be nice).
* Is it possible to train the system to my voice? It has trouble differentiating between right and rate in my case; as a non-native English speaker I find it hard to pronounce them so that they are recognized. Perhaps if you show how I can change the word for left bracket it would help :)
* There is no curly braces command/word.
* The letter x cannot be written, as xray is not recognised; "x. ray" is often written instead (as seen when I run the script). Edit: I just found that the word for x is expert :)

Please add a bitcoin address to the page so I can tip you :)

Option to ignore invalid tokens

The recognition server has a bias to hear the words "the" and "and" at the beginning of an audio snippet, which spoils lots of commands. It would be nice to ignore all invalid tokens, or to manually create a rule that pops "the" and "and" from the buffer if they are at index 0.
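The second suggestion, popping a spurious leading word, might look like the sketch below. The function name and the `LEADING_FILLERS` set are assumptions; nothing here is taken from the Silvius source.

```python
# Hypothetical pre-parse step: drop filler words the recognizer tends to
# hallucinate at the start of an utterance.
LEADING_FILLERS = {"the", "and"}


def strip_leading_fillers(words):
    """Remove 'the'/'and' from the front of the buffer only."""
    words = list(words)
    while words and words[0] in LEADING_FILLERS:
        words.pop(0)
    return words
```

Only the front of the buffer is touched, so "the" and "and" remain available later in dictated phrases.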

Cross platform client

Linux is great, but Silvius needs a cross-platform client, preferably one that is architecture agnostic. We need to make it as easy as possible to experiment, and not just for programmers. Easy experimentation leads to adoption, then innovation. Therefore we might start evaluating libraries that work cross-platform, most likely but not necessarily in Python.

I would say we need to discuss how the project is currently structured, and from there our approach to planning and implementation.

Available Hardware for testing purposes.
Microsoft 10 64-bit (Linux Ubuntu system installed), Linux (any flavor) amd64, oDroid c2 (64bit ARMv8), Raspberry Pi 3 (64bit ARMv7), Pine A64 (ARMv8), oDroid xu4 (ARMv7),
emulated Mac OS (VirtualBox)

#8 could be merged

Chaining modifiers and using modifier followed by digit

I use the i3 window manager, and I use the Super key to navigate windows. I added support for the super modifier (here: https://github.com/crypdick/silvius/blob/master/grammar/parse.py#L233) and it works for Super+Letter combinations. However, this does not work with Super+digit or chained modifiers (e.g. Super+Shift+q).
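As a rough sketch of what chained modifiers could produce, the helper below collects any number of leading modifier words and joins them with the final key into one xdotool key spec. `MODIFIERS`, `DIGITS`, and `key_combo` are hypothetical names; the actual fix would belong in the grammar rules in parse.py.

```python
# Hypothetical mapping from spoken modifier words to xdotool modifier names.
MODIFIERS = {"control": "ctrl", "alt": "alt", "shift": "shift", "super": "super"}
DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}


def key_combo(words):
    """Turn e.g. ['super', 'shift', 'q'] into the key spec 'super+shift+q'."""
    mods = []
    rest = list(words)
    while rest and rest[0] in MODIFIERS:
        mods.append(MODIFIERS[rest.pop(0)])
    if len(rest) != 1:
        raise ValueError("expected exactly one key after the modifiers")
    key = DIGITS.get(rest[0], rest[0])  # spoken digits become digit keys
    return "+".join(mods + [key])
```

The returned string is meant to be passed as a single argument to `xdotool key`.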

New speech model introduces [COUGH], and [UH] but won't parse these

Silvius. What a cool project. And so much work to make. Thank you David and team.

Learning Silvius has been a hobby since August 2018. Three months ago I broke two
metacarpals in a bicycle incident and my hobby became a lifeline to productivity.

One thing though. When I upgraded to the new Silvius speech model
[1], my user experience actually crashed.

What I expect:

I run Silvius. I get really good recognition. For example, if I say
'arch tango bravo' I see this:

: LISTENING TO MICROPHONE
: arch tango bravo
: > arch tango bravo
: [arch, tango, bravo, END]
: chain {
: char ['a']
: char ['t']
: char ['b']
: }
: /usr/bin/xdotool key a key t key b

What happens instead. After I upgraded to the new speech model [1],
now this happens.

I run Silvius. I get a stream of words interspersed with [COUGH] and
[UM] and very poor recognition. For example, this is just an attempt
to get 'tango bravo' recognized:

: [COUGH]
: > [COUGH]
: [ANY, END]
: Error: Unexpected token `ANY' (word number 1)
: tango
: > tango
: [tango, END]
: chain {
: char ['t']
: }
: /usr/bin/xdotool key t
: [UH]
: > [UH]
: [ANY, END]
: Error: Unexpected token `ANY' (word number 1)
: [UM]
: > [UM]
: [ANY, END]
: Error: Unexpected token `ANY' (word number 1)

Has anyone else had this experience? Any clue as to what I'm missing?
It seems obvious to me at least that the python grammar is supposed to
ignore [UH] and [COUGH], but mine is trying to parse it as part of the
BNF. What happened in my obviously failed install?

thank you,

James

References

[1] dwk post of 29/11/2018 titled "new silvius speech model"
(https://groups.google.com/forum/#!topic/silvius/1NNmRNLVyC0 accessed
2020-01-23)
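If the new model emits bracketed noise markers as ordinary words, one workaround is to drop them before the transcript reaches the grammar. This is a sketch under that assumption; `drop_noise_tokens` is a hypothetical helper, and the pattern simply matches any all-caps word in square brackets such as [COUGH], [UH], or [UM].

```python
import re

# Matches bracketed, all-caps markers like [COUGH] or [UM].
NOISE = re.compile(r"^\[[A-Z]+\]$")


def drop_noise_tokens(words):
    """Return the transcript without noise markers."""
    return [word for word in words if not NOISE.match(word)]
```

An utterance that consists only of noise markers becomes an empty list, which the caller can skip instead of parsing.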

Saying a number in a sentence throws an error

If I try to say "phrase something something something one" I get this error:

[phrase, ANY, ANY, ANY, one, END]
Error: Unexpected token `one' (word number 5)

This happens with every number I tried:

> phrase five
[phrase, five, END]
Error: Unexpected token `five' (word number 2)

Improving recognition accuracy

Hey, thanks for making this amazing tool! I think it could work well for me but I'm running into some issues with the speech recognition and I'm hoping to get some input on the best resolution.

Some of my commands are recognized first time but most require multiple repeats and some are never recognized regardless of how many times I repeat. I've tried both of the public services and the beta is definitely better but still not usable.

I think the issue could be my English accent or my microphone quality. I'm using a hyperx cloud silver gaming headset that I assumed would have a decent enough mic but maybe not. What do you think?

These are the mic specs:

* Element: Electret condenser microphone
* Polar pattern: Uni-directional, noise-cancelling
* Frequency response: 50Hz-18,000 Hz
* Sensitivity: -39dBV (0dB=1V/Pa,1kHz)

Store custom grammars in separate folder

Allow modifying buffer

Right now, you have to wait for a period of silence before the buffer is parsed. It would be great to be able to force the buffer to be parsed (suggested word: "slurp") or to discard the buffer entirely (suggested word: "spit"). It would also help to pop the last word added to the stack (with "oops" or "scratch").
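The three proposed control words could be sketched as a small buffer class that sits between the recognizer and the parser. `WordBuffer` and its `feed` method are hypothetical; "oops" is used for popping here because "scratch" already means backspace in the grammar.

```python
class WordBuffer:
    """Hypothetical word buffer with slurp/spit/oops control words."""

    def __init__(self):
        self.words = []

    def feed(self, word):
        """Add a word; return the buffered words when 'slurp' forces a parse."""
        if word == "slurp":
            ready, self.words = self.words, []
            return ready  # caller parses these immediately
        if word == "spit":
            self.words = []  # discard everything
            return None
        if word == "oops":
            if self.words:
                self.words.pop()  # drop only the last word
            return None
        self.words.append(word)
        return None
```

For example, feeding "arch", "tango", "oops", "bravo", "slurp" in order would hand `['arch', 'bravo']` to the parser.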

No silvius-backend link

There is mention of a server-client architecture, yet no active reference to the server component.

voxhub.io appears to be offline.

Here is an archived link: https://web.archive.org/web/20190425050145/http://voxhub.io/silvius

From there I found the server component: https://github.com/dwks/silvius-backend

Removing streamed sound bites

Sorry, where are the pieces of audio that mic.py streams from the microphone stored?

Best regards
thanks

Using two for a repeat is broken

For example, scratch two:

> scratch to
[scratch, to, END]
Error: Unexpected token `to' (word number 2)

> down to
[down, to, END]
Error: Unexpected token `to' (word number 2)
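A possible workaround is to rewrite likely homophones into digits when they directly follow a command that takes a repeat count. The sketch below assumes a small, hand-picked set of such commands; `REPEATABLE`, `HOMOPHONES`, and the function name are illustrative, not taken from the grammar.

```python
# Hypothetical sets: commands that accept a repeat count, and digit homophones.
REPEATABLE = {"scratch", "up", "down", "left", "right"}
HOMOPHONES = {"to": "two", "too": "two", "for": "four"}


def fix_repeat_homophones(words):
    """Replace 'to' with 'two' (etc.) only right after a repeatable command."""
    fixed = list(words)
    for i in range(1, len(fixed)):
        if fixed[i - 1] in REPEATABLE and fixed[i] in HOMOPHONES:
            fixed[i] = HOMOPHONES[fixed[i]]
    return fixed
```

Restricting the rewrite to the position after a repeatable command leaves ordinary uses of "to" in dictated text alone.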

Is there any way to reduce/stop this?

(on Mac, Sierra)

thanks!

Modifier + direction causes Silvius client to crash

This makes it hard to navigate i3. Example:

control left 
> control left
[control, left, END]
 chain {
     mod_plus_key ['ctrl'] {
         movement [left]
     }
 }
Traceback (most recent call last):
  File "grammar/main.py", line 31, in <module>
    execute(ast, f == sys.stdin)
  File "/home/shit/bin/voice2code/silvius-crypdick/grammar/execute.py", line 103, in execute
    ExecuteCommands(ast, real)
  File "/home/shit/bin/voice2code/silvius-crypdick/grammar/execute.py", line 12, in __init__
    self.postorder_flat()
  File "/home/shit/bin/voice2code/silvius-crypdick/grammar/execute.py", line 26, in postorder_flat
    func(node)
  File "/home/shit/bin/voice2code/silvius-crypdick/grammar/execute.py", line 32, in n_chain
    self.postorder_flat(n)
  File "/home/shit/bin/voice2code/silvius-crypdick/grammar/execute.py", line 26, in postorder_flat
    func(node)
  File "/home/shit/bin/voice2code/silvius-crypdick/grammar/execute.py", line 38, in n_mod_plus_key
    self.automator.mod_plus_key(node.meta, node.children[0].meta[0])
  File "/home/shit/bin/voice2code/silvius-crypdick/grammar/execute.py", line 98, in mod_plus_key
    if(len(k) > 1 and k != 'plus' and k != 'apostrophe' and k != 'period' and k != 'minus'): k = k.capitalize()
AttributeError: Token instance has no attribute '__len__'
super left 
Exception in thread WebSocketClient:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/shit/.local/lib/python2.7/site-packages/ws4py/websocket.py", line 528, in run
    if not self.once():
  File "/home/shit/.local/lib/python2.7/site-packages/ws4py/websocket.py", line 410, in once
    if not self.process(self.buf[:requested]):
  File "/home/shit/.local/lib/python2.7/site-packages/ws4py/websocket.py", line 480, in process
    self.received_message(s.message)
  File "stream/mic.py", line 113, in received_message
    sys.stdout.flush()
IOError: [Errno 32] Broken pipe
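The AttributeError suggests that `mod_plus_key` received a Token object where it expected a string, so `len(k)` failed; the later Broken pipe is presumably just mic.py writing to the parser process after it had exited. A defensive normalization might look like the sketch below. The `Token` class is a stand-in with an assumed `type` attribute, and `key_name` is a hypothetical helper, so the real fix in execute.py may differ.

```python
class Token:
    """Stand-in for the parser's token class; the real one may differ."""

    def __init__(self, type):
        self.type = type


# Key names that the original condition leaves uncapitalized.
UNCAPITALIZED = {"plus", "apostrophe", "period", "minus"}


def key_name(k):
    """Accept a string or a Token and return the name to send as a key."""
    name = k if isinstance(k, str) else str(k.type)
    if len(name) > 1 and name not in UNCAPITALIZED:
        name = name.capitalize()  # e.g. 'left' -> 'Left'
    return name
```

With this, `key_name(Token("left"))` gives `Left`, so a combination such as control left could be emitted as `ctrl+Left`.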
