asticode / go-astideepspeech

Golang bindings for Mozilla's DeepSpeech speech-to-text library

License: MIT License

Go 57.63% C++ 31.45% C 10.92%

golang go speech-recognition speech-to-text deepspeech

go-astideepspeech's Issues

Supporting v0.1.1 and upcoming releases

It's great that there is a Go binding, but it seems to target v0.1.0. We have made a lot of changes since then: v0.1.1 is available, and the current master branch also has a lot of shiny new features.

Do you have any plan on pushing updates? How can we be of help?

'Metadata' does not name a type

When trying to build the example program with the default models and audio I get the following errors:

# github.com/asticode/go-astideepspeech
deepspeech.cpp:30:13: error: 'Metadata' does not name a type
             Metadata* sttWithMetadata(const short* aBuffer, unsigned int aBufferSize, unsigned int aSampleRate)
             ^~~~~~~~
deepspeech.cpp:60:5: error: 'Metadata' does not name a type
     Metadata* STTWithMetadata(ModelWrapper* w, const short* aBuffer, unsigned int aBufferSize, int aSampleRate)
     ^~~~~~~~
deepspeech.cpp:65:36: error: 'Metadata' was not declared in this scope
     double Metadata_GetProbability(Metadata* m)
                                    ^~~~~~~~
deepspeech.cpp:65:46: error: 'm' was not declared in this scope
     double Metadata_GetProbability(Metadata* m)
                                              ^
deepspeech.cpp:70:30: error: 'Metadata' was not declared in this scope
     int Metadata_GetNumItems(Metadata* m)
                              ^~~~~~~~
deepspeech.cpp:70:40: error: 'm' was not declared in this scope
     int Metadata_GetNumItems(Metadata* m)
                                        ^
deepspeech.cpp:75:5: error: 'MetadataItem' does not name a type
     MetadataItem* Metadata_GetItems(Metadata* m)
     ^~~~~~~~~~~~
deepspeech.cpp:80:37: error: 'MetadataItem' was not declared in this scope
     char* MetadataItem_GetCharacter(MetadataItem* mi)
                                     ^~~~~~~~~~~~
deepspeech.cpp:80:51: error: 'mi' was not declared in this scope
     char* MetadataItem_GetCharacter(MetadataItem* mi)
                                                   ^~
deepspeech.cpp:85:34: error: 'MetadataItem' was not declared in this scope
     int MetadataItem_GetTimestep(MetadataItem* mi)
                                  ^~~~~~~~~~~~
deepspeech.cpp:85:48: error: 'mi' was not declared in this scope
     int MetadataItem_GetTimestep(MetadataItem* mi)
                                                ^~
deepspeech.cpp:90:37: error: 'MetadataItem' was not declared in this scope
     float MetadataItem_GetStartTime(MetadataItem* mi)
                                     ^~~~~~~~~~~~
deepspeech.cpp:90:51: error: 'mi' was not declared in this scope
     float MetadataItem_GetStartTime(MetadataItem* mi)
                                                   ^~
deepspeech.cpp:125:13: error: 'Metadata' does not name a type
             Metadata* finishStreamWithMetadata()
             ^~~~~~~~
deepspeech.cpp:160:5: error: 'Metadata' does not name a type
     Metadata* FinishStreamWithMetadata(StreamWrapper* sw)
     ^~~~~~~~
deepspeech.cpp: In function 'void FreeString(char*)':
deepspeech.cpp:167:9: error: 'DS_FreeString' was not declared in this scope
         DS_FreeString(s);
         ^~~~~~~~~~~~~
deepspeech.cpp:167:9: note: suggested alternative: 'FreeString'
         DS_FreeString(s);
         ^~~~~~~~~~~~~
         FreeString
deepspeech.cpp: At global scope:
deepspeech.cpp:170:23: error: variable or field 'FreeMetadata' declared void
     void FreeMetadata(Metadata* m)
                       ^~~~~~~~
deepspeech.cpp:170:23: error: 'Metadata' was not declared in this scope
deepspeech.cpp:170:33: error: 'm' was not declared in this scope
     void FreeMetadata(Metadata* m)

It appears that the 0.4.0 version of the DeepSpeech native header that's linked to in the readme does not define a type called Metadata, whereas the latest version of the header does.

I went through the commit history but can't find where the Metadata type was removed for 0.4.0. Did you perhaps link to the wrong version of the DeepSpeech native client in the readme?

How to translate realtime PCM data to text?

Hi, I want to translate PCM data to text in real time. The PCM data is decoded by ffmpeg from a live stream, but I can't get a correct result. Can you help fix it? Here is the code: it feeds the PCM data continuously and prints the intermediate transcription every 5 seconds:

var stream *astideepspeech.Stream

func detectVoice(sample []byte) {
	// Lazily create the model and the stream on the first call.
	if stream == nil {
		m, err := astideepspeech.New(model)
		if err != nil {
			fmt.Printf("Failed creating model: %v\n", err)
			return
		}
		if err := m.SetBeamWidth(beamWidth); err != nil {
			fmt.Printf("Failed setting beam width: %v\n", err)
			return
		}
		if err := m.EnableExternalScorer(scorer); err != nil {
			fmt.Printf("Failed enabling external scorer: %v\n", err)
			return
		}
		if err := m.SetScorerAlphaBeta(alpha, beta); err != nil {
			fmt.Printf("Failed setting scorer hyperparameters: %v\n", err)
			return
		}
		stream, err = m.NewStream()
		if err != nil {
			fmt.Printf("Failed creating stream: %v\n", err)
			return
		}
	}
	// Each input byte becomes one int16 sample here.
	var d []int16
	for _, v := range sample {
		d = append(d, int16(v))
	}
	stream.FeedAudioContent(d)
}

func init() {
	fmt.Println("get stt result in 5 seconds..........")
	go func() {
		var ch chan int
		ticker := time.NewTicker(time.Second * 5)
		go func() {
			// Print the intermediate transcription on every tick.
			for range ticker.C {
				if stream != nil {
					result, err := stream.IntermediateDecode()
					if err != nil {
						fmt.Printf("Failed converting speech to text: %v\n", err)
						return
					}
					fmt.Println("result: ", result)
				}
			}
			ch <- 1
		}()
		<-ch
	}()
}
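
One thing worth checking in the snippet above is the conversion loop, which turns every byte into its own int16 sample. If ffmpeg emits signed 16-bit little-endian PCM (for example with `-f s16le -ac 1 -ar 16000`, which is an assumption about the poster's setup and not stated in the issue), each sample spans two bytes and the pairs have to be combined. A minimal sketch of that conversion, where `bytesToSamples` is a hypothetical helper name and not part of the binding:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// bytesToSamples converts raw signed 16-bit little-endian PCM bytes into
// int16 samples. A trailing odd byte, if any, is ignored.
// Assumption: the input really is s16le; other formats need other decoding.
func bytesToSamples(pcm []byte) []int16 {
	samples := make([]int16, len(pcm)/2)
	for i := range samples {
		// Combine two consecutive bytes into one sample, low byte first.
		samples[i] = int16(binary.LittleEndian.Uint16(pcm[2*i:]))
	}
	return samples
}

func main() {
	pcm := []byte{0x01, 0x00, 0xff, 0xff, 0x00, 0x80}
	fmt.Println(bytesToSamples(pcm)) // prints [1 -1 -32768]
}
```

The resulting slice could then be passed to stream.FeedAudioContent in place of the byte-by-byte copy. If chunks from ffmpeg can end on an odd byte boundary, the leftover byte would need to be carried over to the next call, which this sketch does not do.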

Lib is already compatible with libdeepspeech.so v0.9

Hello, I am running this lib against libdeepspeech.so v0.9 with no problems so far :)
Feel free to close this; I just wanted to share it because the readme says 0.8.
Thanks for this lib, it is amazing!

[Question] Does this Go binding use the GPU for the DeepSpeech engine?

Mozilla's DeepSpeech has two installation methods, one of which is pip3 install deepspeech-gpu, which uses the GPU for the transcription engine. Does this Go binding offer the same by default, or is there a special way to enable it?

Error in "go get -u github.com/asticode/go-astideepspeech/..."

I followed your README.md and got the following error when installing astideepspeech with the native client in the /tmp/deepspeech directory. Please assist. Many thanks.

$ go get -u github.com/asticode/go-astideepspeech/...
go: finding github.com/cryptix/wav latest
go: finding github.com/cheekybits/is latest
github.com/asticode/go-astideepspeech
ld: library not found for -ldeepspeech
clang: error: linker command failed with exit code 1 (use -v to see invocation)
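
The message "library not found for -ldeepspeech" means the linker did not find libdeepspeech in any directory it searched. With cgo, extra search directories are usually passed through the CGO_LDFLAGS environment variable as -L flags, and the linker also consults LIBRARY_PATH. The following is a rough diagnostic sketch, not part of the binding: `libDirs` and `findLib` are hypothetical helpers that list those directories and look for a libdeepspeech file in them.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// libDirs extracts the directories named by -L flags in a linker flag
// string. It splits on whitespace, so paths containing spaces are not
// handled.
func libDirs(ldflags string) []string {
	var dirs []string
	fields := strings.Fields(ldflags)
	for i := 0; i < len(fields); i++ {
		f := fields[i]
		switch {
		case f == "-L" && i+1 < len(fields):
			// Form "-L dir": the directory is the next field.
			i++
			dirs = append(dirs, fields[i])
		case strings.HasPrefix(f, "-L") && len(f) > 2:
			// Form "-Ldir".
			dirs = append(dirs, f[2:])
		}
	}
	return dirs
}

// findLib returns the first file named lib<name>.* found in dirs.
func findLib(dirs []string, name string) (string, bool) {
	for _, d := range dirs {
		matches, _ := filepath.Glob(filepath.Join(d, "lib"+name+".*"))
		if len(matches) > 0 {
			return matches[0], true
		}
	}
	return "", false
}

func main() {
	dirs := libDirs(os.Getenv("CGO_LDFLAGS"))
	dirs = append(dirs, filepath.SplitList(os.Getenv("LIBRARY_PATH"))...)
	if path, ok := findLib(dirs, "deepspeech"); ok {
		fmt.Println("found:", path)
	} else {
		fmt.Println("libdeepspeech not found in", dirs)
	}
}
```

If the program reports that nothing was found, a likely cause is that CGO_LDFLAGS was not exported in the shell that ran go get, or that it points at a directory that does not contain the extracted native client.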

Support for 0.8

We have released 0.8; adding support for it in this binding might be a good thing. I might be able to send a PR.

Update with new API

We recently merged a PR that exposes new information, via new API calls: mozilla/DeepSpeech@a009361#diff-0317a0e76ece10e0dba742af310a2362

This allows access to timing information. We have not yet updated our own bindings to expose this; I can likely take care of that here as well.
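
To illustrate what per-character timing makes possible, here is a small sketch that groups characters into words and records when each word starts. The `Item` and `Word` types are hypothetical stand-ins loosely modeled on the character and start-time accessors that appear in the compiler log of the 'Metadata' issue above; they are not the binding's actual API.

```go
package main

import "fmt"

// Item is a hypothetical stand-in for one decoded character together with
// the time, in seconds, at which it starts in the audio.
type Item struct {
	Character string
	StartTime float32
}

// Word is a run of non-space characters and the start time of its first
// character.
type Word struct {
	Text  string
	Start float32
}

// words groups consecutive non-space items into words, splitting on items
// whose character is a single space.
func words(items []Item) []Word {
	var out []Word
	var cur Word
	for _, it := range items {
		if it.Character == " " {
			if cur.Text != "" {
				out = append(out, cur)
				cur = Word{}
			}
			continue
		}
		if cur.Text == "" {
			// First character of a new word: remember when it starts.
			cur.Start = it.StartTime
		}
		cur.Text += it.Character
	}
	if cur.Text != "" {
		out = append(out, cur)
	}
	return out
}

func main() {
	items := []Item{{"h", 0.5}, {"i", 0.75}, {" ", 1.0}, {"y", 1.25}, {"o", 1.5}}
	for _, w := range words(items) {
		fmt.Printf("%s starts at %.2fs\n", w.Text, w.Start)
	}
}
```

Word-level start times like these are what subtitle generation or audio alignment would typically consume.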

Updating bindings?

Hello,

We will have 0.4.0 soon. Do you still maintain these bindings? We have had a streaming API for some time now, and it's quite useful.

Update to new v0.6 API

We made some (breaking) changes to the API; this needs to be reflected here.

v0.7 available

Hello @asticode we have a newer version available, with some API changes :)

Upcoming 1.0 and renaming

Hello,

1.0 is close, and part of the work for that involves one painful change: we need to rename the project to Mozilla Voice STT.
It also means the library itself and the API need to be renamed: libdeepspeech.so -> libmozilla_voice_stt.so, and the API prefix DS_* becomes STT_*.

Besides this renaming, there should be no other change. I can help and prepare a PR to update once we have completed the renaming on our side (CI, packages everywhere).

For clarity, it might be good if you could rename your binding as well, but we do not want to force you.

Change LM parameters

To be in sync with upstream mozilla/DeepSpeech@fa7cb1a
