
Comments (8)

endoplasmic commented on September 26, 2024

Solid find!

I went digging deep, but ended up just calling the end() method whenever we need to change the volume.

from google-assistant.

ppisljar commented on September 26, 2024

I am sure there are more cases like this, so detecting Google's response would be the right way to go.
Imagine something like "Google, turn on the light" (the light comes on, but there is no voice response).


endoplasmic commented on September 26, 2024

Does that happen for you? Whenever I do it, she always replies with something like "Okay, turning on xxxxx light" or "Turning on 2 lights", something like that.

Yah, I agree with you though. I need to find out whether the Python version handles it differently, but I have to admit that my Python skills are pretty rough.


ppisljar commented on September 26, 2024

One (not the best) way would be a timeout: if the first audio packet is not received within 100 ms of the end of the utterance, we end the conversation.


endoplasmic commented on September 26, 2024

If you wanted to implement something like that, you could do it on your server.

Start a timeout once you get the transcription event, and have it call conversation.end() when it fires. Call clearTimeout when you get the audio-data event.

I'm not a huge fan of implementing a timeout, since we can't predict network delays, but at least the events that are fired let you implement it yourself.


ppisljar commented on September 26, 2024

I am looking at the way it's done in the Python sample app (where it works perfectly for me): https://github.com/googlesamples/assistant-sdk-python/blob/master/google-assistant-sdk/googlesamples/assistant/grpc/pushtotalk.py#L109

There is no extra condition anywhere. It just loops over the response packets and then ends the conversation, so it seems there is a clear notification from the server when the last packet has been sent.

Looking at the comment in https://github.com/googleapis/googleapis/blob/master/google/assistant/embedded/v1alpha1/embedded_assistant.proto#L255:

// This event indicates that the server has detected the end of the user's
// speech utterance and expects no additional speech. Therefore, the server
// will not process additional audio (although it may subsequently return
// additional results). The client should stop sending additional audio
// data, half-close the gRPC connection, and wait for any additional results
// until the server closes the gRPC connection.

This also seems to suggest that the stream will be closed after the last packet is sent?


endoplasmic commented on September 26, 2024

Yah, I get a bit lost in the Python stuff, sadly. I could use a hand getting through that.

As for your second link, that event only tells you that the user has finished speaking and that you should stop sending audio to the server. Once you get that message, you'll get the audio (if there is any) right after it.


ppisljar commented on September 26, 2024

Yeah, the function does that, but the comment seems to suggest that the server will close the gRPC connection after it's done sending data?

..., and wait for any additional results until the server closes the gRPC connection.

The Python library (https://developers.google.com/assistant/sdk/reference/library/python/) seems to provide more events, one of them being:

ON_NO_RESPONSE = 8
The Assistant successfully completed its turn but has nothing to say.

But the same event doesn't seem to be available in the gRPC API ...
