Speech To Text Bot Sample

A sample bot that illustrates how to use the Microsoft Cognitive Services Bing Speech API to analyze an audio file and return the text.

Prerequisites

The minimum prerequisites to run this sample are:

  • The latest Node.js with npm, available from the Node.js website.
  • The Bot Framework Emulator. Download and install it, and refer to its documentation article to learn more about the Bot Framework Emulator.
  • A Bing Speech API key. You can obtain one from the Microsoft Cognitive Services Subscriptions Page.
  • [Recommended] Visual Studio Code for IntelliSense and debugging, available as a free download.
  • This sample currently uses a free trial Microsoft Cognitive Services key with limited QPS (queries per second). To try it out further, subscribe to the Bing Speech API service and update the MICROSOFT_SPEECH_API_KEY key in the .env file with your own key.
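
As a rough illustration of the last point, the key lookup could be sketched as below. The helper name getSpeechApiKey is hypothetical and not part of the sample, and the way the .env file gets loaded into the environment (for example by a dotenv-style loader) is an assumption. Only the variable name MICROSOFT_SPEECH_API_KEY comes from the sample itself.

```javascript
// Hypothetical helper (not part of the sample): read the Bing Speech key
// from an environment-like object and fail early if it is missing.
// Assumes the .env file has already been loaded into the environment.
function getSpeechApiKey(env) {
    var key = (env.MICROSOFT_SPEECH_API_KEY || '').trim();
    if (!key) {
        throw new Error('MICROSOFT_SPEECH_API_KEY is not set. Add it to the .env file.');
    }
    return key;
}
```

Calling getSpeechApiKey(process.env) once at startup would surface a missing key immediately, rather than on the first audio message.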

Usage

Attach an audio file (WAV format) to a message and send it to the bot.

Code Highlights

Microsoft Cognitive Services provides a Speech Recognition API to convert audio into text. Check out the Bing Speech API for a complete reference of the Speech APIs available. This sample calls the Speech Recognition API through its REST interface.

The main components are:

  • speech-service.js: the core component, illustrating how to call the Bing Speech REST API.
  • app.js: the bot service listener, which receives messages from the connector service, passes them to speech-service.js, and processes the recognized text.

In this sample we are using the API to get the text and send it back to the user. Check out the use of the speechService.getTextFromAudioStream(stream) method in app.js.

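if (hasAudioAttachment(session)) {
    // Open the attached audio as a stream and send it to the speech service.
    var stream = getAudioStreamFromMessage(session.message);
    speechService.getTextFromAudioStream(stream)
        .then(function (text) {
            // Reply with the processed transcription.
            session.send(processText(text));
        })
        .catch(function (error) {
            // Tell the user something failed and log the details.
            session.send('Oops! Something went wrong. Try again later.');
            console.error(error);
        });
}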
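
The hasAudioAttachment helper is not shown in this excerpt. A minimal sketch of what such a check might look like follows. It assumes that session.message.attachments is an array of objects carrying a contentType string, and that WAV uploads are reported as 'audio/wav' or 'audio/x-wav'. The actual helper in app.js may accept other content types.

```javascript
// Hypothetical sketch of the hasAudioAttachment helper called in app.js.
// Assumption: WAV attachments are reported with one of these content types.
var WAV_CONTENT_TYPES = ['audio/wav', 'audio/x-wav'];

function hasAudioAttachment(session) {
    // Treat a missing attachments array the same as an empty one.
    var attachments = (session.message && session.message.attachments) || [];
    // Only the first attachment is inspected.
    return attachments.length > 0 &&
        WAV_CONTENT_TYPES.indexOf(attachments[0].contentType) !== -1;
}
```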

And here is the implementation of speechService.getTextFromAudioStream(stream) in speech-service.js:

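exports.getTextFromAudioStream = function (stream) {
    return new Promise(
        function (resolve, reject) {
            if (!speechApiAccessToken) {
                // No access token yet: authenticate first, then run recognition.
                try {
                    authenticate(function () {
                        streamToText(stream, resolve, reject);
                    });
                } catch (exception) {
                    // Only errors thrown synchronously by authenticate() reach this handler.
                    reject(exception);
                }
            } else {
                // A token is already available, so go straight to recognition.
                streamToText(stream, resolve, reject);
            }
        }
    );
};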
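
The "authenticate once, then reuse the token" flow above can also be expressed with promises only. The sketch below is an alternative illustration, not the sample's code. createSpeechClient, fetchToken and recognize are hypothetical names, and the two functions passed in stand in for the REST calls that speech-service.js makes. One possible benefit of this shape is that a failure during authentication, whether thrown synchronously or reported asynchronously, ends up as a rejected promise.

```javascript
// Sketch of lazy authentication with a cached token promise.
// fetchToken() and recognize(stream, token) are injected placeholders
// (hypothetical); each may return a value or a promise.
function createSpeechClient(fetchToken, recognize) {
    var tokenPromise = null; // shared so concurrent calls authenticate only once

    function getToken() {
        if (!tokenPromise) {
            tokenPromise = Promise.resolve()
                .then(fetchToken)
                .catch(function (error) {
                    tokenPromise = null; // forget the failure so a later call can retry
                    throw error;
                });
        }
        return tokenPromise;
    }

    return {
        getTextFromAudioStream: function (stream) {
            return getToken().then(function (token) {
                return recognize(stream, token);
            });
        }
    };
}
```

This sketch does not handle token expiry. A real client would also need to refresh the token after its lifetime ends.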

Outcome

You will see the following when you connect the bot to the Emulator and send it an audio file and a command:

Input:

"What's the weather like?"

Output:

[Image: Sample Outcome]

More Information

To learn more about getting started with Bot Builder for Node and the Microsoft Cognitive Services Bing Speech API, please review the following resources:
