This project forked from noardcode/laravel-google-speech-to-text

Laravel Google Speech to Text

This Laravel package provides a convenient interface for the Google Speech to Text API.

Prerequisites

  • The gRPC extension is required when enabling the word time offsets option
    • Step 1: Run pecl install grpc
    • Step 2: Add extension=grpc.so to php.ini
      • grpc.dll on Windows
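
After editing php.ini, it can help to confirm that the extension is actually loaded in the runtime that serves the application, since the CLI and the web server may read different php.ini files. A minimal sketch using PHP's built-in extension_loaded() and php_ini_loaded_file(); grpcIsAvailable is a hypothetical helper name and is not part of this package:

```php
<?php

// Hypothetical helper (not part of this package): reports whether the
// gRPC extension is loaded in the current PHP runtime.
function grpcIsAvailable(): bool
{
    return extension_loaded('grpc');
}

if (!grpcIsAvailable()) {
    // php_ini_loaded_file() returns the path of the loaded php.ini, or false.
    $ini = php_ini_loaded_file() ?: '(no php.ini loaded)';
    error_log("gRPC extension not loaded; check {$ini}");
}
```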

Installation

composer require noardcode/speech-to-text

Getting started

  • Open the Google Cloud Console and add the Cloud Speech-to-Text API under APIs & Services in your project.
  • Create a Google service account with the following role: Cloud Speech Service Agent
    • Make sure to generate a service account key; this file will be used for authentication.
  • Run php artisan vendor:publish --provider="Noardcode\SpeechToText\SpeechToTextServiceProvider"
    • This will create a speech-to-text.php file in your config folder.
  • In speech-to-text.php, change the following:
/*
|--------------------------------------------------------------------------
| Google Service Account
|--------------------------------------------------------------------------
*/
'service-account' => '/path/to/service-account.json',

For detailed documentation about service accounts, see: https://cloud.google.com/docs/authentication/production
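
A wrong path or a key of the wrong type is an easy mistake at this step, so a quick sanity check of the key file may save some debugging. The sketch below is an illustration only: checkServiceAccountKey is a hypothetical helper, not part of this package, and it only checks a few fields that Google service account key files normally contain (type, project_id, private_key, client_email).

```php
<?php

// Hypothetical helper (not part of this package): returns a list of problems
// found in a service account key file; an empty list means it looks usable.
function checkServiceAccountKey(string $path): array
{
    if (!is_file($path) || !is_readable($path)) {
        return ["File is missing or not readable: {$path}"];
    }

    $data = json_decode((string) file_get_contents($path), true);
    if (!is_array($data)) {
        return ['File does not contain valid JSON'];
    }

    $problems = [];
    if (($data['type'] ?? null) !== 'service_account') {
        $problems[] = 'Expected "type" to be "service_account"';
    }
    foreach (['project_id', 'private_key', 'client_email'] as $field) {
        if (empty($data[$field])) {
            $problems[] = "Missing field: {$field}";
        }
    }

    return $problems;
}
```

Inside a Laravel application, the path could presumably be read with config('speech-to-text.service-account'), assuming the published config file keeps the key name shown above.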

Basic examples

// Run on Google Cloud Storage object
resolve(SpeechToText::class)->run('gs://your-bucket-name/path-to-object');

// Run on a locally stored audio file (must be less than 10 MB in size and less than 1 minute in length)
resolve(SpeechToText::class)
    ->setAudio(new FilesystemAudio)
    ->run('/path/to/audio-file');

// Using different types of transcripts (e.g. include word time offsets (startTime and endTime))
resolve(SpeechToText::class)
    ->setTranscript(new WordTimeOffsets)
    ->run('gs://your-bucket-name/path-to-object');

Settings

You can change the default settings by publishing the config file and changing the following values.

/*
|--------------------------------------------------------------------------
| Default parameters injected by the Service Provider
|--------------------------------------------------------------------------
*/
'defaults' => [
    'language' => 'en-US',
    'encoding' => \Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding::LINEAR16,
    'sampleRateHertz' => 44100
]

Or change the settings when you have an instance of the class.

$speechToText = resolve(SpeechToText::class)
    ->setLanguageCode('en-US')
    ->setEncoding(\Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding::LINEAR16)
    ->setSampleRateHertz(44100);

Audio types

By default, the SpeechToText class is passed a GoogleCloudStorageAudio instance. This class tells the SpeechToText class how to create the RecognitionAudio object from the Google Speech to Text package. If you want to create the RecognitionAudio in a different way, e.g. from a file on your local filesystem, you will need to set another Audio class that implements the AudioInterface.

// Run on an audio file on the local filesystem
resolve(SpeechToText::class)
    ->setAudio(new FilesystemAudio)
    ->run('/path/to/audio-file');

Side note: Google only supports sending inline files that are less than 10 MB in size and less than 1 minute in length.
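
To avoid a failed request, the limits can be checked before the file is sent. The sketch below is a rough pre-check under stated assumptions: both helper names are hypothetical and not part of this package, "10 MB" is taken to mean 10 * 1024 * 1024 bytes, and the duration estimate only holds for headerless LINEAR16 audio (2 bytes per sample per channel). For other encodings the duration would need to be determined differently.

```php
<?php

// Limits quoted in the side note above (the byte value is an assumption).
const MAX_INLINE_BYTES = 10 * 1024 * 1024; // 10 MB
const MAX_INLINE_SECONDS = 60.0;           // 1 minute

// Hypothetical helper: estimated duration of headerless LINEAR16 audio.
function estimateLinear16Seconds(int $bytes, int $sampleRateHertz, int $channels = 1): float
{
    // LINEAR16 stores each sample as 2 bytes per channel.
    return $bytes / ($sampleRateHertz * $channels * 2);
}

// Hypothetical helper: true when the file appears small and short enough
// to be sent inline.
function fitsInlineLimits(string $path, int $sampleRateHertz = 44100, int $channels = 1): bool
{
    $bytes = is_file($path) ? filesize($path) : false;
    if ($bytes === false) {
        return false;
    }

    return $bytes < MAX_INLINE_BYTES
        && estimateLinear16Seconds($bytes, $sampleRateHertz, $channels) < MAX_INLINE_SECONDS;
}
```

With the default 44100 Hz mono setting, one minute of LINEAR16 audio is 44100 * 2 * 60 = 5,292,000 bytes, so for such files the duration limit is reached before the size limit.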

Transcripts

By default, the SpeechToText class is passed a BasicTranscript instance. This class tells the SpeechToText class how to handle the response from the SpeechClient class from the Google Speech to Text package. If you want to handle the response from the SpeechClient in a different way, e.g. including the word time offsets, you will need to set another Transcript class that implements the TranscriptInterface.

// Using different types of transcripts (e.g. include word time offsets (startTime and endTime))
resolve(SpeechToText::class)->setTranscript(new WordTimeOffsets())
    ->run('gs://your-bucket-name/path-to-object');

Example output of WordTimeOffsets transcript

array:2 [
  'transcript' => array:10 [
      0 => array:3 [
        "transcript" => "hello world"
        "confidence" => 0.96761703491211
        "words" => array:9 [
          0 => array:3 [
            "word" => "hello"
            "startTime" => 0
            "endTime" => 0.3
          ]
          1 => array:3 [
            "word" => "world"
            "startTime" => 0.3
            "endTime" => 0.5
          ]
          ...
        ]
      ]
      1 => array:3 [
        "transcript" => "foo bar buz"
        "confidence" => 0.74065810441971
        "words" => array:7 [
            ...
        ]
      ]
  ]
  'words' => array:45 [
      0 => array:3 [
         "word" => "hello"
         "startTime" => 0
         "endTime" => 0.3
      ]
      1 => array:3 [
          "word" => "world"
          "startTime" => 0.3
          "endTime" => 0.5
      ]
      ...
  ]
]
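
As an illustration of how the flat 'words' list might be consumed, the sketch below selects the words spoken within a given time window. It assumes the array shape shown in the example output above (word, startTime and endTime in seconds); wordsBetween is a hypothetical helper and not part of this package.

```php
<?php

// Hypothetical helper: returns the words whose time span overlaps the
// window between $from and $to (both in seconds).
function wordsBetween(array $words, float $from, float $to): array
{
    return array_values(array_filter(
        $words,
        fn (array $w): bool => $w['endTime'] > $from && $w['startTime'] < $to
    ));
}

// Sample data shaped like the 'words' list in the example output.
$words = [
    ['word' => 'hello', 'startTime' => 0, 'endTime' => 0.3],
    ['word' => 'world', 'startTime' => 0.3, 'endTime' => 0.5],
];

$hits = wordsBetween($words, 0.35, 1.0);
echo implode(' ', array_column($hits, 'word')), PHP_EOL; // prints "world"
```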

Changelog

Please see CHANGELOG for more information on what has changed recently.

Contributing

Contributions are welcome and will be fully credited. We accept contributions via Pull Requests on GitHub.

Pull Requests

  • PSR-2 Coding Standard - The easiest way to apply the conventions is to install PHP Code Sniffer.
  • Document any change in behaviour - Make sure the README.md and any other relevant documentation are kept up-to-date.
  • Create feature branches - Don't ask us to pull from your master branch.
  • One pull request per feature - If you want to do more than one thing, send multiple pull requests.

License

The MIT License (MIT). Please see License File for more information.

Contributors

  • royvoetman