revdotcom / revai-node-sdk

Node.js SDK for the Rev AI API

License: MIT License

TypeScript 76.96% JavaScript 22.98% Dockerfile 0.06%
speech-recognition speech-to-text realtime sdk rev captions nodejs revai

revai-node-sdk's Introduction

Documentation

See the API docs for more information about the Rev AI API.

Examples

Examples can be found in the examples/ directory.

Installation

To install the package, run:

npm install revai-node-sdk

Support

We support Node 8, 10, 12, 14, 16 and 17.

Usage

All you need to get started is your Access Token, which can be generated on your Settings Page. Create a client with the given Access Token:

import { RevAiApiClient, RevAiApiDeployment, RevAiApiDeploymentConfigMap } from 'revai-node-sdk';

// Initialize your client with your Rev AI access token
const accessToken = "<ACCESS_TOKEN>";

// Optionally set the specific Rev AI deployment of your account, defaults to the US deployment.
// Learn more about Rev AI's global deployments at https://docs.rev.ai/api/global-deployments.
const client = new RevAiApiClient({ token: accessToken, deploymentConfig: RevAiApiDeploymentConfigMap.get(RevAiApiDeployment.US) });

Checking credits remaining

const accountInfo = await client.getAccount();
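accountInfo contains the fields documented for the Get Account endpoint, including your remaining balance in seconds. As a small illustration, the balance can be formatted for display (the helper below is not part of the SDK; the balance_seconds field name is taken from the API docs):

```javascript
// Illustrative helper: format a balance in seconds as hours and minutes.
function formatBalance(balanceSeconds) {
    const hours = Math.floor(balanceSeconds / 3600);
    const minutes = Math.floor((balanceSeconds % 3600) / 60);
    return `${hours}h ${minutes}m remaining`;
}

// e.g. console.log(`${accountInfo.email}: ${formatBalance(accountInfo.balance_seconds)}`);
```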

Submitting a job

Once you've set up your client with your Access Token, sending a file is easy!

// You can submit a local file
const job = await client.submitJobLocalFile("./path/to/file.mp4");

// or submit via a public url
const jobOptions = { source_config: { url: "https://www.rev.ai/FTC_Sample_1.mp3" } }
const job = await client.submitJob(jobOptions);

// or from audio data, the filename is optional
const stream = fs.createReadStream("./path/to/file.mp3");
const job = await client.submitJobAudioData(stream, "file.mp3");

You can request a transcript summary.

const job = await client.submitJobLocalFile("./path/to/file.mp4", {
    language: "en",
    summarization_config: {
        type: 'bullets'
    }
});

You can request transcript translation into up to five languages.

const job = await client.submitJobLocalFile("./path/to/file.mp4", {
    language: "en",
    translation_config: {
        target_languages: [{
            language: 'es',
            model: 'premium'
        }]
    }
});

You can also submit a job to be handled by a human transcriber using our Human Transcription option.

const job = await client.submitJobLocalFile("./path/to/file.mp4", {
    transcriber: "human",
    verbatim: false,
    rush: false,
    test_mode: true,
    segments_to_transcribe: [{
        start: 1.0,
        end: 2.4
    }],
    speaker_names: [{
        display_name: "Augusta Ada Lovelace"
    },{
        display_name: "Alan Mathison Turing"
    }]
});

job will contain all the information normally found in a successful response from our Submit Job endpoint.

If you want to get fancy, all of the job submission methods can take a RevAiJobOptions object containing optional parameters. These are described in the request body of the Submit Job endpoint.
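For instance, a submission with several optional parameters might look like the following. The field names follow the Submit Job request body; the values are purely illustrative:

```javascript
// Illustrative options object; see the Submit Job endpoint for the full list of fields.
const jobOptions = {
    metadata: 'sales call 2021-06-01',       // free-form string echoed back in job details
    skip_diarization: false,                 // set true to skip speaker diarization
    delete_after_seconds: 30 * 24 * 60 * 60, // auto-delete the job after 30 days
    notification_config: { url: 'https://example.com/callback' }
};

// const job = await client.submitJobLocalFile('./path/to/file.mp4', jobOptions);
```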

Submitting URLs with authorization headers

Both the source_config and notification_config job options support using a customer-provided authorization header to access the URLs. This optional argument should be in the format { "Authorization": "TokenScheme TokenValue" }.

Example:

var notificationConfig = { url: 'https://example.com', auth_headers: { "Authorization": "Bearer <token>" } };

For more information see https://github.com/revdotcom/revai-node-sdk/blob/develop/examples/async_transcribe_media_from_url.js

Checking your job's status

You can check the status of your transcription job using its id.

const jobDetails = await client.getJobDetails(job.id);

jobDetails will contain all information normally found in a successful response from our Get Job endpoint.
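Since transcription is asynchronous, a common pattern is to poll until the job leaves the in_progress state. The helper below is a sketch, not part of the SDK; the status names follow the Get Job endpoint:

```javascript
// Poll job details until the job finishes (status 'transcribed' or 'failed').
async function waitForJob(client, jobId, intervalMs = 5000) {
    for (;;) {
        const details = await client.getJobDetails(jobId);
        if (details.status !== 'in_progress') {
            return details; // 'transcribed' on success, 'failed' on error
        }
        await new Promise(resolve => setTimeout(resolve, intervalMs));
    }
}
```

Usage: `const finished = await waitForJob(client, job.id);`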

Checking multiple files

You can retrieve a list of transcription jobs with optional parameters.

const jobs = await client.getListOfJobs();

// limit amount of retrieved jobs
const jobs = await client.getListOfJobs(3);

// get jobs starting after a certain job id
const jobs = await client.getListOfJobs(undefined, 'Umx5c6F7pH7r');

jobs will contain a list of job details having all information normally found in a successful response from our Get List of Jobs endpoint.

Deleting a job

You can delete a transcription job using its id.

await client.deleteJob(job.id);

All data related to the job, such as input media and transcript, will be permanently deleted. A job can only be deleted once it's completed (either with success or failure).
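Because deletion is only valid for finished jobs, a caller may want to guard on status first. This is an illustrative helper, not SDK functionality; the status names follow the Get Job endpoint:

```javascript
// Delete a job only if it has finished (successfully or not).
async function deleteIfCompleted(client, jobId) {
    const details = await client.getJobDetails(jobId);
    if (details.status === 'transcribed' || details.status === 'failed') {
        await client.deleteJob(jobId);
        return true;
    }
    return false; // still in progress; cannot be deleted yet
}
```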

Getting your transcript

Once your file is transcribed, you can get your transcript in a few different forms:

// as plain text
const transcriptText = await client.getTranscriptText(job.id);

// or as an object
const transcriptObject = await client.getTranscriptObject(job.id);

// or if you requested transcript translation(s)
const translatedTranscriptText = await client.getTranslatedTranscriptText(job.id, 'es');

The text output is a string containing just the text of your transcript. The object form of the transcript contains all the information outlined in the response of the Get Transcript endpoint when using the json response schema.

Any of these outputs can be retrieved as a stream for easy file writing:

const textStream = await client.getTranscriptTextStream(job.id);
const transcriptStream = await client.getTranscriptObjectStream(job.id);

Getting captions output

Another way to retrieve your transcript is as captions. We support both .srt and .vtt outputs. See below for an example showing how you can get captions as a readable stream. If your job was submitted with multiple speaker channels, you are required to provide the id of the channel you would like captioned.

const captionsStream = await client.getCaptions(job.id, CaptionType.SRT);

// or if you requested transcript translation(s)
const translatedCaptionsStream = await client.getTranslatedCaptions(job.id, 'es');

// with speaker channels
const channelId = 1;
const captionsStream = await client.getCaptions(job.id, CaptionType.VTT, channelId);

Getting transcript summary

If you requested a transcript summary, you can retrieve it as plain text or as a structured object:

// as text
const transcriptSummaryText = await client.getTranscriptSummaryText(job.id);

// as object
const transcriptSummaryJson = await client.getTranscriptSummaryObject(job.id);

Streaming Audio

In order to stream audio, you will need to set up a streaming client and a media configuration for the audio you will be sending.

import { RevAiStreamingClient, AudioConfig, RevAiApiDeployment, RevAiApiDeploymentConfigMap } from 'revai-node-sdk';

// Initialize audio configuration for the streaming client
const audioConfig = new AudioConfig();

// Optionally set the specific Rev AI deployment of your account; defaults to the US deployment.
// Learn more about Rev AI's global deployments at https://docs.rev.ai/api/global-deployments.
const streamingClient = new RevAiStreamingClient({ token: "<ACCESS_TOKEN>", deploymentConfig: RevAiApiDeploymentConfigMap.get(RevAiApiDeployment.US) }, audioConfig);

You can set up event responses for your client's streaming sessions. This allows you to handle events such as the connection closing, failing, or successfully connecting! Look at the examples for more details.

streamingClient.on('close', (code, reason) => {
    console.log(`Connection closed, ${code}: ${reason}`);
});

streamingClient.on('connect', connectionMessage => {
    console.log(`Connected with job id: ${connectionMessage.id}`);
})

Now you can start the streaming session by simply calling the streamingClient.start() method! You can also pass an optional SessionConfig object to provide additional information for that session, such as metadata or a custom vocabulary's ID.

const sessionConfig = new SessionConfig('my metadata' /* metadata */, 'myCustomVocabularyID' /* customVocabularyID */);

const stream = streamingClient.start(sessionConfig);

You can then pipe audio data into this stream from a local file or any other source of your choosing. The session ends when the input data stream ends, or earlier if you call streamingClient.end(). For more details, take a look at our examples.
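The responses emitted on the stream are hypothesis objects. As an illustration, a small helper (not part of the SDK) can flatten a final hypothesis into plain text, assuming the documented element shape where each entry has a `type` ('text' or 'punct') and a `value` string:

```javascript
// Illustrative helper: join the elements of a 'final' streaming hypothesis
// into plain text. Partial hypotheses are still changing, so they are skipped.
function finalHypothesisToText(hypothesis) {
    if (hypothesis.type !== 'final') {
        return null;
    }
    return hypothesis.elements.map(element => element.value).join('');
}
```

With a handler like `stream.on('data', h => { const text = finalHypothesisToText(h); if (text) console.log(text); })` you would print only finalized text.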

Submitting custom vocabularies

You can now submit any custom vocabularies independently through the new CustomVocabularies client! The main benefit is that users of the SDK can now submit their custom vocabularies for preprocessing and then include these processed custom vocabularies in their streaming jobs.

Below you can see an example of how to create, submit and check on the status and other associated information of your submitted custom vocabulary!

For more information, check out our examples.

import { RevAiCustomVocabulariesClient } from 'revai-node-sdk';

// Initialize your client with your Rev AI access token
const accessToken = "<ACCESS_TOKEN>";
const client = new RevAiCustomVocabulariesClient(accessToken);

// Construct custom vocabularies object and submit it through the client
const customVocabularies = [{phrases: ["Noam Chomsky", "Robert Berwick", "Patrick Winston"]}];
const customVocabularySubmission = await client.submitCustomVocabularies(customVocabularies);

// Get information regarding the custom vocabulary submission and its progress
const customVocabularyInformation = await client.getCustomVocabularyInformation(customVocabularySubmission.id);

// Get a list of information on previously submitted custom vocabularies
const customVocabularyInformations = await client.getListOfCustomVocabularyInformations();

// Delete a custom vocabulary
await client.deleteCustomVocabulary(customVocabularySubmission.id);

For Rev AI Node SDK Developers

After cloning and installing required npm modules, you should follow these practices when developing:

  1. Use the scripts defined in package.json in this manner: npm run [command_name]:
    1. lint checks that you are not violating any code style standards. This keeps our code quality high, improving readability and reducing room for errors.
    2. build transpiles the TypeScript into JavaScript with the options specified in tsconfig.json.
    3. unit-test runs our unit tests, which live in the unit test directory.
    • Note that integration-test is currently configured to work with a specific account in our continuous integration build environment; for now, rely on the automated continuous integration checks to verify that the integration tests pass.
    4. build-examples performs the same action as build and, in addition, copies the src to the node_modules directory in examples so that you can test examples with local changes.
  2. Add any relevant test logic if you add or modify any features in the source code, and check that the tests pass using the scripts mentioned above.
  3. Update the examples provided to illustrate any relevant changes you made, and check that they work properly with your changed local revai-node-sdk.
    • One way to use your changed local package in the examples is to copy the output of the build script into examples/node_modules/revai-node-sdk. On Unix, this can be done from the root directory with: $ cp -r dist/src examples/node_modules/revai-node-sdk/.
  4. Update the documentation to reflect any relevant changes and improve the development section.

revai-node-sdk's People

Contributors

aaron-wilson-rev, amikofalvy, beaudrychase, bwagner, celineqiu, chanc3, dependabot[bot], eugenep-rev, github-actions[bot], hbkse, jadesym, jennywong2129, k-weng, kbridbur, kirillatrev, kostasrev, kshiflett88, lrgottlieb, menioa, pjhuck, seanlam8, timitijani, ymardini


revai-node-sdk's Issues

File size limited to 10mb

This project uses axios, which limits the body size to 10mb by default (via maxBodyLength). See https://axios-http.com/docs/req_config.

This limit should probably be increased to 2 GB to match the API, or at least be made configurable. Currently, submitJobLocalFile will throw an error for files larger than 10 MB.

Add changelog or release notes

Hello,

We're using this library for our Rev AI integration and see that updates are being made, but there is no public documentation to understand what breaking changes or new features are being added. Is it possible to add support for this?

Thanks

use in chrome extension

Is it possible to use the streaming API in a Chrome extension?
When I tried, I first had an issue with the 'fs' module being unavailable, but it doesn't seem necessary for my use case.
After removing the 'fs' references, the code fails on this call:

        client = new revai.RevAiStreamingClient(token, audioConfig)

Support EU based rev.ai accounts

Issue

EU based accounts require a different base URL (source).

Yet, the SDK hardcodes the US based URL:

this.apiHandler = new ApiRequestHandler(`https://api.rev.ai/speechtotext/${version}/`, accessToken);

Workaround

It's possible to overwrite the apiHandler field of the client (by plugging into the internal API):

import { RevAiApiClient } from 'revai-node-sdk';
import { ApiRequestHandler } from 'revai-node-sdk/src/api-request-handler';

...

const client = new RevAiApiClient(accessToken);
client.apiHandler = new ApiRequestHandler('https://ec1.api.rev.ai/speechtotext/v1/', accessToken);

Streaming client: crash after unsafeEnd

Calling RevAiStreamingClient's client.unsafeEnd() often leads to a crash:

Error [ERR_STREAM_PUSH_AFTER_EOF]: stream.push() after EOF
    at readableAddChunk (_stream_readable.js:257:32)
    at PassThrough.Readable.push (_stream_readable.js:224:10)
    at PassThrough.Transform.push (_stream_transform.js:151:32)
    at PassThrough.afterTransform (_stream_transform.js:92:10)
    at PassThrough._transform (_stream_passthrough.js:42:3)
    at PassThrough.Transform._read (_stream_transform.js:190:10)
    at PassThrough.Transform._write (_stream_transform.js:178:12)
    at doWrite (_stream_writable.js:415:12)
    at writeOrBuffer (_stream_writable.js:399:5)
    at PassThrough.Writable.write (_stream_writable.js:299:11)
Emitted 'error' event at:
    at errorOrDestroy (internal/streams/destroy.js:107:12)
    at readableAddChunk (_stream_readable.js:257:9)
    at PassThrough.Readable.push (_stream_readable.js:224:10)
    [... lines matching original stack trace ...]
    at writeOrBuffer (_stream_writable.js:399:5)

What's happening here is that unsafeEnd is calling closeStreams, which executes this.responses.push(null). Once null has been pushed onto a stream, you cannot push to it again. Then, as the connection gets a response from rev.ai, we write to this.responses again:

this.responses.write(response as StreamingHypothesis);

Reproduction of this crash: https://codesandbox.io/s/revai-node-sdk-crash-hpy8u?file=/src/index.js (assign a rev.ai API token to the "token" variable). This is the streaming example from the documentation plus a call to unsafeEnd before all of the results have been sent/received.

Something like this should fix it:

diff --git a/src/streaming-client.ts b/src/streaming-client.ts
index 8374e49..f56715e 100644
--- a/src/streaming-client.ts
+++ b/src/streaming-client.ts
@@ -34,6 +34,7 @@ export class RevAiStreamingClient extends EventEmitter {
     private config: AudioConfig;
     private requests: PassThrough;
     private responses: PassThrough;
+    private streamsClosed: boolean;
 
     /**
      * @param accessToken Access token associated with the user's account
@@ -130,6 +131,9 @@ export class RevAiStreamingClient extends EventEmitter {
                 this.closeStreams();
             });
             connection.on('message', (message: any) => {
+                if (this.streamsClosed) {
+                    return;
+                }
                 if (message.type === 'utf8') {
                     let response = JSON.parse(message.utf8Data);
                     if ((response as StreamingResponse).type === 'connected') {
@@ -159,5 +163,6 @@ export class RevAiStreamingClient extends EventEmitter {
     private closeStreams(): void {
         this.requests.end();
         this.responses.push(null);
+        this.streamsClosed = true;
     }
-}
\ No newline at end of file
+}

live stream: after upgrading from 2.6.2 to 3.x, partials are now super slow to come in over sockets

We need to upgrade to 3.0 to get away from the issue around dropped socket connections with mistaken closure strings in the feed. However, when we upgrade to 3.x, the inbound partials are so slow that they are no longer usable for our captions.
Are we possibly missing a new configuration flag that would return partials with the expediency of 2.6.2?
We are using ffmpeg to pipe WAV audio in.
Our session config is:

const getSessionConfig = (streamID, vocab) => new revai.SessionConfig(
   metadata=streamID,  /* (optional) metadata */
   customVocabularyID=vocab,  /* (optional) custom_vocabulary_id */
   filterProfanity=true,    /* (optional) filter_profanity */
   removeDisfluencies=true, /* (optional) remove_disfluencies */
);

What was millisecond-level return latency is now 10 to 20 seconds to get partials using the newest version.
