
OpenAI Whisper API

An Open Source Solution for Speech-to-Text and More

Welcome to the OpenAI Whisper API, an open-source microservice that wraps OpenAI's Whisper API, a state-of-the-art automatic speech recognition (ASR) system. The service, built with Node.js, Bun.sh, and TypeScript, runs on Docker with zero dependencies, making it a versatile building block for speech- and language-related applications.

Whisper itself is a speech-to-text model trained on a large corpus of multilingual and multitask data covering a wide range of audio recordings. A single model handles tasks such as language identification, speech translation, and, of course, turning spoken words into written text.

The model is robust to background noise and supports multilingual speech recognition, which makes it useful for transcribing video calls, Zoom meetings, YouTube videos, voice messages, and other audio in English and many other languages.

The API is designed to be easy to use for developers of all skill levels. The project is open source under the MIT license, so you can use it in your own projects with few restrictions. Whether you want to transcribe voice messages or simply explore what the OpenAI Whisper API can do, this is the place to start: create an OpenAI account, generate an API key, and dive into the code below.

Usage

This is an OpenAI Whisper API microservice using Node.js / Bun.sh / TypeScript that runs on Docker with zero dependencies. It listens on the /transcribe route for MP3 files and returns the text transcription.

Running locally

Install bun.sh first, clone this repository, and run these commands:

bun install
bun run dev

You can now reach the server at http://localhost:3000 (or the PORT you provided); see the Usage section below.

Docker

Google Cloud Run Deployment

Clone this repository and run the following commands, changing the project ID (magicbuddy-chat) to your own.

docker build --platform linux/amd64 -t gcr.io/magicbuddy-chat/whisper-docker .
docker push gcr.io/magicbuddy-chat/whisper-docker

gcloud run deploy whisper-docker \
  --image gcr.io/magicbuddy-chat/whisper-docker  \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --project magicbuddy-chat

You should receive a Service URL; see the Usage section below.

Usage

You can check that the service is reachable over plain HTTP by opening the /ping endpoint on the URL.

Send a POST request to the /transcribe endpoint with the following JSON body:

{
  "audio": "BASE64_ENCODED_AUDIO"
}
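
For example, here is a minimal TypeScript client sketch (run with Bun) that base64-encodes a local MP3 and posts it to the service. It assumes the service is reachable at http://localhost:3000 and simply prints the raw JSON response, since the exact response shape is not documented here; the API key header it sends is explained in the next section.

// transcribe-client.ts — usage: OPENAI_KEY=... bun run transcribe-client.ts path/to/audio.mp3
const path = Bun.argv[2];                                        // MP3 file to transcribe
const audio = Buffer.from(await Bun.file(path).arrayBuffer()).toString("base64");

const res = await fetch("http://localhost:3000/transcribe", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.OPENAI_KEY}`,           // see the API Key section below
  },
  body: JSON.stringify({ audio }),
});

console.log(await res.json());                                   // print the transcription response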

API Key

You need to pass your OpenAI API key as an HTTP header:

Authorization: Bearer OPENAI_KEY

Alternatively, you can launch the server or Docker image with OPENAI_KEY set in the environment:

OPENAI_KEY=YOUR_KEY_HERE bun run dev

# or

docker run -p 3000:3000 -e OPENAI_KEY=YOUR_KEY_HERE gcr.io/magicbuddy-chat/whisper-docker

# or set it as an environment variable on Cloud Run with the command below (or in the Cloud Console UI)

gcloud run deploy whisper-docker \
  --image gcr.io/magicbuddy-chat/whisper-docker  \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --project magicbuddy-chat \
  --set-env-vars OPENAI_KEY=YOUR_KEY_HERE
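
Conceptually, the service accepts the key from either source. The sketch below is illustrative only (not the project's actual code) and shows one way a Bun fetch handler could check the Authorization header first and fall back to the OPENAI_KEY environment variable:

// Illustrative only: read the key from the header first, then from the OPENAI_KEY env var.
Bun.serve({
  port: Number(process.env.PORT ?? 3000),
  fetch(req: Request) {
    const header = req.headers.get("authorization");             // e.g. "Bearer OPENAI_KEY"
    const key = header?.replace(/^Bearer\s+/i, "") ?? process.env.OPENAI_KEY;
    if (!key) return new Response("Missing OpenAI API key", { status: 401 });
    // ...decode the base64 audio and forward it to OpenAI using `key`...
    return new Response("ok");
  },
});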

Live example

We are using this Whisper API with MagicBuddy, a Telegram ChatGPT bot.

You can try the OpenAI Whisper Docker live at https://magicbuddy.chat/openai-whisper.

