supershaneski / openai-whisper-api

A sample speech transcription app implementing the OpenAI Speech to Text API based on Whisper, an automatic speech recognition (ASR) system, built using Next 13, the React framework.

License: MIT License

JavaScript 82.19% CSS 17.81%
openai next nextjs openai-api openai-whisper speech-to-text whisper whisper-api next13 react reactjs

openai-whisper-api's Issues

Invalid file format on mobile devices

It seems to work on my desktop PC and Android devices, but as soon as I try it on an iOS device, the server gets this error from OpenAI:

{
  error: {
    message: "Invalid file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']",
    type: 'invalid_request_error',
    param: null,
    code: null
  }
}

The file is created successfully in the directory and I can play it with a media player.

[Error] Failed to load resource: the server responded with a status of 500 (Internal Server Error) (api, line 0)

After I record with this web application, I get this error in the console (I made sure my API key was the right one):

[Log] NotSupportedError: mimeType is not supported
MediaRecorder
handleStream

[Log] handle stream2...
[Warning] The resource http://localhost:3005/_next/static/chunks/polyfills.js was preloaded using link preload but not used within a few seconds from the window's load event. Please make sure it wasn't preloaded for nothing.
[Log] NotSupportedError: mimeType is not supported
[Log] handle stream2...
[Log] [send data] – "3:31:14 PM"
[Error] Failed to load resource: the server responded with a status of 500 (Internal Server Error) (api, line 0)
[Log] SyntaxError: The string did not match the expected pattern.
(anonymous function)
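
The `NotSupportedError: mimeType is not supported` log suggests the recorder is being created with a container the browser does not offer; Safari on iOS may not accept the WebM type that desktop Chrome and Android do. A minimal sketch of one way to guard against that, assuming the client is free to choose the type: probe a list of candidates with `MediaRecorder.isTypeSupported` and derive the upload's file extension from whatever was chosen. The candidate list and the names `pickMimeType`, `extensionFor`, and `createRecorder` are hypothetical and are not taken from the project.

```javascript
// Candidate container/codec strings in order of preference.
// This list is an assumption for illustration, not the project's own.
const CANDIDATE_MIME_TYPES = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
  "audio/ogg;codecs=opus",
];

// Returns the first candidate accepted by `isSupported`, or "" when none is,
// in which case the caller can let the browser fall back to its default.
function pickMimeType(candidates, isSupported) {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return "";
}

// Maps a recorder MIME type to a file extension from the API's supported list.
// Returns null for types this sketch does not know about.
function extensionFor(mimeType) {
  if (mimeType.startsWith("audio/webm")) return "webm";
  if (mimeType.startsWith("audio/mp4")) return "m4a";
  if (mimeType.startsWith("audio/ogg")) return "ogg";
  return null;
}

// Browser-only: builds a recorder without ever passing an unsupported mimeType.
function createRecorder(stream) {
  const mimeType = pickMimeType(CANDIDATE_MIME_TYPES, (type) =>
    MediaRecorder.isTypeSupported(type)
  );
  const options = mimeType ? { mimeType } : undefined;
  return new MediaRecorder(stream, options);
}
```

If the server always stores the upload under a fixed `.webm` name, a recording that is actually MP4 audio could plausibly trigger the "Invalid file format" response above, so sending the result of `extensionFor(recorder.mimeType)` along with the audio may be worth trying.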

503 issue with implementation on Heroku

Your project is wonderful, and I was able to run it on localhost. When I moved to Heroku, I received a 503 (Service Unavailable) error when mainPage.js attempts const response = await fetch(url.... I see the console.log output [send data], but [received data] is never reached.

I thought the issue might be related to .pem for SSL, but I understand that Heroku doesn't require that element. Any guidance would be greatly appreciated.
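
A 503 on Heroku can come from the platform's router rather than from the app itself, for example when a request runs past the router's timeout, and in that case the response body is not JSON. A minimal sketch of a client helper that checks the status before parsing, so the real status code is reported instead of a JSON parse failure. The name `postForJson` and its injectable `fetchImpl` parameter are hypothetical and are not part of the project.

```javascript
// Sends a POST request and parses JSON only when the server reports success.
// `fetchImpl` defaults to the global fetch; it is a parameter so the helper
// can be exercised without a network connection.
async function postForJson(url, body, fetchImpl = fetch) {
  const response = await fetchImpl(url, { method: "POST", body });
  if (!response.ok) {
    // Read the body as text: error pages from a proxy or router are often HTML.
    const detail = await response.text();
    throw new Error(
      `Request failed with status ${response.status}: ${detail.slice(0, 200)}`
    );
  }
  return response.json();
}
```

Logging the thrown message on the client, together with the Heroku application logs, should help show whether the request reached the app at all.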

The speed is so slow

I have tried your nice project, but I find it takes a very long time to get the result. I have used the Azure speech recognizer, which takes about 4-5 seconds to return a result, but I think Whisper takes about 10 seconds. That makes it very hard to use in production.

Audio files not created or not reachable on Vercel?

Hello,
Thank you for the hard work and effort you put into sharing this with excellent documentation.
I have successfully managed to install and run it with the API on localhost.
But I would like to deploy it to Vercel (which most Next.js developers will want to use). The build deploys successfully and the page loads, but it returns an error when I try to transcribe:

- error Error: ENOENT: no such file or directory, open 'public/uploads/file168369759598641389.webm'

I suppose this is related to how Next.js deals with the public folder path (as a source for copying files during build, not as an actual subfolder), so theoretically the correct path for the file on a live server should be 'uploads/file168369759598641389.webm', I think!

Or is this relevant: https://vercel.com/guides/how-to-upload-and-store-files-with-vercel ? Perhaps it's something else and the files should be uploaded to a CDN or S3? That would be an amazing feature to have.
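
One possible explanation for the ENOENT error is that serverless functions on Vercel run on a read-only filesystem apart from a temporary directory, so a relative path such as 'public/uploads/...' cannot be written at runtime. A minimal sketch, assuming the audio only needs to exist long enough to be forwarded to the transcription API: write the upload to the OS temp directory instead. The function name `saveUploadToTmp` and the file naming scheme are hypothetical and are not the project's code.

```javascript
const fs = require("node:fs/promises");
const os = require("node:os");
const path = require("node:path");

// Writes an uploaded audio buffer to the OS temp directory and returns the
// full path. Files written here are not served publicly and may disappear
// between invocations, so this suits short-lived processing only.
async function saveUploadToTmp(buffer, extension = "webm") {
  const suffix = Math.round(Math.random() * 1e5);
  const name = `file${Date.now()}${suffix}.${extension}`;
  const filepath = path.join(os.tmpdir(), name);
  await fs.writeFile(filepath, buffer);
  return filepath;
}
```

If the recordings need to be kept or played back later, external storage such as the S3 or CDN options mentioned above would still be required; the temp directory only addresses the write failure.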
