man0bhir / google-ai-video-transcribe-subtitle-generator

This project forked from groundcat/google-ai-video-transcribe-subtitle-generator

Transcribes video using GCP speech-to-text and generates .SRT subtitles

License: MIT License

Python 100.00%

Google Cloud speech-to-text video transcribe subtitle generator

This script converts a video file to an audio file, transcribes the audio with the Google Cloud Platform Speech-to-Text API, and writes the result as .SRT, .JSON, and .TXT files.

Requirements

  • Git, Python 3.7 and ffmpeg installed on your system.

  • A Google Cloud project with billing enabled.

  • A service account with permission to use the Speech-to-Text API.

  • Download the service account credentials as credentials.json. Example:

    {
    "type": "service_account",
    "project_id": "EXAMPLE",
    "private_key_id": "EXAMPLE",
    "private_key": "EXAMPLE",
    "client_email": "EXAMPLE",
    "client_id": "EXAMPLE",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/EXAMPLE"
    }
    
  • Install the requirements:

    pip3 install -r requirements.txt
    
  • Configure the .env file. Example:

    # Google storage bucket name
    BUCKET_NAME = "bucket_name"
    
    # Maximum characters in each single line in the SRT subtitle file
    MAX_CHARS = 60
    
    # Paths to your ffmpeg and ffprobe binaries
    FFMPEG_LOCATION = "C:\\Apps\\ffmpeg\\bin\\ffmpeg.exe"
    FFPROBE_LOCATION = "C:\\Apps\\ffmpeg\\bin\\ffprobe.exe"
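For illustration, the settings above could be read with a few lines of standard-library Python. This is a minimal sketch, not the project's actual loader (which may well use a package such as python-dotenv); `parse_env` is a hypothetical helper, and it assumes that double-quoted values use `\\` for a literal backslash.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY = value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY = value line
        value = value.strip()
        # Assumption: inside double quotes, "\\" stands for one backslash.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1].replace("\\\\", "\\")
        values[key.strip()] = value
    return values


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        settings = parse_env(handle.read())
    max_chars = int(settings.get("MAX_CHARS", "60"))
    print(settings.get("BUCKET_NAME"), max_chars)
```

All values come back as strings, so numeric settings such as MAX_CHARS need an explicit `int()` conversion.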
    

Usage

python3 main.py example.mp4 en-US

Explanation of functions

upload_blob() - Uploads the media file to the Google Cloud Storage bucket.
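A rough sketch of what such an upload can look like with the google-cloud-storage client library. The names `upload_file` and `gcs_uri` are hypothetical and not taken from the project, and the sketch assumes the service account key is stored as credentials.json, as described in the requirements.

```python
def gcs_uri(bucket_name: str, blob_name: str) -> str:
    """Build the gs:// URI that Speech-to-Text uses to locate the audio."""
    return f"gs://{bucket_name}/{blob_name}"


def upload_file(bucket_name: str, source_path: str, blob_name: str,
                credentials_path: str = "credentials.json") -> str:
    """Upload a local file to the bucket and return its gs:// URI."""
    # Third-party dependency (google-cloud-storage), imported lazily so the
    # URI helper above can be used without it.
    from google.cloud import storage

    client = storage.Client.from_service_account_json(credentials_path)
    blob = client.bucket(bucket_name).blob(blob_name)
    blob.upload_from_filename(source_path)
    return gcs_uri(bucket_name, blob_name)
```

The returned URI is the form the Speech-to-Text API expects when the audio lives in a bucket instead of being sent inline.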

video_info() - Returns number of channels, bit rate, and sample rate of the video, extracted by running ffmpeg. These parameters are required by Google's API.
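One way to collect these three values is to ask ffprobe for JSON output and parse it. The sketch below is an assumption about how this could be done, not the project's code; `probe_audio` and `parse_audio_info` are hypothetical names, and only the first audio stream is inspected.

```python
import json
import subprocess
from typing import Dict, Optional


def parse_audio_info(ffprobe_json: str) -> Dict[str, Optional[int]]:
    """Extract channels, bit rate and sample rate from ffprobe JSON output."""
    streams = json.loads(ffprobe_json).get("streams", [])
    if not streams:
        raise ValueError("no audio stream found")
    stream = streams[0]

    def as_int(key: str) -> Optional[int]:
        # ffprobe reports some numbers as strings and omits unknown fields.
        value = stream.get(key)
        return int(value) if value is not None else None

    return {
        "channels": as_int("channels"),
        "bit_rate": as_int("bit_rate"),
        "sample_rate": as_int("sample_rate"),
    }


def probe_audio(path: str, ffprobe: str = "ffprobe") -> Dict[str, Optional[int]]:
    """Run ffprobe on the first audio stream of `path`."""
    result = subprocess.run(
        [ffprobe, "-v", "error", "-select_streams", "a:0",
         "-show_entries", "stream=channels,bit_rate,sample_rate",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return parse_audio_info(result.stdout)
```

Keeping the parsing separate from the subprocess call makes the logic easy to check without a media file at hand.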

video_to_audio() - Converts the video into audio and uploads the audio to the Google Cloud Storage bucket.
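The conversion step can be sketched as a single ffmpeg call. The function names, the WAV output format, and the default of mono 16 kHz audio are assumptions made for illustration; the upload step is left out here.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(video_path: str, audio_path: str, channels: int,
                         sample_rate: int, ffmpeg: str = "ffmpeg") -> List[str]:
    """Return the argument list for extracting the audio track."""
    return [
        ffmpeg,
        "-y",                     # overwrite the output file if it exists
        "-i", video_path,         # input video
        "-vn",                    # drop the video stream
        "-ac", str(channels),     # number of audio channels
        "-ar", str(sample_rate),  # audio sample rate in Hz
        audio_path,
    ]


def extract_audio(video_path: str, channels: int = 1, sample_rate: int = 16000,
                  ffmpeg: str = "ffmpeg") -> Path:
    """Write <video name>.wav next to the video and return its path."""
    audio_path = Path(video_path).with_suffix(".wav")
    subprocess.run(
        build_ffmpeg_command(video_path, str(audio_path), channels,
                             sample_rate, ffmpeg),
        check=True,
    )
    return audio_path
```

The `ffmpeg` argument is where a value such as FFMPEG_LOCATION from the .env file would be passed.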

long_running_recognize() - Transcribes the audio by calling Google Cloud API.
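A hedged sketch of such a call using the google-cloud-speech client library. The helper names are hypothetical, and the option choices (word time offsets, automatic punctuation, no explicit encoding because WAV and FLAC files carry it in their header) are assumptions, not a description of the project's settings.

```python
from typing import Any, Dict, List


def build_request(gcs_uri: str, language_code: str, sample_rate_hz: int,
                  channels: int) -> Dict[str, Any]:
    """Build a LongRunningRecognizeRequest as a plain dict."""
    return {
        "config": {
            "language_code": language_code,       # e.g. "en-US"
            "sample_rate_hertz": sample_rate_hz,
            "audio_channel_count": channels,
            "enable_word_time_offsets": True,     # per-word start/end times
            "enable_automatic_punctuation": True,
        },
        "audio": {"uri": gcs_uri},                # audio already in the bucket
    }


def transcribe(gcs_uri: str, language_code: str, sample_rate_hz: int,
               channels: int, credentials_path: str = "credentials.json",
               timeout_s: int = 3600) -> List[str]:
    """Start the long-running job and return one transcript per result."""
    # Third-party dependency (google-cloud-speech), imported lazily.
    from google.cloud import speech

    client = speech.SpeechClient.from_service_account_file(credentials_path)
    operation = client.long_running_recognize(
        request=build_request(gcs_uri, language_code, sample_rate_hz, channels)
    )
    response = operation.result(timeout=timeout_s)  # blocks until finished
    return [r.alternatives[0].transcript
            for r in response.results if r.alternatives]
```

The channel count and sample rate passed here are the values that video_info() reports for the file.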

break_sentences() - Splits the transcript at punctuation and at the maximum line length (MAX_CHARS), so that no single subtitle line is too long.
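The idea can be illustrated with a small standalone sketch. It is not the project's implementation: `break_lines` and `srt_timestamp` are hypothetical names, the sketch works on plain text only (not on the word timings returned by the API), and it treats `.`, `!` and `?` as the only sentence-ending punctuation.

```python
import re
from typing import List


def break_lines(text: str, max_chars: int = 60) -> List[str]:
    """Split text at sentence ends, then wrap each sentence at max_chars."""
    lines: List[str] = []
    # Split after ., ! or ? when followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        current = ""
        for word in sentence.split():
            candidate = f"{current} {word}".strip()
            if current and len(candidate) > max_chars:
                lines.append(current)  # line is full, start a new one
                current = word
            else:
                current = candidate
        if current:
            lines.append(current)
    return lines


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in .SRT files."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


if __name__ == "__main__":
    for line in break_lines("Hello world. This is a longer sentence here.", 20):
        print(line)
    print(srt_timestamp(3661.5))
```

A single word longer than `max_chars` is kept on its own line and is not split.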

TODO

  • Handle results for non-English languages
