timestamp-extraction

Command-line Python tool that extracts timestamps embedded in video frames and converts them to text.

In a wide range of applications, especially in industries that deal with broadcasting, videos are embedded with a timestamp at the time of archival. If the videos are stored as segments, as in the case of HLS, indexing them manually at a later time becomes cumbersome.

It would be more convenient if one could feed in the coordinates and dimensions of the timestamp on the videos, and a tool could provide the timestamp at the first frame of the video. This is exactly what the timestamp-extraction tool does.
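As a rough illustration of how coordinates and dimensions select the timestamp, the sketch below crops a rectangular region out of a frame. It assumes the frame has already been decoded into a NumPy array and that `(x, y)` is the top-left corner of the region; the function name `crop_timestamp` and its parameters are hypothetical and not taken from the tool itself.

```python
import numpy as np


def crop_timestamp(frame, x, y, width, height):
    """Return a copy of the width x height region whose top-left corner is (x, y).

    Hypothetical helper: `frame` is assumed to be a decoded video frame stored
    as a NumPy array of shape (rows, cols) or (rows, cols, channels).
    """
    frame_h, frame_w = frame.shape[:2]
    if x < 0 or y < 0 or width <= 0 or height <= 0:
        raise ValueError("coordinates must be non-negative and sizes positive")
    if x + width > frame_w or y + height > frame_h:
        raise ValueError("region falls outside the frame")
    # NumPy indexes rows (y) first, then columns (x).
    return frame[y:y + height, x:x + width].copy()


# Synthetic 640x480 frame with a bright band where a timestamp might sit.
frame = np.zeros((480, 640), dtype=np.uint8)
frame[440:470, 20:220] = 255
region = crop_timestamp(frame, x=20, y=440, width=200, height=30)
```

Copying the slice keeps later pre-processing from modifying the original frame.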

How does it work?

The first step: Pre-processing video frames

Before the program tries to detect the characters in the timestamp, it must ensure that the timestamp is clearly visible and neatly isolated from the rest of the image. Although the timestamp region can be cropped using its coordinates in the frame and its dimensions, it is more challenging to separate the characters from their neighbourhood. This separation is attempted with a series of image processing techniques, including median filtering, resizing, and morphological transformations like opening and closing. In some frames it may never be possible to isolate the timestamp from its background (for instance, when the neighbourhood is the same colour as the timestamp). When this step succeeds, a clear image of the timestamp is available for extraction to text.
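The techniques named above can be sketched with NumPy alone, which makes each operation explicit. This is not the tool's actual code: the order of the steps, the 3x3 window, the 2x nearest-neighbour enlargement, and all function names are assumptions chosen for illustration, and an image library would normally provide faster equivalents.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(img, k):
    """View every k x k neighbourhood of a 2-D image (edges are replicated)."""
    if k % 2 == 0:
        raise ValueError("window size k must be odd")
    padded = np.pad(img, k // 2, mode="edge")
    return sliding_window_view(padded, (k, k))  # shape: (rows, cols, k, k)


def median_filter(img, k=3):
    """Replace each pixel by the median of its neighbourhood (removes speckle)."""
    return np.median(_windows(img, k), axis=(-2, -1)).astype(img.dtype)


def erode(img, k=3):
    """Neighbourhood minimum: shrinks bright regions."""
    return _windows(img, k).min(axis=(-2, -1))


def dilate(img, k=3):
    """Neighbourhood maximum: grows bright regions."""
    return _windows(img, k).max(axis=(-2, -1))


def opening(img, k=3):
    """Erosion then dilation: removes small bright specks."""
    return dilate(erode(img, k), k)


def closing(img, k=3):
    """Dilation then erosion: fills small dark holes."""
    return erode(dilate(img, k), k)


def upscale(img, factor=2):
    """Nearest-neighbour enlargement by an integer factor."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)


def preprocess(region, k=3, factor=2):
    """Assumed pipeline: median filter, enlarge, open, then close."""
    smoothed = median_filter(region, k)
    enlarged = upscale(smoothed, factor)
    return closing(opening(enlarged, k), k)


# A bright block stands in for a glyph; one stray pixel stands in for noise.
region = np.zeros((20, 20), dtype=np.uint8)
region[5:15, 5:15] = 255
region[2, 2] = 255
clean = preprocess(region)
```

In this toy input the median filter removes the isolated pixel before enlargement, so the morphological steps only have to tidy the edges of the glyph.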

Text recognition in images using Tesseract

Tesseract is an open-source optical character recognition (OCR) engine, whose development was long sponsored by Google, that recognizes text embedded in images. When run on the isolated timestamp image, Tesseract returns all the text it recognizes in that image.
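One possible way to call Tesseract from Python is through the `pytesseract` wrapper, sketched below. The README does not say which binding the tool uses, so the wrapper, the single-line page segmentation mode (`--psm 7`), and the character whitelist are all assumptions; whitelist support also depends on the installed Tesseract version and engine. The helper names are hypothetical.

```python
def build_tesseract_config(whitelist="0123456789:-/."):
    """Build a Tesseract option string (assumed settings, not the tool's own).

    --psm 7 asks Tesseract to treat the image as a single line of text;
    tessedit_char_whitelist restricts output to characters a timestamp
    is assumed to contain.
    """
    return f"--psm 7 -c tessedit_char_whitelist={whitelist}"


def recognize_timestamp(image, config=None):
    """Run OCR on a pre-processed timestamp image and return the raw text.

    Requires the Tesseract binary and the pytesseract package; the import
    is done lazily so this module loads even when they are not installed.
    """
    import pytesseract

    if config is None:
        config = build_tesseract_config()
    return pytesseract.image_to_string(image, config=config).strip()


config = build_tesseract_config()
```

The text returned by `recognize_timestamp` is still unverified at this point and may contain misread digits.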

Verifying that extracted timestamp is correct

More often than not, the recognized text differs from the actual timestamp in some digits. A regex is used to verify that the recognized text has the form of a timestamp. Once the format has been verified, the value is verified by checking it against a timestamp extracted from the same video source at a later instant. The program terminates once it recognizes a timestamp that agrees with the one detected at the first frame.
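The two checks described above can be sketched as follows. The `YYYY-MM-DD HH:MM:SS` layout is an assumed example, since the README does not state the timestamp format, and the agreement rule (the difference between the two timestamps should match the elapsed video time within a tolerance) is one plausible reading of the cross-check. All names are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout for illustration only: "YYYY-MM-DD HH:MM:SS".
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_timestamp(text):
    """Return a datetime if the OCR text looks like a timestamp, else None."""
    text = text.strip()
    if not TIMESTAMP_RE.fullmatch(text):
        return None
    try:
        return datetime.strptime(text, TIMESTAMP_FORMAT)
    except ValueError:
        # Right shape but impossible value, such as month 13.
        return None


def timestamps_agree(first_text, later_text, elapsed_seconds, tolerance=1.0):
    """Check that two OCR readings are consistent with the elapsed video time."""
    first = parse_timestamp(first_text)
    later = parse_timestamp(later_text)
    if first is None or later is None:
        return False
    drift = (later - first).total_seconds() - elapsed_seconds
    return abs(drift) <= tolerance


good = timestamps_agree("2019-03-14 09:26:53", "2019-03-14 09:27:03", 10)
bad = timestamps_agree("2019-03-14 09:26:53", "2019-03-14 09:27:08", 10)
```

A misread such as the letter `O` in place of `0` fails the regex, while a misread digit that still forms a valid timestamp is caught by the comparison with the later frame.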

The program lacks a good command line or graphical interface.

Developers are welcome to contribute such improvements to this project.

Contributors

akshay-krishnan
