A node module to generate subtitles by segmenting a list of time-coded text - BBC News Labs

Subtitles Generator - draft

A node module to generate subtitles by segmenting a list of time-coded text.

Exports to

  • TTML for Premiere as .xml
  • TTML
  • iTT - for Apple
  • srt
  • vtt
  • csv
  • txt - pre-segmented text

It can also provide pre-segmented lines if the input is plain text.

Setup

Clone the repository with git clone, cd into the folder, then run npm install.

Usage

const subtitlesComposer = require('./src/index.js');
// sampleWords is an array of word objects (see "words Input" below)
const subtitlesJson = subtitlesComposer({words: sampleWords, type: 'json'})
const ttmlPremiere = subtitlesComposer({words: sampleWords, type: 'premiere'})
const ittData = subtitlesComposer({words: sampleWords, type: 'itt'})
const ttmlData = subtitlesComposer({words: sampleWords, type: 'ttml'})
const srtData = subtitlesComposer({words: sampleWords, type: 'srt'})
const vttData = subtitlesComposer({words: sampleWords, type: 'vtt'})

See example-usage.js for a more comprehensive example.

To try locally

npx babel-node example-usage.js

words Input

  • either an array of word objects
    Example
const sampleWords =[ 
      {
        "id": 0,
        "start": 13.02,
        "end": 13.17,
        "text": "There"
      },
      {
        "id": 1,
        "start": 13.17,
        "end": 13.38,
        "text": "is"
      },
      {
        "id": 2,
        "start": 13.38,
        "end": 13.44,
        "text": "a"
      },
      {
        "id": 3,
        "start": 13.44,
        "end": 13.86,
        "text": "day."
      },
...
  • or a string of text
    Example
const sampleWords = "There is a day. ..."

If the words input is plain text only (and not a list of words with timecodes), then only the pre-segment-txt option can be used (see test-presegment.txt for an example).
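
As a rough illustration of what pre-segmenting plain text can look like, the sketch below greedily wraps a string into lines of at most 35 characters. The function name preSegmentText and the greedy strategy are assumptions made for this example; the module's own segmentation algorithm may behave differently.

```javascript
// Hypothetical helper (not part of the module's API): greedily wrap plain
// text into lines of at most `maxChars` characters, breaking at whitespace.
function preSegmentText(text, maxChars = 35) {
  const words = text.trim().split(/\s+/).filter(Boolean);
  const lines = [];
  let current = '';

  for (const word of words) {
    const candidate = current ? `${current} ${word}` : word;
    if (candidate.length <= maxChars || current === '') {
      // The word fits, or it is a single over-long word that gets its own line.
      current = candidate;
    } else {
      lines.push(current);
      current = word;
    }
  }
  if (current) lines.push(current);
  return lines;
}

const lines = preSegmentText(
  'There is a day. It comes once a year and nobody remembers it.'
);
console.log(lines.join('\n'));
```

A word longer than the limit is kept whole on its own line rather than split mid-word.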

Output:

See the example-output folder for examples.

System Architecture

In pseudo code, at a high level

// expects an array of word objects OR a plain text string

  // if an array of word objects, join the words' text into a string

  // pre-segment the text
     using the pre-segmentation algorithm to break it into lines of x characters - default 35

// generate subtitles
   use the subtitles generators for the various formats to convert the pre-segmented json into subtitles

// return the result

Segmentation algorithm refactored from pietrop/subtitlesComposer, originally by @polizoto. Subtitles generation in the various formats originally by @laurian and @maboa as part of the BBC Subtitlelizer project.
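
The steps above can be sketched end to end for a single format. This is a minimal sketch, assuming a simple greedy grouping of words into lines and srt as the output; the names segmentWords, toSrtTimecode and toSrt are hypothetical and are not the module's internal functions.

```javascript
// Hypothetical sketch of the pipeline: timed words -> lines -> SRT text.

// Group word objects into lines whose text is at most `maxChars` long,
// keeping the start time of the first word and the end time of the last.
function segmentWords(words, maxChars = 35) {
  const lines = [];
  let current = null;

  for (const word of words) {
    const text = current ? `${current.text} ${word.text}` : word.text;
    if (current && text.length > maxChars) {
      lines.push(current);
      current = { start: word.start, end: word.end, text: word.text };
    } else if (current) {
      current.text = text;
      current.end = word.end;
    } else {
      current = { start: word.start, end: word.end, text: word.text };
    }
  }
  if (current) lines.push(current);
  return lines;
}

// Format a time in seconds as an SRT timecode, HH:MM:SS,mmm.
function toSrtTimecode(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const pad = (n, width = 2) => String(n).padStart(width, '0');
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// Render lines as numbered SRT cues separated by blank lines.
function toSrt(lines) {
  return lines
    .map((line, i) =>
      `${i + 1}\n${toSrtTimecode(line.start)} --> ${toSrtTimecode(line.end)}\n${line.text}`)
    .join('\n\n');
}

const sampleWords = [
  { id: 0, start: 13.02, end: 13.17, text: 'There' },
  { id: 1, start: 13.17, end: 13.38, text: 'is' },
  { id: 2, start: 13.38, end: 13.44, text: 'a' },
  { id: 3, start: 13.44, end: 13.86, text: 'day.' },
];

console.log(toSrt(segmentWords(sampleWords)));
```

Other formats would reuse the same segmented lines and differ only in the rendering step; vtt, for instance, writes the millisecond separator as a dot rather than a comma and starts the file with a WEBVTT header.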

Development env

The Node version is set in .nvmrc, for use with Node Version Manager (nvm).

Build

npm run build

Uses babel-cli to transpile ES6 into the ./build folder.

Tests

npm test

To run tests during development

npm run test:watch

Linting

To run linter

npm run lint

To run and fix

npm run lint:fix

Deployment

Coming soon: deploying to the npm registry as @bbc/subtitles-composer

npm run publish:public

TODO

  • Open source
  • use import/export in modules
  • add babel

subtitles-generator's Issues

License?

May developers modify and redistribute this project under an open source license?
