audioslides / audioslides.io

Use Amazon Polly, Google Slides and FFmpeg to create videos that can be updated at any time by anyone. This project is written in Elixir.

Home Page: https://audioslides.io/

License: MIT License

Languages: Elixir 75.43%, JavaScript 2.79%, CSS 3.41%, HTML 16.34%, Shell 1.49%, Dockerfile 0.54%
Topics: elixir, elixir-phoenix, elixir-lang, amazon-polly, google-slides, video, speech-synthesis, polly-voice, ffmpeg

audioslides.io's Introduction

AudioSlides.IO

tl;dr

Generate small videos with spoken text from Google Slides.

It uses Amazon Polly, Google Slides and FFmpeg to create videos that can be updated at any time by anyone. The project is written in Elixir.

The Prototype

For our prototype we decided to give Amazon Polly a try. It has a simple HTTP API that makes it easy to convert text to speech.

For the visual layer we used Google Slides, because it also provides a good REST API that lets you export a slide as a PNG. The same API also returns the speaker notes, which can serve as the input for the Amazon Polly conversion.

The last step is to combine the generated voice output with the exported PNG image and produce a short video sequence. For this we used FFmpeg, a handy command-line tool. The basic processing looks something like this:

[Figure: Video Generation Process]
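The final step of this process, combining one exported slide image with its synthesized audio, could be sketched in Elixir roughly as follows. This is a minimal sketch and not the project's actual code: the module name `SlideVideoSketch`, both function names and the choice of encoder options are assumptions made for illustration. It assumes an `ffmpeg` binary on the PATH; `System.cmd/3` raises if the binary is missing.

```elixir
defmodule SlideVideoSketch do
  # Hypothetical helper, not part of the AudioSlides code base.

  # Builds an FFmpeg argument list that loops a still image for the
  # duration of an audio track and writes a single video file.
  def ffmpeg_args(image_path, audio_path, output_path) do
    [
      "-y",                   # overwrite the output file if it exists
      "-loop", "1",           # repeat the single input image
      "-i", image_path,       # video input: exported slide PNG
      "-i", audio_path,       # audio input: synthesized speech
      "-c:v", "libx264",      # encode video as H.264
      "-tune", "stillimage",  # x264 tuning for static content
      "-c:a", "aac",          # encode audio as AAC
      "-pix_fmt", "yuv420p",  # widely compatible pixel format
      "-shortest",            # stop when the shorter input (the audio) ends
      output_path
    ]
  end

  # Runs FFmpeg and maps the exit status to an :ok / :error tuple.
  def render(image_path, audio_path, output_path) do
    args = ffmpeg_args(image_path, audio_path, output_path)

    case System.cmd("ffmpeg", args, stderr_to_stdout: true) do
      {_output, 0} -> {:ok, output_path}
      {output, exit_code} -> {:error, {exit_code, output}}
    end
  end
end
```

Keeping the argument list in its own function means the command can be inspected or unit-tested without invoking FFmpeg, for example `SlideVideoSketch.ffmpeg_args("slide.png", "speech.mp3", "slide.mp4")`.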

Example Input & Output

As shown above, we need a Google presentation to start from. My input is a short slide deck about the new release of Angular version 5.

Google Slides as Input

Angular 5 explained by AudioSlides

Generated Video as Output

Angular 5 explained by AudioSlides

How to start the project

To start your Phoenix server:

  • Install dependencies with mix deps.get
  • Create and migrate your database with mix ecto.create && mix ecto.migrate
  • Install Node.js dependencies with cd assets && npm install
  • Start Phoenix endpoint with mix s

Now you can visit localhost:4000 from your browser.

Use with Docker

Build the container

docker build -t audioslides .

Run via Docker Compose

Init the database

docker-compose run web mix ecto.setup

Run database + project

docker-compose up

How to test

Run all tests

mix t

Run all tests, including the integration tests (these call FFmpeg and write files)

mix test.integration

audioslides.io's People

Contributors

robinboehm

audioslides.io's Issues

feat(aws/polly): handle amazon errors

(CaseClauseError) no case clause matching: 
{:ok, %HTTPoison.Response{body: "{\"message\":\"The security token included in the request is invalid.\"}", headers: [{"x-amzn-RequestId", "7b046fac-cd2d-11e7-b2af-914568816213"}, {"x-amzn-ErrorType", "UnrecognizedClientException:http://internal.amazon.com/coral/com.amazon.coral.service/"}, ...
...
...
}
    (platform) lib/platform/speech/aws/polly.ex:137: Platform.Speech.AWS.Polly.get_binary_speech/2

The response tuple is :ok, but the response itself carries an x-amzn-ErrorType header.

Maybe we should create two functions?

get_binary_speech/2
get_binary_speech!/2
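One way the two variants could behave is sketched below. This is only an illustration under stated assumptions: the issue does not show the arguments of `get_binary_speech/2`, so the hypothetical functions `classify/1` and `classify!/1` in the made-up module `PollyResponseSketch` take the HTTP result tuple directly. The clauses match any map or struct with `:headers` and `:body` keys, so no HTTP client is needed to try them.

```elixir
defmodule PollyResponseSketch do
  # Hypothetical module: decides success or failure by looking for the
  # x-amzn-ErrorType header instead of trusting the :ok tuple alone.

  # Non-raising variant: returns {:ok, binary} or {:error, reason}.
  def classify({:ok, %{headers: headers, body: body}}) do
    case error_type(headers) do
      nil -> {:ok, body}
      type -> {:error, {type, body}}
    end
  end

  # Transport-level errors are passed through unchanged.
  def classify({:error, reason}), do: {:error, reason}

  # Bang variant: returns the binary or raises a RuntimeError.
  def classify!(result) do
    case classify(result) do
      {:ok, binary} -> binary
      {:error, reason} -> raise "Polly request failed: #{inspect(reason)}"
    end
  end

  # Header names are case-insensitive, so compare them lower-cased.
  # The value looks like "SomeException:http://...", so only the part
  # before the first colon is kept as the error type.
  defp error_type(headers) do
    Enum.find_value(headers, fn {name, value} ->
      if String.downcase(name) == "x-amzn-errortype" do
        value |> String.split(":", parts: 2) |> hd()
      end
    end)
  end
end
```

With the response from the stack trace above, `classify/1` would return an `{:error, {"UnrecognizedClientException", body}}` tuple rather than falling through a `case`, while `classify!/1` would raise.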

feat(slide-editor): add a speak-preview widget with shortcuts

Context: When creating or editing the text for a slide, regenerating the whole video takes too long.
As content creators we want to:

  • Play the whole text of the current slide as preview
  • Play the selected sentence of the current slide as preview
  • Use keyboard shortcuts to trigger these functions

feat(slide-editor): create a basic slide editor

Add a component for editing speaker notes inside AudioSlides, with changes synced to the Google API.

  • Bidirectional sync with the Google API
  • Text input to edit speaker notes
  • Direct link to the Google Slide (URL + #slide_id=%%)
  • Button group with shortcuts for common tags (pause, prosody, ...)
  • Add syntax highlighting

feat(video-generation): update status in UI via websocket

Track the current state of the image, audio and video generation via WebSocket in the web UI.

let exampleTypes = `
     NEEDS_UPDATE -> UPDATING -> UP_TO_DATE
  `;

let exampleState = {
  lesson_id: 123,
  video_state: "NEEDS_UPDATE",
  slides: [
    {
      slide_id: 123,
      video_state: "UP_TO_DATE",
      audio_state: "UP_TO_DATE",
      image_state: "UP_TO_DATE",
    }
  ]
}

  • Video Module should return a stream that updates on every generation
  • Connect Stream to lesson_controller/generate_video and push to socket
  • Add processing icons to view layer
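On the server side, the state flow from the example above (NEEDS_UPDATE -> UPDATING -> UP_TO_DATE) could be modelled as a small transition table. The following Elixir sketch is an assumption-laden illustration: the module `GenerationStateSketch` and its `advance/2` function are hypothetical names, and slides are represented as plain maps with the field names from the example state.

```elixir
defmodule GenerationStateSketch do
  # Hypothetical module illustrating the state flow
  # NEEDS_UPDATE -> UPDATING -> UP_TO_DATE for one state field.

  # Allowed transitions; UP_TO_DATE has no successor.
  @next %{"NEEDS_UPDATE" => "UPDATING", "UPDATING" => "UP_TO_DATE"}

  # Advances one state field of a slide map, e.g. :audio_state.
  # Raises KeyError if the slide has no such field.
  def advance(slide, field) do
    current = Map.fetch!(slide, field)

    case Map.fetch(@next, current) do
      {:ok, next_state} -> {:ok, Map.put(slide, field, next_state)}
      :error -> {:error, :no_transition}
    end
  end
end
```

Each successful `advance/2` call yields an updated slide map that could be pushed to the socket as the next status message.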

feat(video): add a state UI for generated content

The following states should be visible to the user:

  • Last sync with Google Presentation at %%
  • Generated Slide-Audio is up-to-date
  • Generated Slide-Image is up-to-date
  • Generated Slide-Video is up-to-date
  • Generated Lesson-Video is up-to-date
  • The duration of a slide-video
