libraryofcongress / speech-to-text-viewer

AWS Transcribe evaluation pipeline: bulk-process audio files and view the results

Home Page: https://speech-to-text.labs.loc.gov

License: Other

Speech-to-Text Result Viewer

This is a small set of tools around AWS Transcribe that lets us evaluate the quality of the service.

See https://speech-to-text.labs.loc.gov/ for the current public release.

Getting Started

  1. Install Python 3.7 and Pipenv.

  2. Configure your environment with the credentials for the AWS account you intend to use. If you use multiple accounts, either set AWS_PROFILE or use a tool such as aws-vault to prefix the transcription and download commands.

  3. Run pipenv install --python 3.7

  4. Prepare a tab-separated manifest file with the following fields in order:

    • identifier
    • language
    • title
    • URL of a page with more information about the file (this will be the "more information" link)
    • High-quality original master URL (if the URL starts with s3:// it will be passed in directly with no checks; otherwise it will be uploaded to the specified S3 bucket)
    • Streamable audio URL (this will be used by the embedded player)

    Here's an example manifest entry which will be uploaded to S3 before processing:

    afc1941004_sr01	english	"Man-on-the-Street," Washington, D.C., December 8, 1941	https://www.loc.gov/item/afc1941004_sr01/	http://cdn.loc.gov/master/afc/afc1941004/afc1941004_sr01a/afc1941004_sr01a.wav	http://cdn.loc.gov/service/afc/afc1941004/afc1941004_sr01a/afc1941004_sr01a.mp3

    Here's an example manifest entry using a pre-existing S3 object which will be passed directly to Transcribe:

    afc1941004_sr01a	english	"Man-on-the-Street," Washington, D.C., December 8, 1941	https://www.loc.gov/item/afc1941004_sr01/	s3://my-source-bucket/afc/afc1941004/afc1941004_sr01a/afc1941004_sr01a.mp3	https://cdn.loc.gov/service/afc/afc1941004/afc1941004_sr01a/afc1941004_sr01a.mp3
  5. Submit the items for transcription. Please note that this is the point at which you will incur charges for the service.

    $ pipenv run python transcribe-items.py my-items.tsv
    Uploading afc1941004_sr01 “"Man-on-the-Street," Washington, D.C., December 8, 1941” to …
    Transcribing afc1941004_sr01 from …
    …
  6. Run make to download the results, which may take several minutes to become available. The process is repeatable and will not reprocess transcriptions which have already been downloaded.

  7. Once at least one item has been downloaded, you can load the viewer from the local directory (e.g. pipenv run python -m http.server).

  8. Uploading to a remote server is as simple as uploading the contents of this working directory. make upload will do this once you change the target bucket name for the S3 sync command.
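
It can help to sanity-check a manifest before submitting it in step 5, since that is where charges begin. The following is a minimal sketch of such a check, based only on the field order and the s3:// rule described in step 4. It is not taken from transcribe-items.py: the names ManifestEntry, read_manifest, needs_upload and SAMPLE_ROWS are hypothetical and exist only for illustration.

```python
import csv
from typing import NamedTuple
from urllib.parse import urlparse


class ManifestEntry(NamedTuple):
    """One manifest row, in the field order listed in step 4 (hypothetical type)."""

    identifier: str
    language: str
    title: str
    info_url: str
    master_url: str
    stream_url: str

    @property
    def needs_upload(self) -> bool:
        # Per step 4, only s3:// master URLs are passed to Transcribe directly;
        # anything else would be uploaded to the configured bucket first.
        return urlparse(self.master_url).scheme != "s3"


def read_manifest(lines):
    """Yield a ManifestEntry for each non-empty tab-separated row."""
    # QUOTE_NONE keeps the literal double quotes that appear in titles.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_number, row in enumerate(reader, start=1):
        if not any(field.strip() for field in row):
            continue  # skip blank lines
        if len(row) != len(ManifestEntry._fields):
            raise ValueError(
                f"line {line_number}: expected 6 fields, got {len(row)}"
            )
        yield ManifestEntry(*(field.strip() for field in row))


# The two example entries from step 4, joined with tabs.
SAMPLE_ROWS = [
    "\t".join([
        "afc1941004_sr01",
        "english",
        '"Man-on-the-Street," Washington, D.C., December 8, 1941',
        "https://www.loc.gov/item/afc1941004_sr01/",
        "http://cdn.loc.gov/master/afc/afc1941004/afc1941004_sr01a/afc1941004_sr01a.wav",
        "http://cdn.loc.gov/service/afc/afc1941004/afc1941004_sr01a/afc1941004_sr01a.mp3",
    ]),
    "\t".join([
        "afc1941004_sr01a",
        "english",
        '"Man-on-the-Street," Washington, D.C., December 8, 1941',
        "https://www.loc.gov/item/afc1941004_sr01/",
        "s3://my-source-bucket/afc/afc1941004/afc1941004_sr01a/afc1941004_sr01a.mp3",
        "https://cdn.loc.gov/service/afc/afc1941004/afc1941004_sr01a/afc1941004_sr01a.mp3",
    ]),
]

if __name__ == "__main__":
    for entry in read_manifest(SAMPLE_ROWS):
        action = "upload first" if entry.needs_upload else "pass to Transcribe as-is"
        print(f"{entry.identifier}: {action}")
```

To check a real file, the same function could be given an open file handle, for example read_manifest(open("my-items.tsv", encoding="utf-8", newline="")). A row with the wrong number of fields raises ValueError with its line number, which usually points to spaces having been used in place of tabs.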

Contributors: acdha, dependabot[bot]
