This project forked from revdotcom/fstalign

An efficient OpenFST-based tool for calculating WER and aligning two transcript sequences.

License: Apache License 2.0

fstalign

Overview

fstalign is a tool for creating an alignment between two sequences of tokens (hereafter referred to as “reference” and “hypothesis”). It has two key functions: computing word error rate (WER) and aligning NLP-formatted references with CTM hypotheses.

Due to its use of OpenFST and lazy algorithms for text-based alignment, fstalign is efficient for calculating WER while also providing significant flexibility for different measurement features and error analysis.

Installation

Dependencies

We use git submodules to manage third-party dependencies. Initialize and update submodules before proceeding to the main build steps.

    git submodule update --init --recursive

This will pull the current dependencies:

  • catch2 - for unit testing
  • spdlog - for logging
  • CLI11 - for CLI construction
  • csv - for CTM and NLP input parsing
  • jsoncpp - for JSON output construction
  • strtk - for various string utilities

Additionally, we have dependencies outside of the third-party submodules:

  • OpenFST - currently provided to the build system by setting the $OPENFST_ROOT environment variable or by passing -DOPENFST_ROOT to the CMake command.
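
As a minimal sketch of the environment-variable route: the /opt/openfst prefix below is a hypothetical install location, not something this project prescribes, and the guard only exists so the snippet is harmless when run outside the repository. If your OpenFST build is shared, -DDYNAMIC_OPENFST=ON is still needed as described in the Build section.

```shell
# Hypothetical install prefix; point this at your own OpenFST installation.
export OPENFST_ROOT="${OPENFST_ROOT:-/opt/openfst}"

# Configure only when run from the repository root with CMake available
# and the OpenFST directory actually present.
if command -v cmake >/dev/null 2>&1 && [ -f CMakeLists.txt ] && [ -d "$OPENFST_ROOT" ]; then
    mkdir -p build && cd build && cmake ..
else
    echo "Skipping configure: need cmake, CMakeLists.txt, and $OPENFST_ROOT"
fi
```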

Build

The current build framework is CMake. Install CMake by following the instructions at https://cmake.org/install/.

To build fstalign, run:

    mkdir build && cd build
    cmake .. -DOPENFST_ROOT="<path to OpenFST>" -DDYNAMIC_OPENFST=ON
    make

Note: -DDYNAMIC_OPENFST=ON is needed if OpenFST at OPENFST_ROOT is compiled as shared libraries. Otherwise static libraries are assumed.

Finally, tests can be run using:

    make test

Docker

The fstalign docker image is hosted on Docker Hub and can be easily pulled and run:

    docker pull revdotcom/fstalign
    docker run --rm -it revdotcom/fstalign

See https://hub.docker.com/r/revdotcom/fstalign/tags for the available versions/tags to pull. To run the tool on local files, mount local directories with the -v flag of the docker run command.
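
A mounted run might look like the sketch below. The host directory ./data, the container mount point /data, and the file names are all hypothetical. It also assumes the image does not define an ENTRYPOINT that would intercept the command, which is not confirmed here. The binary path is the one shown for use inside the container.

```shell
# Hypothetical host directory holding ref.txt and hyp.txt.
data_dir="$PWD/data"

# Build the command as an argument list so paths with spaces stay intact.
set -- docker run --rm -v "$data_dir:/data" revdotcom/fstalign \
    /fstalign/build/fstalign wer --ref /data/ref.txt --hyp /data/hyp.txt

if command -v docker >/dev/null 2>&1 && [ -d "$data_dir" ]; then
    "$@"                      # run for real when Docker and the data exist
else
    printf '%s ' "$@"; echo   # otherwise just print the command
fi
```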

From inside the container:

    /fstalign/build/fstalign --help

For development you can also build the docker image locally using:

    docker build . -t fstalign-dev

Quickstart

Rev FST Align
Usage: ./fstalign [OPTIONS] [SUBCOMMAND]

Options:
  -h,--help                   Print this help message and exit
  --help-all                  Expand all help
  --version                   Show fstalign version.

Subcommands:
  wer                         Get the WER between a reference and an hypothesis.
  align                       Produce an alignment between an NLP file and a CTM-like input.

WER Subcommand

The wer subcommand is the most common use of this tool. It requires the two arguments traditional to WER calculation: a reference transcript (--ref <file_path>) and a hypothesis transcript (--hyp <file_path>). The tool currently determines each input's format from its file extension alone and parses it accordingly.

File Extension | Reference Support | Hypothesis Support
.ctm
.nlp
.fst
All other file extensions, assumed to be plain text
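
The extension-based dispatch described above can be pictured with a small shell function. This is only an illustration of the rule, not fstalign's own code: detect_format is a hypothetical helper, and whether the real tool matches extensions case-insensitively is not stated here.

```shell
# Hypothetical helper mirroring the extension rule: known extensions map to
# their format, everything else is treated as plain text.
detect_format() {
    case "$1" in
        *.ctm) echo "ctm" ;;
        *.nlp) echo "nlp" ;;
        *.fst) echo "fst" ;;
        *)     echo "text" ;;
    esac
}

for f in ref.nlp hyp.ctm lattice.fst ref.txt; do
    echo "$f -> $(detect_format "$f")"
done
```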

Basic Example:

ref.txt
this is the best sentence

hyp.txt
this is a test sentence

    ./bin/fstalign wer --ref ref.txt --hyp hyp.txt
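
To reproduce this example from scratch, the two files can be created as sketched below. The ./bin/fstalign path is taken from the command above and may differ in your build, so the run is guarded.

```shell
# Write the example transcripts, one sentence per file.
printf 'this is the best sentence\n' > ref.txt
printf 'this is a test sentence\n' > hyp.txt

# The reference word count is the denominator of the WER.
ref_words=$(wc -w < ref.txt | tr -d ' ')
echo "reference words: $ref_words"

# Run the tool only if the binary is present at the assumed path.
if [ -x ./bin/fstalign ]; then
    ./bin/fstalign wer --ref ref.txt --hyp hyp.txt
fi
```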

When run, fstalign will dump a log to STDOUT with summary WER information at the bottom. For the above example:

[+++] [20:37:10] [fstalign] done walking the graph
[+++] [20:37:10] [wer] best WER: 2/5 = 0.4000 (Total words in reference: 5)
[+++] [20:37:10] [wer] best WER: INS:0 DEL:0 SUB:2
[+++] [20:37:10] [wer] best WER: Precision:0.600000 Recall:0.600000

Note that in addition to general WER, the insertion/deletion/substitution breakdown is also printed. fstalign also has other useful outputs, including a JSON log for downstream machine parsing, and a side-by-side view of the alignment and errors generated. For more details, see the Outputs section in the Advanced Usage doc.
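
The summary numbers can be checked by hand from the error counts. The sketch below uses the standard WER definition and the common definitions of precision and recall over aligned words. These are assumed to match what fstalign computes, and they do agree with the log above.

```shell
# Counts taken from the example log.
ins=0; del=0; sub=2; ref_words=5

# WER = (insertions + deletions + substitutions) / reference words
wer=$(awk -v i="$ins" -v d="$del" -v s="$sub" -v n="$ref_words" \
    'BEGIN { printf "%.4f", (i + d + s) / n }')

# Correct words = reference words that were neither deleted nor substituted.
correct=$((ref_words - del - sub))

# Precision is measured against hypothesis words, recall against reference words.
precision=$(awk -v c="$correct" -v s="$sub" -v i="$ins" \
    'BEGIN { printf "%.6f", c / (c + s + i) }')
recall=$(awk -v c="$correct" -v s="$sub" -v d="$del" \
    'BEGIN { printf "%.6f", c / (c + s + d) }')

echo "WER: $wer Precision: $precision Recall: $recall"
```

Here the two substitutions are "the" to "a" and "best" to "test", leaving three of the five reference words correct.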

Align Subcommand

Usage of the align subcommand is almost identical to that of the wer subcommand. The exception is that align can only be run when the provided reference is an NLP file and the provided hypothesis is a CTM file. This is because the core function of the subcommand is to align an NLP without timestamps to a CTM that has timestamps, producing an output of tokens from the reference with timings from the hypothesis.
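
A guarded invocation might look like the following sketch. The file names are hypothetical, and only the --ref and --hyp arguments described above are used. Any output options that align accepts are not covered here, so check ./fstalign --help-all for them.

```shell
ref="ref.nlp"   # hypothetical NLP reference (no timestamps)
hyp="hyp.ctm"   # hypothetical CTM hypothesis (with timestamps)

# align only accepts an NLP reference and a CTM hypothesis.
case "$ref" in *.nlp) ref_ok=yes ;; *) ref_ok=no ;; esac
case "$hyp" in *.ctm) hyp_ok=yes ;; *) hyp_ok=no ;; esac

if [ "$ref_ok" = yes ] && [ "$hyp_ok" = yes ]; then
    if [ -x ./bin/fstalign ] && [ -f "$ref" ] && [ -f "$hyp" ]; then
        ./bin/fstalign align --ref "$ref" --hyp "$hyp"
    else
        echo "would run: ./bin/fstalign align --ref $ref --hyp $hyp"
    fi
else
    echo "align needs an .nlp reference and a .ctm hypothesis" >&2
fi
```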

Advanced Usage

See the advanced usage doc for more details.

Contributors

qmac, nishchalb, ajhinsvark, dkokotov, jdongian, pique0822, jprobichaud, coreymillerrev
