GithubHelp home page GithubHelp logo

gerhobbelt / openfst-utils Goto Github PK

View Code? Open in Web Editor NEW

This project forked from benob/openfst-utils

0.0 1.0 0.0 35 KB

Utilities for manipulating finite state transducers with the OpenFst library.

C++ 99.42% Makefile 0.58%

openfst-utils's Introduction

Utilities for OpenFst
---------------------
Benoit Favre <[email protected]>
released under Apache License Version 2.0

This is a set of useful programs for manipulating Finite State Transducer with the OpenFst library.

* fstcompile-nolex [-t]: compile an acceptor [transducer], generate symbol lexicons on the fly and save them with the fst. Useful for quick hacks on a single fst.

* fstcompose-maplex <fst1> <fst2>: compose after mapping symbol tables (for use with fstcompile-nolex which generates different symbol tables) fst1 or fst2 can be "" (empty string) which means stdin.

* fstminimize-transducer: encode input/output, rmepsilon, determinize, minimize and decode in one pass.

* fstdeterminize-tc-lex: keep the best output for each input sequence in a transducer using determinization in the (Tropical, Categorial)-Lexicographic semiring. See "Efficient Determinization of Tagged Word Lattices using Categorial and Lexicographic Semirings", by Izhak Shafran et al, ASRU 2011.

* fstsuperfinal-noepsilon: add a superfinal state without adding epsilon arcs

* fstoracle <fst1> <fst2>: compute the shortest distance alignment between two transducers

* fstcompose-specials <fst1> <fst2>: compose two transducers using special <phi>, <rho> and <sigma> transitions. <sigma> can replace any input symbol; <rho> is like sigma but only if no other path can be followed; <phi> is an epsilon transition which can be followed if no other transition matches an input symbol. Note that lexicons from the two fsts are mapped.

fstprint sentence.fst:
0   1   the
1   2   boy
2   3   is
3   4   here
4

fstprint model.fst:
0   0   <rho> <eps>
0   1   boy XXX
1   2   is YYY
2   2   <rho> <eps>
2

./fstcompose-specials model.fst < sentence.fst | fstprint
0   1   the <eps>
1   2   boy XXX
2   3   is YYY
3   4   here <eps>
4

* fstposteriors: compute arc-level posterior probabilities from an automaton where weights are -log probs in the tropical semiring.

* fstprint-nbest-strings <n>: compute nbest and print string of input/output symbols (or input if input == output)

--- less useful / non-working stuff ---

* add-tags <dict>: generate a tagging transducer from a word acceptor according to a tab-separated, one-word-per-line tag dictionary

* ngram-expand: expand transducer so that each arc has a fixed context of up to n-1 arcs (so called ngramize) -- non functional at this time

openfst-utils's People

Contributors

benob avatar frankier avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.