PyKaldi

A Python wrapper for Kaldi

Home Page: https://pykaldi.github.io

License: Apache License 2.0


Introduction

PyKaldi is a Python scripting layer for the Kaldi speech recognition toolkit. It provides easy-to-use, low-overhead, first-class Python wrappers for the C++ code in Kaldi and OpenFst libraries. You can use PyKaldi to write Python code for things that would otherwise require writing C++ code such as calling low-level Kaldi functions, manipulating Kaldi and OpenFst objects in code or implementing new Kaldi tools.

You can think of Kaldi as a large box of legos that you can mix and match to build custom speech recognition solutions. The best way to think of PyKaldi is as a supplement, a sidekick if you will, to Kaldi. In fact, PyKaldi is at its best when it is used alongside Kaldi. To that end, replicating the functionality of myriad command-line tools, utility scripts and shell-level recipes provided by Kaldi is a non-goal for the PyKaldi project.

Overview

Getting Started

Like Kaldi, PyKaldi is primarily intended for speech recognition researchers and professionals. It is jam-packed with goodies that one would need to build Python software taking advantage of the vast collection of utilities, algorithms and data structures provided by the Kaldi and OpenFst libraries.

If you are not familiar with FST-based speech recognition or have no interest in having access to the guts of Kaldi and OpenFst in Python, but only want to run a pre-trained Kaldi system as part of your Python application, do not fret. PyKaldi includes a number of high-level, application-oriented modules, such as asr, alignment and segmentation, that should be accessible to most Python programmers.

If you are interested in using PyKaldi for research or building advanced ASR applications, you are in luck. PyKaldi comes with everything you need to read, write, inspect, manipulate or visualize Kaldi and OpenFst objects in Python. It includes Python wrappers for most functions and methods that are part of the public APIs of the Kaldi and OpenFst C++ libraries.

  • If you want to read/write files that are produced/consumed by Kaldi tools, check out the I/O and table utilities in the util package.

  • If you want to work with Kaldi matrices and vectors, e.g. convert them to NumPy ndarrays and vice versa, check out the matrix package.

  • If you want to use Kaldi for feature extraction and transformation, check out the feat, ivector and transform packages.

  • If you want to work with lattices or other FST structures produced/consumed by Kaldi tools, check out the fstext, lat and kws packages.

  • If you want low-level access to Gaussian mixture models, hidden Markov models or phonetic decision trees in Kaldi, check out the gmm, sgmm2, hmm and tree packages.

  • If you want low-level access to Kaldi neural network models, check out the nnet3, cudamatrix and chain packages.

  • If you want to use the decoders and language modeling utilities in Kaldi, check out the decoder, lm, rnnlm, tfrnnlm and online2 packages.
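
For instance, converting between Kaldi matrices and NumPy ndarrays is a one-liner in each direction. The snippet below is a minimal sketch based on the matrix package and the zero-copy conversion described in the About PyKaldi section:

import numpy as np
from kaldi.matrix import Matrix

m = Matrix([[1.0, 2.0], [3.0, 4.0]])  # build a Kaldi matrix from nested lists
a = m.numpy()                         # view the same data as a NumPy ndarray (no copy)
m2 = Matrix(np.ones((3, 5)))          # build a Kaldi matrix from a NumPy array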

Interested readers who would like to learn more about Kaldi and PyKaldi can consult the PyKaldi home page (https://pykaldi.github.io) and the PyKaldi paper cited at the end of this README.

Since automatic speech recognition (ASR) in Python is undoubtedly the "killer app" for PyKaldi, we will go over a few ASR scenarios to get a feel for the PyKaldi API. We should note that PyKaldi does not provide any high-level utilities for training ASR models, so you need to train your models using Kaldi recipes or use pre-trained models available online. This is simply because there is no high-level ASR training API in the Kaldi C++ libraries. Kaldi ASR models are trained using complex shell-level recipes that handle everything from data preparation to the orchestration of the myriad Kaldi executables used in training. This is by design and unlikely to change in the future.

PyKaldi does provide wrappers for the low-level ASR training utilities in the Kaldi C++ libraries, but those are not really useful unless you want to build an ASR training pipeline in Python from basic building blocks, which is no easy task. Continuing with the lego analogy, this task is akin to building an enormous lego model given access to a truck full of all the legos you might need. If you are crazy enough to try though, please don't let this paragraph discourage you. Before we started building PyKaldi, we thought that was a madman's task too.

Automatic Speech Recognition in Python

PyKaldi asr module includes a number of easy-to-use, high-level classes to make it dead simple to put together ASR systems in Python. Ignoring the boilerplate code needed for setting things up, doing ASR with PyKaldi can be as simple as the following snippet of code:

# SomeRecognizer is a placeholder for one of the recognizer classes in the asr module
from kaldi.util.table import SequentialMatrixReader

asr = SomeRecognizer.from_files("final.mdl", "HCLG.fst", "words.txt", opts)

with SequentialMatrixReader("ark:feats.ark") as feats_reader:
    for key, feats in feats_reader:
        out = asr.decode(feats)
        print(key, out["text"])

In this simplified example, we first instantiate a hypothetical recognizer SomeRecognizer with the paths for the model final.mdl, the decoding graph HCLG.fst and the symbol table words.txt. The opts object contains the configuration options for the recognizer. Then, we instantiate a PyKaldi table reader SequentialMatrixReader for reading the feature matrices stored in the Kaldi archive feats.ark. Finally, we iterate over the feature matrices and decode them one by one. Here we are simply printing the best ASR hypothesis for each utterance, so we are only interested in the "text" entry of the output dictionary out. Keep in mind that the output dictionary contains a number of other useful entries, such as the frame-level alignment of the best hypothesis and a weighted lattice representing the most likely hypotheses. Admittedly, not all ASR pipelines will be as simple as this example, but they will often have the same overall structure. In the following sections, we will see how we can adapt the code given above to implement more complicated ASR pipelines.
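
Before moving on, here is a sketch of how those extra output entries might be accessed. The "text" and "lattice" keys appear in the examples in this README, while the exact set of keys depends on the recognizer:

out = asr.decode(feats)
print(out["text"])       # best hypothesis transcript
ali = out["alignment"]   # frame-level alignment of the best hypothesis
lat = out["lattice"]     # weighted lattice over the most likely hypotheses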

Offline ASR using Kaldi Models

This is the most common scenario. We want to do offline ASR using pre-trained Kaldi models, such as the ASpIRE chain models. Here we are using the term "models" loosely to refer to everything one would need to put together an ASR system. In this specific example, we are going to need the transition/acoustic model (final.mdl), the decoding graph (HCLG.fst), the word symbol table (words.txt) and the feature extraction configuration files used in the code below.

Note that you can use this example code to decode with ASpIRE chain models.

from kaldi.asr import NnetLatticeFasterRecognizer
from kaldi.decoder import LatticeFasterDecoderOptions
from kaldi.nnet3 import NnetSimpleComputationOptions
from kaldi.util.table import SequentialMatrixReader, CompactLatticeWriter

# Set the paths and read/write specifiers
model_path = "models/aspire/final.mdl"
graph_path = "models/aspire/graph_pp/HCLG.fst"
symbols_path = "models/aspire/graph_pp/words.txt"
feats_rspec = ("ark:compute-mfcc-feats --config=models/aspire/conf/mfcc.conf "
               "scp:wav.scp ark:- |")
ivectors_rspec = (feats_rspec + " ivector-extract-online2 "
                  "--config=models/aspire/conf/ivector_extractor.conf "
                  "ark:spk2utt ark:- ark:- |")
lat_wspec = "ark:| gzip -c > lat.gz"

# Instantiate the recognizer
decoder_opts = LatticeFasterDecoderOptions()
decoder_opts.beam = 13
decoder_opts.max_active = 7000
decodable_opts = NnetSimpleComputationOptions()
decodable_opts.acoustic_scale = 1.0
decodable_opts.frame_subsampling_factor = 3
asr = NnetLatticeFasterRecognizer.from_files(
    model_path, graph_path, symbols_path,
    decoder_opts=decoder_opts, decodable_opts=decodable_opts)

# Extract the features, decode and write output lattices
with SequentialMatrixReader(feats_rspec) as feats_reader, \
     SequentialMatrixReader(ivectors_rspec) as ivectors_reader, \
     CompactLatticeWriter(lat_wspec) as lat_writer:
    for (fkey, feats), (ikey, ivectors) in zip(feats_reader, ivectors_reader):
        assert fkey == ikey
        out = asr.decode((feats, ivectors))
        print(fkey, out["text"])
        lat_writer[fkey] = out["lattice"]

The fundamental difference between this example and the short snippet from the last section is that for each utterance we are reading the raw audio data from disk and computing two feature matrices on the fly, instead of reading a single precomputed feature matrix from disk. The script file wav.scp contains a list of WAV files corresponding to the utterances we want to decode.

The additional feature matrix we are extracting contains online i-vectors that are used by the neural network acoustic model to perform channel and speaker adaptation. The speaker-to-utterance map spk2utt is used for accumulating separate statistics for each speaker in online i-vector extraction. It can be a simple identity mapping if the speaker information is not available. We pack the MFCC features and the i-vectors into a tuple and pass this tuple to the recognizer for decoding. The neural network recognizers in PyKaldi know how to handle the additional i-vector features when they are available.

The model file final.mdl contains both the transition model and the neural network acoustic model. The NnetLatticeFasterRecognizer processes feature matrices by first computing phone log-likelihoods using the neural network acoustic model, then mapping those to transition log-likelihoods using the transition model and finally decoding transition log-likelihoods into word sequences using the decoding graph HCLG.fst, which has transition IDs on its input labels and word IDs on its output labels. After decoding, we save the lattice generated by the recognizer to a Kaldi archive for future processing.
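
For reference, both input files mentioned above are plain-text maps from keys to values. A minimal sketch with hypothetical utterance IDs:

# wav.scp: <utterance-id> <wav-file>
utt1 audio/utt1.wav
utt2 audio/utt2.wav

# spk2utt: <speaker-id> <utterance-id>... (identity mapping when speaker info is unavailable)
utt1 utt1
utt2 utt2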

This example also illustrates the powerful I/O mechanisms provided by Kaldi. Instead of implementing the feature extraction pipelines in code, we define them as Kaldi read specifiers and compute the feature matrices simply by instantiating PyKaldi table readers and iterating over them. This is not only the simplest but also the fastest way of computing features with PyKaldi since the feature extraction pipeline is run in parallel by the operating system. Similarly, we use a Kaldi write specifier to instantiate a PyKaldi table writer which writes output lattices to a compressed Kaldi archive. Note that for these to work, we need compute-mfcc-feats, ivector-extract-online2 and gzip to be on our PATH.
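
The specifier strings follow a small convention. The forms below are a sketch of the variants used throughout this README:

rspec = "ark:feats.ark"           # read a Kaldi archive directly
rspec = "scp:wav.scp"             # read via a script file mapping keys to files
rspec = "ark:gunzip -c lat.gz |"  # read the output of a shell pipeline
wspec = "ark:| gzip -c > lat.gz"  # write through a shell pipeline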

Offline ASR using a PyTorch Acoustic Model

This is similar to the previous scenario, but instead of a Kaldi acoustic model, we use a PyTorch acoustic model. After computing the features as before, we convert them to a PyTorch tensor, do the forward pass using a PyTorch neural network module outputting phone log-likelihoods and finally convert those log-likelihoods back into a PyKaldi matrix for decoding. The recognizer uses the transition model to automatically map phone IDs to transition IDs, the input labels on a typical Kaldi decoding graph.

from kaldi.asr import MappedLatticeFasterRecognizer
from kaldi.decoder import LatticeFasterDecoderOptions
from kaldi.matrix import Matrix
from kaldi.util.table import SequentialMatrixReader, CompactLatticeWriter
from models import AcousticModel  # Import your PyTorch model
import torch

# Set the paths and read/write specifiers
acoustic_model_path = "models/aspire/model.pt"
transition_model_path = "models/aspire/final.mdl"
graph_path = "models/aspire/graph_pp/HCLG.fst"
symbols_path = "models/aspire/graph_pp/words.txt"
feats_rspec = ("ark:compute-mfcc-feats --config=models/aspire/conf/mfcc.conf "
               "scp:wav.scp ark:- |")
lat_wspec = "ark:| gzip -c > lat.gz"

# Instantiate the recognizer
decoder_opts = LatticeFasterDecoderOptions()
decoder_opts.beam = 13
decoder_opts.max_active = 7000
asr = MappedLatticeFasterRecognizer.from_files(
    transition_model_path, graph_path, symbols_path, decoder_opts=decoder_opts)

# Instantiate the PyTorch acoustic model (subclass of torch.nn.Module)
model = AcousticModel(...)
model.load_state_dict(torch.load(acoustic_model_path))
model.eval()

# Extract the features, decode and write output lattices
with SequentialMatrixReader(feats_rspec) as feats_reader, \
     CompactLatticeWriter(lat_wspec) as lat_writer:
    for key, feats in feats_reader:
        feats = torch.from_numpy(feats.numpy())  # Convert to PyTorch tensor
        with torch.no_grad():                     # Inference only; no gradients needed
            loglikes = model(feats)               # Compute log-likelihoods
        loglikes = Matrix(loglikes.numpy())       # Convert to PyKaldi matrix
        out = asr.decode(loglikes)
        print(key, out["text"])
        lat_writer[key] = out["lattice"]
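
The AcousticModel class above comes from your own models module. Below is a minimal sketch of what such a module might contain; the layer sizes here are purely illustrative assumptions, not taken from any real model:

import torch

class AcousticModel(torch.nn.Module):
    """Maps (num_frames, feat_dim) features to per-frame phone log-likelihoods."""

    def __init__(self, feat_dim=40, num_pdfs=2000):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(feat_dim, 512),
            torch.nn.ReLU(),
            torch.nn.Linear(512, num_pdfs),
        )

    def forward(self, feats):
        return torch.nn.functional.log_softmax(self.net(feats), dim=-1)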

Online ASR using Kaldi Models

This section is a placeholder. Check out the nnet3 online recognizer example script in the PyKaldi repository in the meantime.

Lattice Rescoring with a Kaldi RNNLM

Lattice rescoring is a standard technique for using large n-gram language models or recurrent neural network language models (RNNLMs) in ASR. In this example, we rescore lattices using a Kaldi RNNLM. We first instantiate a rescorer by providing the paths for the models. Then we use a table reader to iterate over the lattices we want to rescore and finally we use a table writer to write rescored lattices back to disk.

from kaldi.asr import LatticeRnnlmPrunedRescorer
from kaldi.fstext import SymbolTable
from kaldi.rnnlm import RnnlmComputeStateComputationOptions
from kaldi.util.table import SequentialCompactLatticeReader, CompactLatticeWriter

# Set the paths, extended filenames and read/write specifiers
symbols_path = "models/tedlium/config/words.txt"
old_lm_path = "models/tedlium/data/lang_nosp/G.fst"
word_feats_path = "models/tedlium/word_feats.txt"
feat_embedding_path = "models/tedlium/feat_embedding.final.mat"
word_embedding_rxfilename = ("rnnlm-get-word-embedding %s %s - |"
                             % (word_feats_path, feat_embedding_path))
rnnlm_path = "models/tedlium/final.raw"
lat_rspec = "ark:gunzip -c lat.gz |"
lat_wspec = "ark:| gzip -c > rescored_lat.gz"

# Instantiate the rescorer
symbols = SymbolTable.read_text(symbols_path)
opts = RnnlmComputeStateComputationOptions()
opts.bos_index = symbols.find_index("<s>")
opts.eos_index = symbols.find_index("</s>")
opts.brk_index = symbols.find_index("<brk>")
rescorer = LatticeRnnlmPrunedRescorer.from_files(
    old_lm_path, word_embedding_rxfilename, rnnlm_path, opts=opts)

# Read the lattices, rescore and write output lattices
with SequentialCompactLatticeReader(lat_rspec) as lat_reader, \
     CompactLatticeWriter(lat_wspec) as lat_writer:
    for key, lat in lat_reader:
        lat_writer[key] = rescorer.rescore(lat)

Notice the extended filename we used to compute the word embeddings from the word features and the feature embeddings on the fly. Also of note are the read/write specifiers we used to transparently decompress/compress the lattice archives. For these to work, we need rnnlm-get-word-embedding, gunzip and gzip to be on our PATH.
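
For reference, the extended filename convention is sketched below; a trailing "|" tells Kaldi to read the object from the standard output of the given command:

rxfilename = "models/tedlium/final.raw"  # plain filename
rxfilename = ("rnnlm-get-word-embedding models/tedlium/word_feats.txt "
              "models/tedlium/feat_embedding.final.mat - |")  # read from a command's stdout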

About PyKaldi

PyKaldi aims to bridge the gap between Kaldi and all the nice things Python has to offer. It is more than a collection of bindings into Kaldi libraries. It is a scripting layer providing first class support for essential Kaldi and OpenFst types in Python. PyKaldi vector and matrix types are tightly integrated with NumPy. They can be seamlessly converted to NumPy arrays and vice versa without copying the underlying memory buffers. PyKaldi FST types, including Kaldi style lattices, are first class citizens in Python. The API for the user-facing FST types and operations is almost entirely defined in Python, mimicking the API exposed by pywrapfst, the official Python wrapper for OpenFst.
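
For instance, building a small FST by hand looks much like it does in pywrapfst. A minimal sketch, assuming the StdVectorFst, StdArc and TropicalWeight names are exported by the fstext package:

from kaldi.fstext import StdVectorFst, StdArc, TropicalWeight

f = StdVectorFst()
s0 = f.add_state()                                    # add states
s1 = f.add_state()
f.set_start(s0)                                       # designate the start state
f.add_arc(s0, StdArc(1, 2, TropicalWeight(0.5), s1))  # ilabel, olabel, weight, nextstate
f.set_final(s1, TropicalWeight(1.0))                  # designate a final state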

PyKaldi harnesses the power of CLIF to wrap Kaldi and OpenFst C++ libraries using simple API descriptions. The CPython extension modules generated by CLIF can be imported in Python to interact with Kaldi and OpenFst. While CLIF is great for exposing existing C++ APIs in Python, the wrappers do not always expose a "Pythonic" API that is easy to use from Python. PyKaldi addresses this by extending the raw CLIF wrappers in Python (and sometimes in C++) to provide a more "Pythonic" API. The figure below illustrates where PyKaldi fits in the Kaldi ecosystem.

Architecture

PyKaldi has a modular design which makes it easy to maintain and extend. Source files are organized in a directory tree that is a replica of the Kaldi source tree. Each directory defines a subpackage and contains only the wrapper code written for the associated Kaldi library. The wrapper code consists of:

  • CLIF C++ API descriptions defining the types and functions to be wrapped and their Python API,

  • C++ headers defining the shims for Kaldi code that is not compliant with the Google C++ style expected by CLIF,

  • Python modules grouping together related extension modules generated with CLIF and extending the raw CLIF wrappers to provide a more "Pythonic" API.

You can read more about the design and technical details of PyKaldi in our paper.

Coverage Status

The following table shows the status of each PyKaldi package (we currently do not plan to add support for nnet, nnet2 and online) along the following dimensions:

  • Wrapped?: If there are enough CLIF files to make the package usable in Python.
  • Pythonic?: If the package API has a "Pythonic" look-and-feel.
  • Documentation?: If there is documentation beyond what is automatically generated by CLIF. A single checkmark indicates that there is not much additional documentation (if any). Three checkmarks indicate that package documentation is complete (or near complete).
  • Tests?: If there are any tests for the package.
Package     Wrapped?  Pythonic?  Documentation?  Tests?
base        ✔         ✔          ✔
chain       ✔         ✔          ✔
cudamatrix
decoder     ✔         ✔          ✔
feat        ✔         ✔          ✔
fstext      ✔         ✔          ✔
gmm         ✔         ✔
hmm         ✔         ✔          ✔
ivector
kws         ✔         ✔          ✔
lat         ✔         ✔          ✔
lm          ✔         ✔          ✔
matrix      ✔         ✔          ✔
nnet3
online2     ✔         ✔          ✔
rnnlm       ✔         ✔          ✔
sgmm2
tfrnnlm     ✔         ✔          ✔
transform
tree
util        ✔         ✔          ✔

Installation

If you are using a relatively recent Linux or macOS, such as Ubuntu >= 16.04, CentOS >= 7 or macOS >= 10.13, you should be able to install PyKaldi without too much trouble. Otherwise, you will likely need to tweak the installation scripts.

Pip / whl packages

You can now download official whl packages from our GitHub releases page. We have whl packages for Python 3.7 through 3.11 on Linux and a few (experimental) builds for Mac M1/M2.

If you decide to use a whl package, you can skip the next sections and head straight to "Starting a new project with a pykaldi whl package" to set up your project. Note that you still need to compile a PyKaldi-compatible version of Kaldi.

From Source

To install and build PyKaldi from source, follow the steps given below.

Step 1: Clone PyKaldi Repository and Create a New Python Environment

git clone https://github.com/pykaldi/pykaldi.git
cd pykaldi

Although it is not required, we recommend installing PyKaldi and all of its Python dependencies inside a new isolated Python environment. If you do not want to create a new Python environment, you can skip the rest of this step.

You can use any tool you like for creating a new Python environment. Here we use virtualenv, but you can use another tool like conda if you prefer that. Make sure you activate the new Python environment before continuing with the rest of the installation.

virtualenv env
source env/bin/activate

Step 2: Install Dependencies

Running the commands below will install the system packages needed for building PyKaldi from source.

# Ubuntu
sudo apt-get install autoconf automake cmake curl g++ git graphviz \
    libatlas3-base libtool make pkg-config subversion unzip wget zlib1g-dev

# macOS
brew install automake cmake git graphviz libtool pkg-config wget gnu-sed openblas subversion
PATH="/opt/homebrew/opt/gnu-sed/libexec/gnubin:$PATH"

Running the commands below will install the Python packages needed for building PyKaldi from source.

pip install --upgrade pip
pip install --upgrade setuptools
pip install numpy pyparsing
pip install ninja  # not required but strongly recommended

In addition to the packages listed above, we also need PyKaldi-compatible installations of the following software:

  • Google Protobuf, recommended v3.5.0. Both the C++ library and the Python package must be installed.

  • PyKaldi compatible fork of CLIF. To streamline PyKaldi development, we made some changes to the CLIF codebase. We are hoping to upstream these changes over time. These changes are in the pykaldi branch:

    # This command will be automatically run for you in the tools install scripts.
    git clone -b pykaldi https://github.com/pykaldi/clif

  • PyKaldi compatible fork of Kaldi. To comply with CLIF requirements, we had to make some changes to the Kaldi codebase. We are hoping to upstream these changes over time. These changes are in the pykaldi branch:

    # This command will be automatically run for you in the tools install scripts.
    git clone -b pykaldi https://github.com/pykaldi/kaldi

You can use the scripts in the tools directory to install or update this software locally. Make sure you check the output of these scripts. If you do not see Done installing {protobuf,CLIF,Kaldi} printed at the very end, the installation has failed for some reason.

cd tools
./check_dependencies.sh  # checks if system dependencies are installed
./install_protobuf.sh    # installs both the C++ library and the Python package
./install_clif.sh        # installs both the C++ library and the Python package
./install_kaldi.sh       # installs the C++ library
cd ..

Note: if you are compiling Kaldi on Apple Silicon and ./install_kaldi.sh gets stuck right at the beginning while compiling sctk, you might need to remove -march=native from tools/kaldi/tools/Makefile, e.g. by commenting it out like this:

SCTK_CXFLAGS = -w #-march=native

Step 3: Install PyKaldi

If Kaldi is installed inside the tools directory and all Python dependencies (numpy, pyparsing, pyclif, protobuf) are installed in the active Python environment, you can install PyKaldi with the following command.

python setup.py install

Once installed, you can run PyKaldi tests with the following command.

python setup.py test

You can then also create a whl package, which makes it easy to install PyKaldi into a new project environment for your speech project.

python setup.py bdist_wheel

The whl file can then be found in the "dist" folder. The whl filename depends on the pykaldi version, your Python version and your architecture. For a Python 3.9 build on x86_64 with pykaldi 0.2.2 it may look like: dist/pykaldi-0.2.2-cp39-cp39-linux_x86_64.whl

Starting a new project with a pykaldi whl package

Create a new project folder, for example:

mkdir -p ~/projects/myASR
cd ~/projects/myASR

Create and activate a virtual environment with the same Python version as the whl package, e.g. for Python 3.9:

virtualenv -p /usr/bin/python3.9 myasr_env
. myasr_env/bin/activate

Install numpy and pykaldi into your myASR environment:

pip3 install numpy
pip3 install pykaldi-0.2.2-cp39-cp39-linux_x86_64.whl  

Copy pykaldi/tools/install_kaldi.sh to your myASR project. Use the install_kaldi.sh script to install a PyKaldi-compatible Kaldi version for your project:

./install_kaldi.sh

Copy pykaldi/tools/path.sh to your project. path.sh is used to make PyKaldi find the Kaldi libraries and binaries in the kaldi folder. Source it with:

. path.sh

Congratulations, you are ready to use pykaldi in your project!

Note: Anytime you open a new shell, you need to source the project environment and path.sh:

. myasr_env/bin/activate
. path.sh

Conda

Note: Unfortunately, the PyKaldi Conda packages are outdated. If you would like to maintain them, please get in touch with us.

To install PyKaldi with CUDA support:

conda install -c pykaldi pykaldi

To install PyKaldi without CUDA support (CPU only):

conda install -c pykaldi pykaldi-cpu

Note that the PyKaldi conda package does not provide Kaldi executables. If you would like to use Kaldi executables along with PyKaldi, e.g. as part of read/write specifiers, you need to install Kaldi separately.

Docker

Note: The docker instructions below may be outdated. If you would like to maintain a docker image for PyKaldi, please get in touch with us.

If you would like to use PyKaldi inside a Docker container, follow the instructions in the docker folder.

FAQ

How do I prevent PyKaldi install command from exhausting the system memory?

By default, the PyKaldi install command uses all available (logical) processors to accelerate the build process. If the system memory is relatively small compared to the number of processors, the parallel compilation/linking jobs might end up exhausting the system memory and result in swapping. You can limit the number of parallel jobs used for building PyKaldi as follows:

MAKE_NUM_JOBS=2 python setup.py install

How do I build PyKaldi on Windows?

We have no idea what is needed to build PyKaldi on Windows. It would probably require lots of changes to the build system.

How do I build PyKaldi using a different Kaldi installation?

At the moment, PyKaldi is not compatible with the upstream Kaldi repository. You need to build it against our Kaldi fork.

If you already have a compatible Kaldi installation on your system, you do not need to install a new one inside the pykaldi/tools directory. Instead, you can simply set the following environment variable before running the PyKaldi installation command.

export KALDI_DIR=<directory where Kaldi is installed, e.g. "$HOME/tools/kaldi">

How do I build PyKaldi using a different CLIF installation?

At the moment, PyKaldi is not compatible with the upstream CLIF repository. You need to build it using our CLIF fork.

If you already have a compatible CLIF installation on your system, you do not need to install a new one inside the pykaldi/tools directory. Instead, you can simply set the following environment variables before running the PyKaldi installation command.

export PYCLIF=<path to pyclif executable, e.g. "$HOME/anaconda3/envs/clif/bin/pyclif">
export CLIF_MATCHER=<path to clif-matcher executable, e.g. "$HOME/anaconda3/envs/clif/clang/bin/clif-matcher">

How do I update Protobuf, CLIF or Kaldi used by PyKaldi?

While the need for updating Protobuf and CLIF should not come up very often, you might want or need to update the Kaldi installation used for building PyKaldi. Rerunning the relevant install script in the tools directory should update the existing installation. If this does not work, please open an issue.

How do I build PyKaldi with Tensorflow RNNLM support?

The PyKaldi tfrnnlm package is built automatically along with the rest of PyKaldi if the kaldi-tensorflow-rnnlm library can be found among the Kaldi libraries. After building Kaldi, go to the KALDI_DIR/src/tfrnnlm/ directory and follow the instructions given in the Makefile. Make sure the symbolic link for the kaldi-tensorflow-rnnlm library is added to the KALDI_DIR/src/lib/ directory.

Projects using PyKaldi

Shennong - a toolbox for speech feature extraction (MFCC, PLP, etc.) using PyKaldi.

Kaldi model server - a threaded Kaldi model server for live decoding. It can directly decode speech from your microphone with an nnet3-compatible model. Example models for English and German are available. Uses the PyKaldi online2 decoder.

MeetingBot - an example web application for meeting transcription and summarization that uses a pykaldi/kaldi-model-server backend to display ASR output in the browser.

Subtitle2go - automatic subtitle generation for any media file. Uses PyKaldi for ASR with a batch decoder.

If you have a cool open source project that makes use of PyKaldi that you'd like to showcase here, let us know!

Citing

If you use PyKaldi for research, please cite our paper as follows:

@inproceedings{pykaldi,
  title = {PyKaldi: A Python Wrapper for Kaldi},
  author = {Dogan Can and Victor R. Martinez and Pavlos Papadopoulos and
            Shrikanth S. Narayanan},
  booktitle = {Acoustics, Speech and Signal Processing (ICASSP),
               2018 IEEE International Conference on},
  year = {2018},
  organization = {IEEE}
}

Contributing

We appreciate all contributions! If you find a bug, feel free to open an issue or a pull request. If you would like to request or add a new feature please open an issue for discussion.


Issues

ValueError: zero-size array to reduction operation minimum which has no identity

MWE:

from kaldi.matrix import Matrix
m = Matrix()
m

Traceback:

Out[3]: ---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/miniconda3/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
    691                 type_pprinters=self.type_printers,
    692                 deferred_pprinters=self.deferred_printers)
--> 693             printer.pretty(obj)
    694             printer.flush()
    695             return stream.getvalue()

~/miniconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    378                             if callable(meth):
    379                                 return meth(obj, self, cycle)
--> 380             return _default_pprint(obj, self, cycle)
    381         finally:
    382             self.end_group()

~/miniconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in _default_pprint(obj, p, cycle)
    493     if _safe_getattr(klass, '__repr__', None) is not object.__repr__:
    494         # A user-provided repr. Find newlines and replace them with p.break_()
--> 495         _repr_pprint(obj, p, cycle)
    496         return
    497     p.begin_group(1, '<')

~/miniconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    691     """A pprint that just redirects to the normal repr function."""
    692     # Find newlines and replace them with p.break_()
--> 693     output = repr(obj)
    694     for idx,output_line in enumerate(output.splitlines()):
    695         if idx:

~/Workspace/pykaldi/kaldi/matrix/__init__.py in __repr__(self)
    493 
    494     def __repr__(self):
--> 495         return str(self)
    496 
    497     def __str__(self):

~/Workspace/pykaldi/kaldi/matrix/__init__.py in __str__(self)
    502         # https://github.com/pytorch/pytorch/blob/master/torch/tensor.py
    503         if sys.version_info > (3,):
--> 504             return _str._matrix_str(self)
    505         else:
    506             if hasattr(sys.stdout, 'encoding'):

~/Workspace/pykaldi/kaldi/matrix/_str.py in _matrix_str(self, indent, formatter, force_truncate)
    156     if formatter is None:
    157         fmt, scale, sz = _number_format(self,
--> 158                                         min_sz=5 if not print_full_mat else 0)
    159     else:
    160         fmt, scale, sz = formatter

~/Workspace/pykaldi/kaldi/matrix/_str.py in _number_format(self, min_sz)
     92             break
     93 
---> 94     exp_min = temp.min()
     95     if exp_min != 0:
     96         exp_min = math.floor(math.log10(exp_min)) + 1

~/miniconda3/lib/python3.6/site-packages/numpy/core/_methods.py in _amin(a, axis, out, keepdims)
     27 
     28 def _amin(a, axis=None, out=None, keepdims=False):
---> 29     return umr_minimum(a, axis, None, out, keepdims)
     30 
     31 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):

ValueError: zero-size array to reduction operation minimum which has no identity

fatal error: unicode/ucnv.h: No such file or directory

Getting the following error when running the install_clif script.

(sfr_env) paperspace@ps3mpwqc8:~/SFR/start_follow_read/pykaldi/tools$ ./install_clif.sh
Destination /home/paperspace/SFR/start_follow_read/pykaldi/tools/clif already exists.
Already up-to-date.
/home/paperspace/anaconda3/envs/sfr_env/bin/ninja
Using ninja for the clif backend build.
Checked out revision 307315.
Checked out revision 307315.
-- Native target architecture is X86
-- Threads enabled.
-- Doxygen disabled.
-- Go bindings disabled.
-- Could NOT find OCaml (missing: OCAMLFIND OCAML_VERSION OCAML_STDLIB_PATH)
-- Could NOT find OCaml (missing: OCAMLFIND OCAML_VERSION OCAML_STDLIB_PATH)
-- OCaml bindings disabled.
-- LLVM host triple: x86_64-unknown-linux-gnu
-- LLVM default target triple: x86_64-unknown-linux-gnu
-- Building with -fPIC
-- Constructing LLVMBuild project information
-- Targeting X86
-- Could NOT find Z3 (missing: Z3_LIBRARIES Z3_INCLUDE_DIR) (Required is at least version "4.5")
-- Clang version: 5.0.0
-- Configuring done
-- Generating done
-- Build files have been written to: /home/paperspace/SFR/start_follow_read/pykaldi/tools/clif_backend/build_matcher
[1/11] Generating SVNVersion.inc
-- Found Subversion: /usr/bin/svn (found version "1.9.3")
[1/184] Building C object tools/clang/tools/c-index-test/CMakeFiles/c-index-test.dir/c-index-test.c.o
FAILED: tools/clang/tools/c-index-test/CMakeFiles/c-index-test.dir/c-index-test.c.o
/usr/bin/cc -DCLANG_ENABLE_ARCMT -DCLANG_ENABLE_OBJC_REWRITER -DCLANG_ENABLE_STATIC_ANALYZER -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_GLOBAL_ISEL -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/clang/tools/c-index-test -I/home/paperspace/SFR/start_follow_read/pykaldi/tools/clif_backend/llvm/tools/clang/tools/c-index-test -I/home/paperspace/SFR/start_follow_read/pykaldi/tools/clif_backend/llvm/tools/clang/include -Itools/clang/include -Iinclude -I/home/paperspace/SFR/start_follow_read/pykaldi/tools/clif_backend/llvm/include -isystem /home/paperspace/anaconda3/include/libxml2 -fPIC -Werror=date-time -Wall -W -Wno-unused-parameter -Wwrite-strings -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-comment -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=gnu89 -MMD -MT tools/clang/tools/c-index-test/CMakeFiles/c-index-test.dir/c-index-test.c.o -MF tools/clang/tools/c-index-test/CMakeFiles/c-index-test.dir/c-index-test.c.o.d -o tools/clang/tools/c-index-test/CMakeFiles/c-index-test.dir/c-index-test.c.o -c /home/paperspace/SFR/start_follow_read/pykaldi/tools/clif_backend/llvm/tools/clang/tools/c-index-test/c-index-test.c
In file included from /home/paperspace/anaconda3/include/libxml2/libxml/parser.h:810:0,
from /home/paperspace/SFR/start_follow_read/pykaldi/tools/clif_backend/llvm/tools/clang/tools/c-index-test/c-index-test.c:15:
/home/paperspace/anaconda3/include/libxml2/libxml/encoding.h:31:26: fatal error: unicode/ucnv.h: No such file or directory
compilation terminated.
[2/184] Linking CXX executable bin/verify-uselistorder
ninja: build stopped: subcommand failed.

Phoneme Aligner example

Considering other (closed) issues that seemed to be making similar requests (that were denied), I'm not sure this is a legitimate request, but I'll regret not asking since I really need this.

Would you guys be able to help me write (or even better, write yourself 😛 ) an example file for phoneme forced alignment with PyKaldi?

Unit tests

I think it is a good time to start writing unit tests to check for regressions.

@vrmpx Can you come up with a scaffold for the tests? Maybe we can do something like this.

Use CMake for building extension modules

Building pykaldi takes a long time since we cannot parallelize the build using setuptools. Also, our current build system does not detect changes in some dependencies, like the C++ headers in the pykaldi source tree. This will only get worse as the code base grows. We should try and see if we can build the extension modules with CMake.

Implement

Hi, I noticed that nnet3 is implemented in pykaldi. Is there any chance to also include nnet3bin (kaldi)? Furthermore, can the xvector pipeline be implemented in pykaldi? Thank you very much.

How to decode an LDA model?

Hello,
Can you please give me a hint on how to decode an LDA/MLLR model?

In the "steps/decode.sh", the following features are used (from WSJ recipe):

  lda) feats="ark,s,cs:apply-cmvn $cmvn_opts --utt2spk=ark:$sdata/JOB/utt2spk scp:$sdata/JOB/cmvn.scp scp:$sdata/JOB/feats.scp ark:- | splice-feats $splice_opts ark:- ark:- | transform-feats $srcdir/final.mat ark:- ark:- |";;

While I think I found the function for doing the splice, I have no idea how to do the "transform-feats" step.

# assuming splice_frames and DeltaFeaturesOptions come from kaldi.feat.functions
from kaldi.feat.functions import DeltaFeaturesOptions, splice_frames

def make_feat_pipeline(base, opts=DeltaFeaturesOptions()):
    def feat_pipeline(wav):
        feats = base.compute_features(wav.data()[0], wav.samp_freq, 1.0)
        return splice_frames(feats, 4, 4)
    return feat_pipeline

Can you please provide some hints on where to start?
Thanks

Compatibility of PyKaldi with Kaldi 5.1

Calculating GOP requires Kaldi 5.1, and the MFCCs calculated by Kaldi 5.1 and the latest version of Kaldi
are different, so I must use Kaldi 5.1 to calculate MFCCs. I have tried two methods:

(1) Replace the whole tools/kaldi with Kaldi 5.1.
(2) Only replace tools/kaldi/src/featbin/compute-mfcc-feats with compute-mfcc-feats from Kaldi 5.1.

But neither of these methods works. Is there any way to make PyKaldi compatible with Kaldi 5.1? I only need the module for calculating MFCCs.

Thank you for your help!

conda install fail

conda install -c pykaldi pykaldi
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  • pykaldi

Current channels:

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.

htk to numpy

Hi,
how do I convert an ark file or HTK file to a numpy array?

import kaldi
import numpy

f1="This_is_htk_file.lps"
f2="This_is_ark_file.ark"

How do you build `kaldi/lib/_clif.so`?

I am trying to build a wheel for this package. I managed to encapsulate kaldi shared lib into the wheel using auditwheel. Unfortunately it still fails with

    from ._kaldi_error import *
ImportError: _clif.so: cannot open shared object file: No such file or directory

I don't understand how you build kaldi/lib/_clif.so from sources in pykaldi/kaldi/lib/clif/python/. Is there some code missing in the repo?

installation fail

Hi, I cannot install pykaldi.

When running python setup.py install I get this error:

Using PYCLIF: /home/mpavlov/anaconda/envs/test/bin/pyclif
Using CLIF_MATCHER: /home/mpavlov/anaconda/envs/test/clang/bin/clif-matcher
-- Configuring done
-- Generating done
-- Build files have been written to: /home/mpavlov/voice/pykaldi/build
ninja: error: '/home/mpavlov/anaconda/envs/kaldi/lib/libpython3.5m.so', needed by 'lib/kaldi/_clif.so', missing and no known rule to make it
Command '['ninja']' returned non-zero exit status 1

subprocess.CalledProcessError: Command '['which', 'pyclif']' returned non-zero exit status 1.

(base) root@analytics-server:~/speechToText/pykaldi# python3 setup.py install
Traceback (most recent call last):
File "setup.py", line 44, in
PYCLIF = check_output(['which', 'pyclif'])
File "setup.py", line 19, in check_output
return subprocess.check_output(*args, **kwargs).decode("utf-8").strip()
File "/root/anaconda3/lib/python3.7/subprocess.py", line 376, in check_output
**kwargs).stdout
File "/root/anaconda3/lib/python3.7/subprocess.py", line 468, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['which', 'pyclif']' returned non-zero exit status 1.

Lattice to_bytes from_bytes doesn't work

I'm trying to serialize a lattice into a string:

print("type(lattice)", type(lattice))
lattice_bytes = lattice.to_bytes()
lat = CompactLatticeVectorFst.from_bytes(lattice_bytes)
print(lat)

And I get this error:

type(lattice) <class 'kaldi.fstext.CompactLatticeVectorFst'>
ERROR: GenericRegister::GetEntry: vector-fst.so: cannot open shared object file: No such file or directory
ERROR: Fst::Read: Unknown FST type vector (arc type = compactlattice44): StringToFst

How to get the value of gmm parameters?

The get_means(), get_vars() and weights_ members of the GMM classes return _kaldi_matrix.Matrix and _kaldi_vector.Vector instances rather than Matrix or Vector instances. These instances don't support numpy() conversion. I looked through all their member values and functions, but couldn't figure out how to get their values into a numpy array or print them to the screen.
Can anyone help me with this problem? Thanks~

New decodable interface

Hey, I'm trying to implement a (very simple) decodable interface that just wraps a 2D numpy array of log-likelihoods. However, after I instantiate the lattice decoder object and call

decoder.decode(loglikeDecodable)

I always get this error:
ValueError: decode() argument decodable is not valid: Value invalidated due to capture by std::unique_ptr.

Any ideas why this might be happening? Tried to find examples but couldn't anywhere. Attached the simple class definition below.

from kaldi.itf._decodable_itf import DecodableInterface

class DecodableLogLike(DecodableInterface):
    def __init__(self, loglikes):
        """
        loglikes have shape (b, T, C)
        b: batch size
        T: frame length
        C: alphabet size
        """
        super().__init__()  # initialize the wrapped base class
        self.loglikes = loglikes[0, :, :]
        self.T = self.loglikes.shape[0]
        self.C = self.loglikes.shape[1]

    def is_last_frame(self, frame):
        """
        Params:
            frame: int
        """
        return frame == self.T - 1

    def log_likelihood(self, frame, index):
        """
        Params:
            frame: int
            index: int
        """
        return self.loglikes[frame, index]

    def num_frames_ready(self):
        return self.T

    def num_indices(self):
        return self.C

A bug with compute_deltas

The output of compute_deltas is an instance of _kaldi_matrix.Matrix:
AttributeError: '_kaldi_matrix.Matrix' object has no attribute 'numpy'.

No such file or directory: 'kaldi/__version__.py'

when i run "python setup.py install",it excepts this error.
I m sure that I have install all of c++ library before execute this command.
and I have run"find" and "locate" to find version.py,but it has no result.
Please give me some advise,and what should I do now???

List the error traceBack as follow:

Traceback (most recent call last):
File "setup.py", line 309, in <module>
with open(os.path.join('kaldi', '__version__.py')) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'kaldi/__version__.py'

Calling ostringstream.to_str() after binary write causes UnicodeDecodeError

Maybe this is the intended behavior? However, I do not get that impression from reading Kaldi's test code.

#!/usr/bin/env python3
from kaldi.hmm.topology import HmmTopology
from kaldi.base.io import istringstream, ostringstream

input_str = """<Topology>
        <TopologyEntry>
        <ForPhones> 1 2 3 4 5 6 7 8 9 </ForPhones>
        <State> 0 <PdfClass> 0 
        <Transition> 0 0.5
        <Transition> 1 0.5
        </State>
        <State> 1 <PdfClass> 1
        <Transition> 1 0.5
        <Transition> 2 0.5
        </State>
        <State> 2 <PdfClass> 2
        <Transition> 2 0.5
        <Transition> 3 0.5
        </State>
        <State> 3 </State>
        </TopologyEntry>
        <TopologyEntry>
        <ForPhones> 10 11 13 </ForPhones>
        <State> 0 <PdfClass> 0
        <Transition> 0 0.5
        <Transition> 1 0.5
        </State>
        <State> 1 <PdfClass> 1
        <Transition> 1 0.5
        <Transition> 2 0.5
        </State>
        <State> 2 </State>
        </TopologyEntry>
        </Topology>"""

iss = istringstream.new_from_str(input_str)
oss = ostringstream()

topo = HmmTopology()
topo.read(iss, binary = False)
topo.write(oss, binary = True)

print(oss.to_str())

Error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 69: invalid start byte

While I haven't had a chance to test it, I'm pretty sure this will also happen in nnet3.

nnet3 online with endpointing fails

Hi @dogancan
We tried to run your example nnet3 online decoding script.
The first two methods work, but the third one with endpointing fails with the following error:

ASSERTION_FAILED ([5.4.49~2-5eb7]:EnsureFrameIsComputed():nnet3/decodable-online-looped.h:94) : 'subsampled_frame >= current_log_post_subsampled_offset_ && "Frames must be accessed in order."' 
Traceback (most recent call last):
  File "./nnet3-online-recognizer.py", line 121, in <module>
    asr.advance_decoding()
  File "/mosesgpu-184-storage/kaldi_tools/online/env3/lib/python3.6/site-packages/pykaldi-0.0.9-py3.6-linux-x86_64.egg/kaldi/asr.py", line 1202, in advance_decoding
    self.decoder.advance_decoding(self._decodable, max_num_frames)
RuntimeError: C++ exception: 
WARNING ([5.4.49~2-5eb7]:~HashList():util/hash-list-inl.h:117) Possible memory leak: 18431 != 18432: you might have forgotten to call Delete on some Elems

We've been playing around a bit with the NnetLatticeFasterOnlineRecognizer class and we noticed the following behaviour:

  • when we call finalize_decoding() and then init_decoding() it fails with the same error as soon as we try to advance_decoding() with new features;
  • the sequence finalize_decoding(), get_output(), init_decoding() works, but the whole context since the beginning of the decoding remains, whereas we thought calling finalize_decoding() would 'reset' the utterance.

Is it possible with the existing code and wrappers to 'reset' the utterance inside the same NnetLatticeFasterOnlineRecognizer object?

Size method of SpMatrix

Hi. I fitted a full GMM with Kaldi, then extracted the covariance matrix and tried to call the size method, but failed. According to the documentation this method should exist: (https://pykaldi.github.io/api/kaldi.matrix.html#kaldi.matrix.packed.SpMatrix.size)
Is this expected behavior?

I just wanted to convert an SpMatrix to a Matrix and then to a numpy array. But when I call the Matrix constructor with an SpMatrix, I get an error that there is no size attribute.

Here is code:

gmm = FullGmm()
with xopen('final.ubm') as f:
    gmm.read(f.stream(), f.binary)
covar = gmm.get_covars()[0]
print (covar.size)

error:

AttributeError Traceback (most recent call last)
in ()
3 gmm.read(f.stream(), f.binary)
4 covar = gmm.get_covars()[0]
----> 5 print (covar.size)
AttributeError: '_sp_matrix.SpMatrix' object has no attribute 'size'

DecodableNnetSimpleLoopedInfo not valid for AmNnetSimple model

My code is:

from kaldi.nnet3 import DecodableNnetSimpleLoopedInfo as from_am
decodable_opts = NnetSimpleLoopedComputationOptions()
decodable_opts.frames_per_chunk = 20
decodable_opts.extra_left_context_initial = 10
decodable_opts.frame_subsampling_factor = 1
decodable_opts.acoustic_scale = 1.0

acoustic_model = AmNnetSimple()
decodable_info = from_am(decodable_opts, acoustic_model)

and the error report is:

decodable_info = from_am(decodable_opts, acoustic_model)
TypeError: init() argument nnet is not valid for ::kaldi::nnet3::Nnet * (_am_nnet_simple.AmNnetSimple instance given): expecting _nnet_nnet.Nnet instance, got _am_nnet_simple.AmNnetSimple instance

in the /nnet3/decodable-simple-looped.clif, it shows

@add__init__
def DecodableNnetSimpleLoopedInfo as from_am(
self, opts: NnetSimpleLoopedComputationOptions, nnet: AmNnetSimple)

so AmNnetSimple should work... I don't know where the problem is.

Runtime Error using nnet3 decoder with Cuda

Hi,

I got a Runtime error when running an nnet3-based decoder using the GPU. My model works fine running on a CPU, but once I set CuDevice.instantiate().select_gpu_id('yes'), the following error occurs:

WARNING ([5.4.45~1-6e639]:SelectGpuId():cu-device.cc:192) Not in compute-exclusive mode.  Suggestion: use 'nvidia-smi -c 3' to set compute exclusive mode
LOG ([5.4.45~1-6e639]:SelectGpuIdAuto():cu-device.cc:311) Selecting from 1 GPUs
LOG ([5.4.45~1-6e639]:SelectGpuIdAuto():cu-device.cc:326) cudaSetDevice(0): GeForce GTX 1080 Ti	free:10734M, used:430M, total:11164M, free/total:0.961468
LOG ([5.4.45~1-6e639]:SelectGpuIdAuto():cu-device.cc:375) Trying to select device: 0 (automatically), mem_ratio: 0.961468
LOG ([5.4.45~1-6e639]:SelectGpuIdAuto():cu-device.cc:394) Success selecting device 0 free mem ratio: 0.961468
LOG ([5.4.45~1-6e639]:FinalizeActiveGpu():cu-device.cc:243) The active GPU is [0]: GeForce GTX 1080 Ti	free:10670M, used:494M, total:11164M, free/total:0.955735 version 6.1
audio length: 0.760000 s
ERROR ([5.4.45~1-6e639]:RemoveLeastRecentlyUsed():cu-allocator.cc:349) cudaError_t 77 : "an illegal memory access was encountered" returned from 'cudaFree(p.first.pointer)'
WARNING ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:435) Printing some background info since error was detected
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:436) matrix m1(1, 100), m2(70, 40), m3(70, 300), m4(66, 1536), m5(64, 1536), m6(62, 1536), m7(56, 1536), m8(50, 512), m9(50, 5686), m10(50, 5686)
# The following show how matrices correspond to network-nodes and
# cindex-ids.  Format is: matrix = <node-id>.[value|deriv][ <list-of-cindex-ids> ]
# where a cindex-id is written as (n,t[,x]) but ranges of t values are compressed
# so we write (n, tfirst:tlast).
m1 == value: ivector[(0,0)]
m2 == value: input[(0,-13:56)]
m3 == value: tdnn1.affine_input[(0,-11:54), (0,-2147483648), (0,-2147483648), (0,-2147483648), (0,-2147483648)]
m4 == value: tdnn2.affine_input[(0,-10:53), (0,-2147483648), (0,-2147483648)]
m5 == value: tdnn3.affine_input[(0,-9:52), (0,-2147483648), (0,-2147483648)]
m6 == value: tdnn4.affine_input[(0,-6:49), (0,-2147483648), (0,-2147483648), (0,-2147483648), (0,-2147483648), (0,-2147483648), (0,-2147483648)]
m7 == value: tdnn5.affine_input[(0,0:49), (0,-2147483648), (0,-2147483648), (0,-2147483648), (0,-2147483648), (0,-2147483648), (0,-2147483648)]
m8 == value: tdnn5.affine[(0,0:49)]
m9 == value: output.affine[(0,0:49)]
m10 == value: output.log-softmax[(0,0:49)]
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c0: m1 = user input [for node: 'ivector']
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c1: m2 = user input [for node: 'input']
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c2: [no-op-permanent]
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c3: m3 = undefined(70,300)
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c4: m3(0:69, 0:39) = m2
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c5: m3(0:65, 40:79) = m2(1:66, 0:39)
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c6: m3(0:65, 80:119) = m2(2:67, 0:39)
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c7: m3(0:65, 120:159) = m2(3:68, 0:39)
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c8: m3(0:65, 160:199) = m2(4:69, 0:39)
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c9: m2 = []
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c10: m3(0:65, 200:299).CopyRows(1, m1[0x66])
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c11: m1 = []
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c12: m4 = undefined(66,1536)
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c13: lda.tdnn1.affine.Propagate(NULL, m3(0:65, 0:299), &m4(0:65, 0:511))
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c14: m3 = []
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c15: tdnn1.relu.Propagate(NULL, m4(0:65, 0:511), &m4(0:65, 0:511))
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c16: m4(0:63, 512:1023) = m4(1:64, 0:511)
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c17: m4(0:63, 1024:1535) = m4(2:65, 0:511)
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c18: m5 = undefined(64,1536)
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c19: tdnn1.batchnorm.tdnn2.affine.Propagate(NULL, m4(0:63, 0:1535), &m5(0:63, 0:511))
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c20: m4 = []
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c21: tdnn2.relu.Propagate(NULL, m5(0:63, 0:511), &m5(0:63, 0:511))
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c22: m5(0:61, 512:1023) = m5(1:62, 0:511)
LOG ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:438) c23: m5(0:61, 1024:1535) = m5(2:63, 0:511)
ERROR ([5.4.45~1-6e639]:ExecuteCommand():nnet-compute.cc:442) Error running command c24: m6 = undefined(62,1536)
Traceback (most recent call last):
  File "scoreit_pykaldi.py", line 102, in <module>
    kaldi_model=kaldi_model)
  File "scoreit_pykaldi.py", line 37, in run_pykaldi
    kaldi_model.nnet3_decode(utterance)
  File "/home/eflsd/dev/Projects/librispeech/dev/pyscripts/interface/KaldiModel.py", line 161, in nnet3_decode
    self.decoder.decode(nnet_decodable)
RuntimeError: C++ exception: 
ERROR ([5.4.45~1-6e639]:Free():cu-allocator.cc:285) Attempt to free CUDA memory pointer that was not allocated: 0x55c38fd02d90
terminate called after throwing an instance of 'std::runtime_error'
  what():  
Aborted (core dumped)

Do you know why this is happening?

CLIF installation failing due to compatibility issue with Protobuf

"$MAKE_OR_NINJA" "${MAKE_PARALLELISM[@]}" clif-matcher clif_python_utils_proto_util line 144 in pykaldi/install_clif.sh failing with the following error.

[1785/1788] Generating proto_util.cc, proto_util.h, proto_util.init.cc
FAILED: tools/clif/python/utils/proto_util.cc tools/clif/python/utils/proto_util.h tools/clif/python/utils/proto_util.init.cc 
cd /pykaldi/tools/clif_backend/build_matcher/tools/clif/python/utils && PYTHONPATH=/pykaldi/tools/clif_backend/build_matcher/tools:/pykaldi/tools/clif_backend/llvm/tools /opt/conda/bin/python /pykaldi/tools/clif_backend/llvm/tools/clif/pyclif.py -p/pykaldi/tools/clif_backend/llvm/tools/clif/python/types.h -c/pykaldi/tools/clif_backend/build_matcher/tools/clif/python/utils/proto_util.cc -g/pykaldi/tools/clif_backend/build_matcher/tools/clif/python/utils/proto_util.h -i/pykaldi/tools/clif_backend/build_matcher/tools/clif/python/utils/proto_util.init.cc -I/pykaldi/tools/clif_backend/llvm/tools -I/pykaldi/tools/clif_backend/build_matcher/tools --modname=clif.python.utils.proto_util --matcher_bin=/pykaldi/tools/clif_backend/build_matcher/bin/clif-matcher "-f-I/opt/conda/include/python3.6m -I/pykaldi/tools/clif_backend/llvm/tools -I/pykaldi/tools/clif_backend/build_matcher/tools -I/pykaldi/tools/protobuf/include -std=c++11 " /pykaldi/tools/clif_backend/llvm/tools/clif/python/utils/proto_util.clif
Traceback (most recent call last):
  File "/pykaldi/tools/clif_backend/llvm/tools/clif/pyclif.py", line 37, in <module>
    from clif.protos import ast_pb2
  File "/pykaldi/tools/clif_backend/build_matcher/tools/clif/protos/ast_pb2.py", line 22, in <module>
    serialized_pb=_b('\n\tast.proto\x12\x0b\x63lif.protos\"\xdf\x01\n\x03\x41ST\x12\x0e\n\x06source\x18\x01 \x01(\t.......')
TypeError: __new__() got an unexpected keyword argument 'serialized_options'
ninja: build stopped: subcommand failed.

In /pykaldi/tools/clif_backend/build_matcher/tools/clif/protos/ast_pb2.py, the generated FileDescriptor call includes the parameter serialized_options=None, which was added to Protobuf only recently:
protocolbuffers/protobuf@0400cca#diff-d6de91caf3b3a819abb247fae7f62938

from google.protobuf import descriptor as _descriptor

DESCRIPTOR = _descriptor.FileDescriptor(
  name='ast.proto',
  package='clif.protos',
  syntax='proto2',
  serialized_options=None,
  serialized_pb=_b('\n\tast.......
....
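The TypeError therefore points at a version skew: ast_pb2.py was generated by a protoc that already emits serialized_options, while the installed Python protobuf runtime predates the keyword (it appeared around protobuf 3.6). A quick check of the runtime version, assuming a pip-installed protobuf:

import google.protobuf

# Print the protobuf runtime version; a runtime older than the protoc
# that generated ast_pb2.py will reject the serialized_options keyword.
print(google.protobuf.__version__)

Upgrading the runtime (for example, pip install --upgrade protobuf) or regenerating ast_pb2.py with a matching protoc should remove the mismatch.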

Use cmake3 for installation on CentOS

The CentOS package manager (yum) mirrors ship cmake 2.6 by default and package CMake versions >= 3.0 as cmake3. Hence it would be useful to include a script that takes care of this; it shouldn't be a problem, just replace cmake with cmake3 in the installation script for clif, as sketched below.
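A minimal sketch of that selection logic in Python (assuming Python 3.3+ for shutil.which); the same fallback can be written as a one-liner in a shell script:

import shutil

# Prefer the cmake3 binary that CentOS repositories provide for CMake >= 3.0;
# fall back to plain cmake on systems where it is already recent enough.
cmake = shutil.which("cmake3") or shutil.which("cmake")
if cmake is None:
    raise RuntimeError("no cmake or cmake3 executable found on PATH")
print("using:", cmake)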

Build fails on most recent commit

[ 78%] Generating drawer-clifwrap.cc, drawer-clifwrap.h, drawer-clifwrap-init.cc

Line .123456789.123456789.123456789.123456789
1:from "fstext/lattice-weight-clifwrap.h" import *\n
2:from "fstext/symbol-table-clifwrap.h" import *\n
3:from "fstext/fst-clifwrap.h" import *\n
4:from "util/ios-clifwrap.h" import *\n
5:\n
6:from "fst/script/draw-impl.h":\n
7: namespace fst:\n
8:\n
9: class FstDrawer<StdArc> as StdFstDrawer:\n
10: def init(self, fst: StdFst, isyms: SymbolTable, osyms: SymbolTable,\n
11: ssyms: SymbolTable, accep: bool, title: str, width: float,\n
12: height: float, portrait: bool, vertical: bool,\n
13: ranksep: float, nodesep: float, fontsize: int,\n
14: precision: int, float_format: str, show_weight_one: bool)\n
15:\n
16: def Draw(self, strm: ostream, dest: str)\n
17:\n
18: class FstDrawer<LogArc> as LogFstDrawer:\n
19: def init(self, fst: LogFst, isyms: SymbolTable, osyms: SymbolTable,\n
20: ssyms: SymbolTable, accep: bool, title: str, width: float,\n
21: height: float, portrait: bool, vertical: bool,\n
22: ranksep: float, nodesep: float, fontsize: int,\n
23: precision: int, float_format: str, show_weight_one: bool)\n
24:\n
25: def Draw(self, strm: ostream, dest: str)\n
26:\n
27: class FstDrawer<ArcTpl<LatticeWeightTpl<float>>> as LatticeFstDrawer:\n
28: def init(self, fst: LatticeFst, isyms: SymbolTable,\n
29: osyms: SymbolTable, ssyms: SymbolTable,\n
30: accep: bool, title: str, width: float,\n
31: height: float, portrait: bool, vertical: bool,\n
32: ranksep: float, nodesep: float, fontsize: int,\n
33: precision: int, float_format: str, show_weight_one: bool)\n
34:\n
35: def Draw(self, strm: ostream, dest: str)\n
36:\n
37: class FstDrawer<ArcTpl<CompactLatticeWeightTpl<LatticeWeightTpl<float>,int32>>>\n
38: as CompactLatticeFstDrawer:\n
39: def init(self, fst: CompactLatticeFst, isyms: SymbolTable,\n
40: osyms: SymbolTable, ssyms: SymbolTable,\n
41: accep: bool, title: str, width: float,\n
42: height: float, portrait: bool, vertical: bool,\n
43: ranksep: float, nodesep: float, fontsize: int,\n
44: precision: int, float_format: str, show_weight_one: bool)\n
45:\n
46: def Draw(self, strm: ostream, dest: str)\n
_ParseError: include "util/ios-clifwrap.h" not found
kaldi/fstext/CMakeFiles/drawer_pyclif.dir/build.make:62: recipe for target 'kaldi/fstext/drawer-clifwrap.cc' failed
make[2]: *** [kaldi/fstext/drawer-clifwrap.cc] Error 3
CMakeFiles/Makefile2:2459: recipe for target 'kaldi/fstext/CMakeFiles/drawer_pyclif.dir/all' failed
make[1]: *** [kaldi/fstext/CMakeFiles/drawer_pyclif.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

setup.py install fail

While trying to run the setup.py install script (on Python 3.6, Ubuntu 18.04), I got this error:

Traceback (most recent call last):
  File "setup.py", line 44, in <module>
    PYCLIF = check_output(['which', 'pyclif'])
  File "setup.py", line 19, in check_output
    return subprocess.check_output(*args, **kwargs).decode("utf-8").strip()
  File "/home/madarb/anaconda3/envs/SPEAKER/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/home/madarb/anaconda3/envs/SPEAKER/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['which', 'pyclif']' returned non-zero exit status 1.

protobuf, clif and kaldi all installed successfully.

I would like to get some help with this error, thanks!
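The traceback simply means that which found no pyclif executable on PATH. A hypothetical, more forgiving lookup for setup.py; the PYCLIF environment variable here is an illustrative assumption, not necessarily honored by every pykaldi version:

import os
import shutil

# Check an explicit override first, then fall back to a PATH search.
PYCLIF = os.environ.get("PYCLIF") or shutil.which("pyclif")
if not PYCLIF:
    raise RuntimeError(
        "pyclif not found; activate the environment where clif was "
        "installed, or set PYCLIF to the full path of the executable.")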

install issue

Traceback (most recent call last):
  File "setup.py", line 44, in <module>
    PYCLIF = check_output(['which', 'pyclif'])
  File "setup.py", line 19, in check_output
    return subprocess.check_output(*args, **kwargs).decode("utf-8").strip()
  File "/anaconda3/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/anaconda3/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['which', 'pyclif']' returned non-zero exit status 1.

Can I change the Kaldi source when compiling the tool, to support online and nnet?

Hello, I have read the following introduction:

Coverage Status
The following table shows the status of each PyKaldi package (we currently do not plan to add support for nnet, nnet2 and online) along the following dimensions

If I need to use the online and nnet parts, could I just change the Kaldi source path and recompile it?

Thanks!

Build fails in most recent commit

Build fails for commit 508be78 on cu-sparse-matrix.

Build log:

[ 20%] Generating cu-sparse-matrix-clifwrap.cc, cu-sparse-matrix-clifwrap.h, cu-sparse-matrix-clifwrap-init.cc
cd /root/pykaldi/build/kaldi/cudamatrix && /root/opt/clif/bin/pyclif --matcher_bin=/root/opt/clif/clang/bin/clif-matcher --ccdeps_out /root/pykaldi/build/kaldi/cudamatrix/cu-sparse-matrix-clifwrap.cc --header_out /root/pykaldi/build/kaldi/cudamatrix/cu-sparse-matrix-clifwrap.h --ccinit_out /root/pykaldi/build/kaldi/cudamatrix/cu-sparse-matrix-clifwrap-init.cc --modname=_cu_sparse_matrix --prepend=clif/python/types.h -I/root/pykaldi -I/root/pykaldi/kaldi -I/root/pykaldi/build/kaldi -I/root/kaldi/src "-f-I/usr/include/python2.7 -I/root/pykaldi -I/root/pykaldi/kaldi -I/root/pykaldi/build/kaldi -I/root/kaldi/src -I/root/kaldi/tools/openfst/include -I/root/kaldi/tools/ATLAS/include -I/clang/lib/clang/5.0.0/include -std=c++11 -I.. -I/root/kaldi/tools/openfst/include -Wall -Wno-sign-compare -Wno-unused-local-typedefs -Wno-deprecated-declarations -Winit-self -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -I/root/kaldi/tools/ATLAS_headers/include -msse -msse2 -pthread -g -fPIC" /root/pykaldi/kaldi/cudamatrix/cu-sparse-matrix.clif
In file included from /dev/stdin:5:
/root/pykaldi/build/kaldi/cudamatrix/cu-array-clifwrap.h:36:56: error: no member named 'Int32Pair' in the global namespace
bool Clif_PyObjAs(PyObject* input, ::kaldi::CuArray< ::Int32Pair>** output);
                                                     ~~^
/root/pykaldi/build/kaldi/cudamatrix/cu-array-clifwrap.h:37:72: error: no member named 'Int32Pair' in the global namespace
bool Clif_PyObjAs(PyObject* input, std::shared_ptr<::kaldi::CuArray< ::Int32Pair>>* output);
                                                                     ~~^
/root/pykaldi/build/kaldi/cudamatrix/cu-array-clifwrap.h:37:92: error: expected a type
bool Clif_PyObjAs(PyObject* input, std::shared_ptr<::kaldi::CuArray< ::Int32Pair>>* output);
                                                                                           ^
/root/pykaldi/build/kaldi/cudamatrix/cu-array-clifwrap.h:37:92: error: expected ')'
/root/pykaldi/build/kaldi/cudamatrix/cu-array-clifwrap.h:37:18: note: to match this '('
bool Clif_PyObjAs(PyObject* input, std::shared_ptr<::kaldi::CuArray< ::Int32Pair>>* output);
                 ^
/root/pykaldi/build/kaldi/cudamatrix/cu-array-clifwrap.h:38:72: error: no member named 'Int32Pair' in the global namespace
bool Clif_PyObjAs(PyObject* input, std::unique_ptr<::kaldi::CuArray< ::Int32Pair>>* output);

This was tested on a clean installation (Docker container).

test failure

I installed PyKaldi on Ubuntu 16.04. Installation went well without any errors, but when I ran the command python setup.py test, the following error was reported:
running test
running egg_info
writing requirements to pykaldi.egg-info/requires.txt
writing pykaldi.egg-info/PKG-INFO
writing top-level names to pykaldi.egg-info/top_level.txt
writing dependency_links to pykaldi.egg-info/dependency_links.txt
reading manifest file 'pykaldi.egg-info/SOURCES.txt'
writing manifest file 'pykaldi.egg-info/SOURCES.txt'
running build_ext
Using PYCLIF: /home/pankaj/asr/pykaldi/env/bin/pyclif
Using CLIF_MATCHER: /home/pankaj/asr/pykaldi/env/clang/bin/clif-matcher
-- Configuring done
-- Generating done
-- Build files have been written to: /home/pankaj/asr/pykaldi/build
ninja: no work to do.

copying build/lib/kaldi/_clif.so -> kaldi
error: can't copy 'build/lib/kaldi/asr.so': doesn't exist or not a regular file

Rescoring output with grammar

The current Kaldi source code includes a binary, lattice-lmrescore, which makes it possible to run recognition with the original HCLG.fst and then rescore the recognition output according to a new grammar G.fst.
Does your package offer similar functionality, and if not, what would be a possible solution using your package?

I'm aware that this isn't really an issue, but I don't know of any other place to ask the question :)

Calling TrainingGraphCompiler(...) gives segmentation fault error

Calling TrainingGraphCompiler(trans_model, ctx_dep, lex_fst, disambig_syms, opts) in kaldi.decoder gives a segmentation fault (core dump) error.

My code is as simple as follows:

    from kaldi.decoder import (TrainingGraphCompiler,
                               TrainingGraphCompilerOptions)
    from kaldi.fstext import StdVectorFst
    from kaldi.hmm import TransitionModel
    from kaldi.tree import ContextDependency
    from kaldi.util.io import xopen

    # loading models
    model_in_filename = "final.mdl"
    decode_fst_filename = "HCLG.fst"
    tree_filename = "tree"
    lex_fst_filename = "L.fst"
    disambig_filename = "disambig.int"
    
    trans_model = TransitionModel()
    with xopen(model_in_filename) as ki:
        trans_model.read(ki.stream(), ki.binary)
    
    ctx_dep = ContextDependency()
    with xopen(tree_filename) as ki:
        ctx_dep.read(ki.stream(), ki.binary)
    
    lex_fst = StdVectorFst.read(lex_fst_filename)

    disambig_syms = []
    with open(disambig_filename) as f:
        for line in f.read().splitlines():
            disambig_syms.append(int(line))

    gopts = TrainingGraphCompilerOptions()
    gopts.transition_scale = 0.0
    gopts.self_loop_scale = 0.0

    # Initializing train graph compiler
    TrainingGraphCompiler(trans_model, ctx_dep, lex_fst, disambig_syms, gopts)

I didn't have a problem loading the parameters, and I have confirmed that it is this initialization call that causes the error. Does anyone know why this happens and how to fix it?

Most efficient way to get Numpy array from reader?

Hi,

I want to read in integer vectors and manipulate them as NumPy arrays. I noticed that if I use rxfilenames with kaldi.util.io.read_vector, I get a Vector object on which I can call numpy() efficiently. But if I use SequentialIntVectorReader, the value returned is a plain list. Would calling Vector on that list copy the underlying memory? What's the most efficient way to handle this use case?

Thanks for writing PyKaldi!
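A minimal sketch of the direct route, assuming the table is read with SequentialIntVectorReader (the rspecifier "ark:ali.ark" is a placeholder): the reader yields plain Python lists, so one copy into an ndarray is unavoidable either way; np.asarray makes exactly that one copy, whereas going through Vector would copy too and also convert the integers to floats.

import numpy as np
from kaldi.util.table import SequentialIntVectorReader

# "ark:ali.ark" is a placeholder rspecifier; substitute your own table.
with SequentialIntVectorReader("ark:ali.ark") as reader:
    for key, value in reader:
        # value is a plain Python list of ints; convert it once, directly.
        arr = np.asarray(value, dtype=np.int32)
        print(key, arr.shape)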

Build fails on lisa

Attached build.txt

Maybe Kaldi on lisa was built without CUDA (HAVE_CUDA=0), so the class does not get defined and clif cannot find it.

Error:

/home/victor/Workspace/pykaldi/kaldi/cudamatrix/cu-device.clif:5:8: error: no type named 'CuTimer' in
      namespace 'kaldi'; did you mean 'Timer'?
kaldi::CuTimer
~~~~~~~^~~~~~~
       Timer
/home/dogan/tools/kaldi/src/base/timer.h:63:7: note: 'Timer' declared here
class Timer {
      ^
/home/victor/Workspace/pykaldi/kaldi/cudamatrix/cu-device.clif:8:8: error: no type named 'CuDevice' in
      namespace 'kaldi'
kaldi::CuDevice
~~~~~~~^
/home/victor/Workspace/pykaldi/kaldi/cudamatrix/cu-device.clif:10:71: error: base specifier must name a
      class
template<class clif_unused_template_arg_1> class clif_class_1: public clif_type_1 { public:
                                                               ~~~~~~~^~~~~~~~~~~
/home/victor/Workspace/pykaldi/kaldi/cudamatrix/cu-device.clif:9:8: error: no type named 'CuDevice' in
      namespace 'kaldi'
kaldi::CuDevice
~~~~~~~^
_BackendError: Matcher failed with status 1
kaldi/cudamatrix/CMakeFiles/cu_device_pyclif.dir/build.make:62: recipe for target 'kaldi/cudamatrix/cu-device-clifwrap.cc' failed
make[2]: *** [kaldi/cudamatrix/cu-device-clifwrap.cc] Error 4
CMakeFiles/Makefile2:1774: recipe for target 'kaldi/cudamatrix/CMakeFiles/cu_device_pyclif.dir/all' failed

NnetSAD and NnetLatticeFasterOnlineRecognizer changes MFCC computation

Hi @dogancan !
I am computing MFCC features with the kaldi.feat.mfcc.Mfcc.compute_features method to perform speech activity detection on a test wav file.

I first realized that creating a SAD object through kaldi.segmentation.NnetSAD before computing MFCCs changes the output of compute_features, even though I never use this object during the computation. Second, I noticed the same behavior when first constructing a kaldi.asr.NnetLatticeFasterOnlineRecognizer.

Here is my code:

#!/usr/bin/env python3.6

from __future__ import print_function

from kaldi.matrix import Vector
from kaldi.feat.mfcc import Mfcc, MfccOptions
from kaldi.util.options import ParseOptions

import os
import codecs
import wave
import struct

from kaldi.asr import NnetLatticeFasterOnlineRecognizer
from kaldi.segmentation import NnetSAD

# Construct recognizer
asr = NnetLatticeFasterOnlineRecognizer.from_files(
 	"decode/final.mdl", "decode/HCLG.fst", "decode/words.txt")


# Construct SAD
model = NnetSAD.read_model("SAD/exp/segmentation_1a/tdnn_stats_asr_sad_1a/final.raw")
post = NnetSAD.read_average_posteriors("SAD/exp/segmentation_1a/tdnn_stats_asr_sad_1a/post_output.vec")
transform = NnetSAD.make_sad_transform(post)
graph = NnetSAD.make_sad_graph()
sad = NnetSAD(model, transform, graph)


# MFCC config
mfcc_usage = """Extract MFCC features.
		   Usage:  example.py [opts...] <rspec> <wspec>
		"""

po_mfcc = ParseOptions(mfcc_usage)
po_mfcc.register_float("min-duration", 0.0,
				  "minimum segment duration")
mfcc_opts = MfccOptions()
mfcc_opts.frame_opts.samp_freq = 16000
mfcc_opts.use_energy = False
mfcc_opts.num_ceps = 40
mfcc_opts.mel_opts.num_bins = 40
mfcc_opts.mel_opts.high_freq = -400
mfcc_opts.mel_opts.low_freq = 20

mfcc_opts.register(po_mfcc)

mfcc = Mfcc(mfcc_opts)
sf = mfcc_opts.frame_opts.samp_freq



wf = wave.open('SAD/audio/test.short.wav','rb')
input_frames = wf.getnframes()
chunk = 1600
num_frames = 0
mfcc_pykaldi_file = codecs.open('mfcc_pykaldi.txt', 'w', encoding = 'utf-8')

# MFCC computation
while num_frames < input_frames:
	if (num_frames + chunk) < input_frames:
		nframes = chunk
	else:
		nframes = input_frames - num_frames

	data = wf.readframes(nframes)
	data_struct = struct.unpack_from('<%dh' % nframes, data)  # unpack nframes samples, not a hard-coded 1600 (the last chunk is shorter)
	
	num_frames += nframes
	data_vec = Vector(data_struct)
	f = mfcc.compute_features(data_vec, sf, 1.0)
	for i in range(len(f)):
		mfcc_pykaldi_file.write(str(f[i]))

As you can see, the asr and sad objects should be irrelevant here, but if you comment out

asr = NnetLatticeFasterOnlineRecognizer.from_files( "decode/final.mdl", "decode/HCLG.fst", "decode/words.txt")

or

sad = NnetSAD(model, transform, graph)

or both of them, you'll end up with different MFCC vectors.

In the table below are three MFCC vectors (40-dimensional), one per case above, computed on the first 25 ms of my test file. Columns: (1) no asr and no sad, (2) asr constructed, (3) sad constructed.
68.1835 68.1532 68.1784
-20.2310 -19.8629 -20.0390
-16.3968 -16.7663 -16.6460
-8.0430 -7.7936 -7.8147
-21.9416 -20.9534 -21.1954
5.0883 3.9446 3.9356
-4.5188 -3.4223 -4.6011
7.2100 6.8197 8.3315
26.5404 24.0173 24.6763
14.2824 16.6121 16.5472
-6.4256 -8.7899 -9.3176
-18.3546 -18.5420 -18.3494
4.1023 2.2490 2.4281
31.9681 33.4585 35.2552
17.3717 14.2230 12.8806
-25.9906 -23.7150 -23.2355
4.7358 3.2480 2.9003
9.3317 9.9033 11.1418
-4.1933 -4.5594 -5.8049
-3.6752 -3.0705 -2.1171
1.2225 1.0135 0.6585
-0.3392 -0.5516 -0.4541
0.0423 0.2120 0.1225
-0.0437 0.0185 -0.0353
2.3822 2.1908 2.4821
-0.6596 -0.6640 -0.5034
-3.5008 -3.5127 -3.7470
1.6946 1.6432 1.6080
-0.4791 -1.0352 -0.8828
-7.6661 -6.9776 -6.7119
-0.8695 -1.6926 -1.8929
-3.5768 -3.0292 -3.0005
1.0912 0.7228 -0.1199
4.4466 4.5943 5.3442
-1.5589 -1.6910 -2.5004
9.1514 9.6523 10.4633
1.5344 0.6773 0.3549
-2.8008 -2.7846 -3.0110
1.3539 1.7080 1.2172
-3.7193 -3.8221 -3.3396

Any idea about why this is happening?
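One plausible explanation worth testing (an assumption, not something established in the report): Kaldi adds random dither to each waveform frame by default (frame_opts.dither = 1.0), and those random draws come from a process-wide random state, so constructing the asr/sad objects first can advance that state and slightly perturb every subsequent feature. A minimal sketch of the test, reusing the mfcc_opts from the script above:

# Disable random dithering before constructing the Mfcc object.
mfcc_opts.frame_opts.dither = 0.0
mfcc = Mfcc(mfcc_opts)
# If the three cases now yield identical MFCC vectors, the asr/sad objects
# were only consuming random numbers, not changing the computation itself.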

Py_InitModule3 raises an error with Python 3

Building fails with Python 3.6.1 (conda): ‘Py_InitModule3’ was not declared in this scope. (Py_InitModule3 is part of the Python 2 C API only; Python 3 extension modules use PyModule_Create instead.)

PyObject* module = Py_InitModule3("kaldi_vector_numpy", Methods, "Vector/ndarray conversion wrapper");

In function PyObject* initkaldi_vector_numpy()

Complete log can be found in /home/victor/build.log

[Question] How to get a NumPy array from compute_kaldi_pitch?

I'm new to pykaldi.

I understand that the Matrix class has a numpy method, but I got the following error:

import numpy
from kaldi.matrix import Vector
from kaldi.feat.pitch import compute_and_process_kaldi_pitch
from kaldi.feat.pitch import ProcessPitchOptions
from kaldi.feat.pitch import PitchExtractionOptions

wave = Vector(numpy.random.randint(-128, 128, 1000))
x = compute_and_process_kaldi_pitch(PitchExtractionOptions(), ProcessPitchOptions(), wave)
print(x.numpy())

AttributeError: '_kaldi_matrix.Matrix' object has no attribute 'numpy'

error accessing weights of gmm

Hi, I'm trying to access the weights of a trained GMM model with this code:

from kaldi.gmm import FullGmm
from kaldi.util.io import xopen  # needed for xopen below
gmm = FullGmm()
with xopen('final.ubm') as f:
    gmm.read(f.stream(), f.binary)
gmm.weights()

but I get the error below. I am using the latest master Kaldi version. Can you please tell me how to access the GMM weights?

Error:

NameError                                 Traceback (most recent call last)
~/anaconda/envs/kaldi/lib/python3.5/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

~/anaconda/envs/kaldi/lib/python3.5/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    393                             if callable(meth):
    394                                 return meth(obj, self, cycle)
--> 395             return _default_pprint(obj, self, cycle)
    396         finally:
    397             self.end_group()

~/anaconda/envs/kaldi/lib/python3.5/site-packages/IPython/lib/pretty.py in _default_pprint(obj, p, cycle)
    508     if _safe_getattr(klass, '__repr__', None) is not object.__repr__:
    509         # A user-provided repr. Find newlines and replace them with p.break_()
--> 510         _repr_pprint(obj, p, cycle)
    511         return
    512     p.begin_group(1, '<')

~/anaconda/envs/kaldi/lib/python3.5/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    699     """A pprint that just redirects to the normal repr function."""
    700     # Find newlines and replace them with p.break_()
--> 701     output = repr(obj)
    702     for idx,output_line in enumerate(output.splitlines()):
    703         if idx:

~/anaconda/envs/kaldi/lib/python3.5/site-packages/pykaldi-0.0.9-py3.5-linux-x86_64.egg/kaldi/matrix/__init__.py in __repr__(self)
    553
    554     def __repr__(self):
--> 555         return str(self)
    556
    557     def __str__(self):

~/anaconda/envs/kaldi/lib/python3.5/site-packages/pykaldi-0.0.9-py3.5-linux-x86_64.egg/kaldi/matrix/__init__.py in __str__(self)
    562         # https://github.com/pytorch/pytorch/blob/master/torch/tensor.py
    563         if sys.version_info > (3,):
--> 564             return _str._vector_str(self)
    565         else:
    566             if hasattr(sys.stdout, 'encoding'):

NameError: name '_str' is not defined

TypeError: Resize() argument resize_type is not valid for ::kaldi::MatrixResizeType

MWE:

from kaldi.matrix import *
m = Matrix(5,5)

Traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-2923c99f9618> in <module>()
----> 1 m = Matrix(5,5)

~/Workspace/pykaldi/kaldi/matrix/__init__.py in __init__(self, num_rows, num_cols, src, row_start, col_start)
    328                         raise TypeError("num_rows and num_cols should both be "
    329                                         "positive or they should both be 0.")
--> 330                 self.resize_(num_rows, num_cols, MatrixResizeType.UNDEFINED)
    331         else:
    332             if isinstance(src, kaldi_matrix.MatrixBase):

~/Workspace/pykaldi/kaldi/matrix/__init__.py in resize_(self, num_rows, num_cols, resize_type, stride_type)
    395         """Sets matrix to the specified size."""
    396         if self.own_data:
--> 397             self.Resize(num_rows, num_cols, resize_type, stride_type)
    398         else:
    399             raise ValueError("resize_ method cannot be called on "

TypeError: Resize() argument resize_type is not valid for ::kaldi::MatrixResizeType (MatrixResizeType given): expecting enum MatrixResizeType, got MatrixResizeType

Using:

Python 3.6.1 :: Continuum Analytics, Inc.

Also,

In [2]: type(matrix_common.MatrixResizeType)
Out[2]: enum.EnumMeta

In [6]: type(matrix_common.MatrixResizeType.SET_ZERO)
Out[6]: <enum 'MatrixResizeType'>
