mycroftai / mimic1

Mycroft's TTS engine, based on CMU's Flite (Festival Lite)

Home Page: https://mimic.mycroft.ai

License: Other

Makefile 0.01% C 99.91% C++ 0.01% Shell 0.03% M4 0.01% Python 0.01% Lex 0.01% Perl 0.01% Scheme 0.04%

mimic1's Introduction

Mimic - The Mycroft TTS Engine

Mimic is a fast, lightweight text-to-speech engine developed by Mycroft A.I. and VocaliD, based on Carnegie Mellon University’s Flite (Festival-Lite) software. Mimic takes in text and reads it out loud in a high-quality voice.

Official project site: mimic.mycroft.ai

Supported platforms

  • Linux (ARM & Intel architectures)
  • Mac OS X
  • Windows

Untested

  • Android

Future

  • iOS

Requirements

This is the list of requirements. Below are the commands needed on the most popular distributions and supported operating systems.

  • A good C compiler:
    • Linux or Mac OSX: Recommended: gcc or clang
    • Windows: Recommended: GCC under Cygwin or mingw32
  • GNU make, automake and libtool
  • pkg-config
  • Optionally, PCRE2 library and headers (they are compiled otherwise)
  • An audio engine:
    • Linux: ALSA/PortAudio/PulseAudio (Recommended: ALSA)
    • Mac OSX: PortAudio
    • Windows: PortAudio

Linux

On Debian/Ubuntu
$ sudo apt-get install gcc make pkg-config automake libtool libasound2-dev
On Fedora
$ sudo dnf install gcc make pkgconfig automake libtool alsa-lib-devel
On Arch
$ sudo pacman -S --needed gcc make pkg-config automake libtool alsa-lib

Mac OSX

  • Install Brew

    $ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    
  • Install pkg-config, automake, libtool, pcre2 and PortAudio

    $ brew install pkg-config automake libtool portaudio pcre2
    

Windows

Cross compiling:

The fastest and most straightforward way to build mimic for Windows is by cross-compilation from Linux. This requires some additional packages to be installed.

On Ubuntu 18.04 (bionic):

sudo apt-get install gcc make pkg-config automake libtool libpcre2-dev wine-stable binutils-mingw-w64-i686 mingw-w64-i686-dev gcc-mingw-w64-i686

On Ubuntu 16.04 (xenial):

sudo apt-get install gcc make pkg-config automake libtool libpcre2-dev wine binutils-mingw-w64-i686 mingw-w64-i686-dev gcc-mingw-w64-i686

On Ubuntu 14.04 (trusty):

sudo apt-get install gcc make pkg-config automake libtool mingw32 mingw32-runtime wine

Native Windows building

  • Audio device and audio libraries are optional, as mimic can write its output to a waveform file.
  • Some of the source files are so large that some C compilers may choke on them, so gcc is recommended.
  • Visual C++ 6.0 is known to fail on the large diphone database files.
  • The build process is MUCH slower on Windows.

Build

On a native build (not cross-compilation)

  • Clone the repository

    $ git clone https://github.com/MycroftAI/mimic1.git
    
  • Navigate to mimic directory

    $ cd mimic1
    
  • Build and install missing dependencies (pcre2)

    $ ./dependencies.sh --prefix="/usr/local"
    
  • Generate mimic build scripts

    $ ./autogen.sh
    
  • Configure.

    $ ./configure --prefix="/usr/local"
    
  • Build

    $ make
    
  • Check

    $ make check
    

Cross compilation:

  • Run the windows build script:
./run_testsuite.sh winbuild
  • Test it. The install directory will contain the bin/mimic.exe file:
wine ./mimic.exe -t "hello world" 
  • Distribute it

You can distribute the compiled mimic by adding to a zip file everything in the install/winbuild/bin directory.

Usage

By default mimic will play the text using an audio device. Alternatively it can output the wave file in RIFF format (often called .wav).

Read text

  • To an audio device

    $ ./mimic -t TEXT
    

    Example

    $ ./mimic -t "Hello. Doctor. Name. Continue. Yesterday. Tomorrow."
    
  • To an audio file

    $ ./mimic -t TEXT -o WAVEFILE
    

    Example

    $ ./mimic -t "Hello. Doctor. Name. Continue. Yesterday. Tomorrow." -o hello.wav
    

Read text from file

  • To an audio device

    $ ./mimic -f TEXTFILE
    

    Example

    $ ./mimic -f doc/alice
    
  • To an audio file

    $ ./mimic -f TEXTFILE -o WAVEFILE
    

    Example

    $ ./mimic -f doc/alice -o hello.wav
    

Change voice

  • List available internal voices

    $ ./mimic -lv
    
  • Use an internal voice

    $ ./mimic -t TEXT -voice VOICE
    

    Example

    $ ./mimic -t "Hello" -voice slt
    
  • Use an external voice file

    $ ./mimic -t TEXT -voice VOICEFILE
    

    Example

    $ ./mimic -t "Hello" -voice voices/cmu_us_slt.flitevox
    
  • Use an external voice URL

    $ ./mimic -t TEXT -voice VOICEURL
    

    Example

    $ ./mimic -t "Hello" -voice http://www.festvox.org/flite/packed/flite-2.0/voices/cmu_us_ksp.flitevox
    
Notes
  • mimic offers several voices that can use different speech modelling techniques (diphone, clustergen, hts). Voices can differ a lot on size, naturalness and intelligibility.

    • Diphone voices are less computationally expensive and quite intelligible, but they lack naturalness (they sound more robotic). e.g. ./mimic -t "Hello world" -voice kal16

    • clustergen voices can sound more natural and intelligible at the expense of size and computational requirements. e.g. ./mimic -t "Hello world" -voice slt, ./mimic -t "Hello world" -voice ap

    • hts voices may sound a bit more synthetic than clustergen voices, but they are much smaller. e.g. ./mimic -t "Hello world" -voice slt_hts

  • Voices can be compiled (built-in) into mimic or loaded from a .flitevox file. The only exception is hts voices, which combine a compiled function with a .htsvoice voice data file. When an hts voice is loaded, mimic looks for the .htsvoice file in the current working directory, the "voices" subdirectory and the $prefix/share/mimic/voices directory, if it exists.

  • Voice names are identified as loadable files if the name includes a "/" (slash); otherwise they are treated as internal compiled-in voices.

  • The voices/ directory contains several flitevox voices. Existing Flite voices can be found here: http://www.festvox.org/flite/packed/flite-2.0/voices/

  • A voice referenced via a URL will be downloaded on the fly.
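
The slash rule in the notes above can be sketched in C as follows. This is only an illustration: voice_arg_is_file is a hypothetical helper name, not a function taken from the mimic sources, and the real lookup code may differ in its details.

```c
#include <string.h>

/* Hypothetical helper (not part of mimic's API): returns 1 when a -voice
 * argument would be treated as a loadable file or URL because it contains
 * a '/', and 0 when it would be treated as a compiled-in voice name. */
int voice_arg_is_file(const char *name)
{
    if (name == NULL)
    {
        return 0;
    }
    return strchr(name, '/') != NULL;
}
```

Under this rule, "slt" names a compiled-in voice, while "voices/cmu_us_slt.flitevox" and any http:// URL are treated as external voices.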

Other options

Voices accept additional debug options, specified as --setf feature=value on the command line (string-valued features use --sets instead, as in the first example below). Wrong values can prevent mimic from working. Some speech modelling techniques may not support changing these features, so some voices may not honor these options. Here are some examples:

  • Use simple concatenation of diphones without prosodic modification

    ./mimic --sets join_type=simple_join doc/intro.txt
    
  • Print sentences as they are said

    ./mimic -pw doc/alice
    
  • Make it speak slower

    ./mimic --setf duration_stretch=1.5 doc/alice
    
  • Make it speak faster

    ./mimic --setf duration_stretch=0.8 doc/alice
    
  • Make it speak higher

    ./mimic --setf int_f0_target_mean=145 doc/alice
    

See lang/cmu_us_kal/cmu_us_kal.c for some other features and values.

Say the hour

  • The talking clock requires a single argument HH:MM. Under Unix you can call it with:
    ./mimic_time `date +%H:%M` 
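
    Where the date command is not available, the HH:MM argument can be produced from C instead. This is a minimal sketch that assumes only the standard strftime function; format_hhmm is a hypothetical helper, not part of mimic.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: writes the time in `when` as "HH:MM" into `buf`.
 * Returns 1 on success, 0 if the arguments are invalid or `buf` is too
 * small to hold the five characters plus the terminating NUL. */
int format_hhmm(const struct tm *when, char *buf, size_t size)
{
    if (when == NULL || buf == NULL)
    {
        return 0;
    }
    return strftime(buf, size, "%H:%M", when) == 5;
}
```

    A caller would typically pass the result of localtime() and then hand the buffer to mimic_time as its single argument.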
    

Benchmarking

  • For benchmarking, "none" can be used to discard the generated audio and give a summary of the speed:
    ./mimic -f doc/alice none
    

How to Contribute

For those who wish to help contribute to the development of mimic there are a few things to keep in mind.

Git branching structure

We will be using a branching structure similar to the one described in this article.

In short
  • master branch is for stable releases,

  • development branch is where development work is done between releases,

  • Any feature branch should branch off from development, and when complete will be merged back into development.

  • Once enough features are added or a new release is complete those changes in development will be merged into master, then work can continue on development for the next release.

Coding Style Requirements

To keep the code in mimic coherent, a simple coding style guide is used. Note that the current codebase as a whole does not meet some of these guidelines; this is a result of its origin in the flite codebase. As different parts of the codebase are touched, the hope is that these inconsistencies will diminish over time.

  • Indentation

    Each level of indentation is 4 spaces.

  • Braces

    Braces always come on the line following the statement.

    Example

    void cool_function(void)
    {
        int cool;
        for (cool = 0; cool < COOL_LIMIT; cool++)
        {
            [...]
            if (cool == AWESOME)
            {
                [...]
            }
        }
    }
  • If-statements

    Always use curly braces.

    Example

    if(condition)
    {                             /*always use curly braces even if the 'if' only has one statement*/
        DoJustThisOneThing();        
    }
    
    if(argv[i][2] == 'h' &&      /*split 'if' conditions to multiple lines if the conditions are long */
       argv[i][3] == 'e' &&      /*or if it makes things more readable. */
       argv[i][4] == 'l' && 
       argv[i][5] == 'p')
    {
          /*example taken from args parsing code*/
          /* code */
    }
    else if(condition)
    {
          /* code */
    }
    else
    {
        /* code */
    }
  • Switch-statements

    Always keep the break statement last in the case, after any code blocks.

    Example

    switch(state)
    {
        case 1:
        {               /* even if the case only has one line, use curly braces (similar reasoning as with if's) */ 
            doA(1);
        } break;
                            /* separate cases with a line */
        case 2:             /* unless it falls into the next one */
        case 3:
        {
            DoThisFirst();
        }                   /* no break, this one also falls through */
        case 4:
        {                   /* notice that curly braces line up with 'case' on line above */
            int b = 2;
            doA(b);
        } break;        /* putting 'break' on this line saves some room and makes it look a little nicer */
    
        case 5:
        {
            /* more code */
        } break;
    
        default:        /* It is nice to always have a default case, even if it does nothing */
        {
            InvalidDefaultCase(); /* or whatever, it depends on what you are trying to do. */
        }
    }
  • Line length

    There's no hard limit but if possible keep lines shorter than 80 characters.

Vimrc

For those of you who use vim, add this to your vimrc to ensure proper indenting.

"####Indentation settings
:filetype plugin indent on
" show existing tab with 4 spaces width
:set tabstop=4
" when indenting with '>', use 4 spaces width
:set shiftwidth=4
" On pressing tab, insert 4 spaces
:set expandtab
" fix indentation problem with types above function name
:set cinoptions+=t0
" fix indentation of { after case
:set cinoptions+==0
" fix indentation of multiline if
:set cinoptions+=(0   "closing ) to let vimrc highlighting work after this line

"see http://vimdoc.sourceforge.net/htmldoc/indent.html#cinoptions-values
"for more indent options
Indent command (currently does not indent switch/cases properly)
indent [FILE] -npcs -i4 -bl -Tcst_wave -Tcst_wave_header -Tcst_rateconv \
      -Tcst_voice -Tcst_item -Tcst_features -Tcst_val -Tcst_va -Tcst_viterbi \
      -Tcst_utterance -Tcst_vit_cand_f_t -Tcst_vit_path_f_t -Tcst_vit_path \
      -Tcst_vit_point -Tcst_string -Tcst_lexicon -Tcst_relation \
      -Tcst_voice_struct -Tcst_track -Tcst_viterbi_struct -Tcst_vit_cand \
      -Tcst_tokenstream -Tcst_tokenstream_struct -Tcst_synth_module \
      -Tcst_sts_list -Tcst_lpcres -Tcst_ss -Tcst_regex -Tcst_regstate \
      -Twchar_t -Tcst_phoneset -Tcst_lts_rewrites -Tlexicon_struct \
      -Tcst_filemap -Tcst_lts_rules -Tcst_clunit_db -Tcst_cg_db \
      -Tcst_audio_streaming_info -Tcst_audio_streaming_info_struct -Tcst_cart \
      -Tcst_audiodev -TVocoderSetup -npsl -brs -bli0 -nut

Acknowledgements

see ACKNOWLEDGEMENTS

License

See COPYING

mimic1's People

Contributors

aatchison, anselm94, cooljimy84, dimstar77, earboxer, ffontaine, forslund, ivuk, kathyreid, krisgesling, longboolean, m-toman, marctreysonos, noamdev, puretryout, rhdunn, shifubear, simonmicro, trasz, vitaly-zdanevich, waffle-iron, zeehio

mimic1's Issues

Continuous integration

I would like to add to mimic continuous integration infrastructure.

I would like on every commit to:

  • Know if all the tests pass
  • Know the percentage of code covered by tests
  • Have a static code analysis report looking for potential issues
  • Have a report of other checks (code style...)

To achieve that I plan to:

  • Use Travis CI (To build the project)
  • Use coverity for static code analysis
  • Use code climate for the rest

I have some experience with these tools but before implementing this I would like to know:

  • If any of you know of better or other tools
  • If MycroftAI as an organization is happy to have that setup
  • Your opinion, comments and feedback

new_cst_uregex is locale dependent

PASS: unittests/hrg_test
PASS: unittests/regex_test
FAIL: unittests/string_test
PASS: unittests/token_test
PASS: unittests/voice_select
PASS: unittests/wave_test
PASS: unittests/lex_test
PASS: unittests/lts_test
PASS: unittests/nums_test
============================================================================
Testsuite summary for mimic 1.2.0.2
============================================================================
# TOTAL: 9
# PASS:  8
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0
============================================================================
See ./test-suite.log
Please report to https://github.com/MycroftAI/mimic/issues
============================================================================
make[3]: *** [Makefile:4225: test-suite.log] Error 1
make[3]: Leaving directory '/tmp/yaourt-tmp-stardiviner/aur-mimic/src/mimic-1.2.0.2'
make[2]: *** [Makefile:4333: check-TESTS] Error 2
make[2]: Leaving directory '/tmp/yaourt-tmp-stardiviner/aur-mimic/src/mimic-1.2.0.2'
make[1]: *** [Makefile:4618: check-am] Error 2
make[1]: Leaving directory '/tmp/yaourt-tmp-stardiviner/aur-mimic/src/mimic-1.2.0.2'
make: *** [Makefile:4110: check-recursive] Error 1
make: Target 'check' not remade because of errors.
==> ERROR: A failure occurred in check().
    Aborting...
==> ERROR: Makepkg was unable to build mimic.

It seems one test does not pass.

Name clash: There is an unrelated shared library named "libmimic.so"

From synaptic, the package manager:

libmimic is an open source video decoding library for decoding Mimic V2.x-
encoded content (fourCC: ML20), which is the encoding used by MSN Messenger
for webcam conversations.

The codecs metapackage mint-meta-codecs on Linux Mint installs it, as a dependency of some of the gstreamer multimedia plugins.

I suggest renaming mimic libraries from libmimic to libttsmimic, any other ideas?

(I was going crazy until I found out I was not linking to the right library!)

OSS detection

After setting up a virtual machine without ALSA or PortAudio, OSS is detected as the audio driver even though no OSS support actually exists. When running mimic, the error message "oss_audio: failed to open audio device /dev/dsp" is shown.

On modern Linux boxes OSS can still be used: there is an ALSA compatibility layer, and Arch Linux has a guide for installing OSS as a native driver. If we intend to keep the OSS support (for Linux), some more advanced detection than

AC_CHECK_HEADER(sys/soundcard.h,
              [AUDIODRIVER="oss"
               AUDIODEFS=-DCST_AUDIO_LINUX])

is needed.

One possibility is to check if /dev/dsp actually exists. It is created by the alsa-oss compatibility layer and is the hard-coded sound device used by mimic's OSS layer. It does feel a bit dirty...
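
As a rough illustration of that idea, an existence check could look like the sketch below. Note the assumptions: the issue is about configure-time detection, whereas this is a plain C runtime check built on POSIX access(), and audio_device_exists is a hypothetical name rather than an existing mimic function.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the device node at `path` exists,
 * 0 otherwise. It does not check that the device can be opened. */
int audio_device_exists(const char *path)
{
    if (path == NULL)
    {
        return 0;
    }
    return access(path, F_OK) == 0;
}
```

Calling audio_device_exists("/dev/dsp") before selecting the OSS backend would avoid the failed open described above, although existence alone does not guarantee that the device is usable.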

Consider switching ICU to pcre

pcre2 has a smaller scope (just regular expressions) and supports UTF-8. It is easier and faster to build than ICU and as it is a C library we would not need a C++ compiler, for instance on cross-compilation.

Not only that, but also given its size we can consider embedding a copy of the pcre library in mimic and, in case the library is not found, compile it.

pcre is used by nginx, julia and R among other projects, so it is quite popular.

This is on my to-do list after I finish the thesis, feel free to work on it if you want

Source code (recreate models from scratch)

Some people, institutions and licenses define source code as the preferred format for modification.

In mimic we have some data that is shipped as C code. We distribute this C code, but it is auto generated and it is not the preferred source for modification. As an example, the lexicon (#15), the letter to sound rules or the voice models.

Far from being a simple licensing issue, this impacts the repository size and more importantly our ability to understand where the code comes from and how to fix issues fast. On the other hand having data preprocessed in the repository reduces the build dependencies (festival speech synthesis would be required) and the build time (voice models are already created).

For a fast build time and quick testing with ability to fix bugs we could include the cmulex in source format AND its autogenerated equivalent C source code (both the lexicon and letter to sound rules created from it). But this would increase the repository size a lot if a correction is made (we already are at about 500 MB!).

Another option could be to keep all raw data under version control using git-lfs (Large file support) [1] but I don't have experience in that so I don't know how easy it is to set up.

Ideas and personal preferences are very welcome

[1] https://git-lfs.github.com/

"make test" broken

make -k test fails with this output:

Makefile:79: warning: overriding recipe for target 'multi_thread'
Makefile:76: warning: ignoring old recipe for target 'multi_thread'
make[1]: *** No rule to make target 'test'.
Makefile:130: recipe for target 'test' failed
make: *** [test] Error 2

Voice list management

When not compiling a voice with the system but using the load-functionality (e.g. using cst_cg_load_voice), the global list mimic_voice_list does not work intuitively.

Voice loading can be seen in mimic_main.c:

    if (mimic_voice_list == NULL)
        mimic_set_voice_list(voicedir);
    if (desired_voice == 0)
        desired_voice = mimic_voice_select(NULL);

mimic_voice_select searches the voice list and if it is not found, it tries to load the voice from file - but not if the list is NULL/empty in the first place. In that case mimic_voice_select returns NULL.
Therefore in the example tool mimic_main.c, the global mimic_voice_list is directly accessed.

A possible workaround to load the voice is to ignore the lists and use:

   mimic_init();
   mimic_add_lang("eng",usenglish_init,cmu_lex_init);
   ...
   voice = mimic_voice_load(path);

Probably mimic_voice_select could be modified so that it also works with empty lists.
Also, there are no functions to remove a voice from a list and delete it.
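
The suggested behaviour can be sketched with stand-in types. Everything prefixed demo_ below is hypothetical and only mirrors the shape of the problem; it is not mimic's actual data structure or API. The point is that a miss in the list, including an empty list, falls through to the loader instead of returning NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for mimic's voice and voice-list types. */
typedef struct demo_voice
{
    const char *name;
} demo_voice;

typedef struct demo_voice_node
{
    demo_voice *voice;
    struct demo_voice_node *next;
} demo_voice_node;

/* Callback that tries to load a voice from a file or URL. */
typedef demo_voice *(*demo_voice_loader)(const char *name);

/* Stub loader for illustration: pretends that any name can be loaded. */
demo_voice *demo_stub_load(const char *name)
{
    static demo_voice loaded;
    loaded.name = name;
    return &loaded;
}

/* Search the list first; if the voice is not there (including when the
 * list is empty), fall back to the loader instead of returning NULL. */
demo_voice *demo_voice_select(const demo_voice_node *list, const char *name,
                              demo_voice_loader load)
{
    const demo_voice_node *node;

    if (name == NULL)
    {
        /* No name requested: return the first listed voice, if any. */
        return list != NULL ? list->voice : NULL;
    }
    for (node = list; node != NULL; node = node->next)
    {
        if (strcmp(node->voice->name, name) == 0)
        {
            return node->voice;
        }
    }
    return load != NULL ? load(name) : NULL;
}
```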

RFE: add speech-dispatcher plugin

speech-dispatcher is a high-level device independent layer for TTS and is the universal interface that is used by GNOME and KDE for desktop speech integration.

https://devel.freebsoft.org/speechd

It supports a plugin interface, so I believe a mimic speechd plugin would be a good initial Mycroft/mimic integration for general desktops.

Build fails with gcc 6.2.0 on ubuntu 16.10

 make
Making all in .
make[1]: Entering directory '/mimic'
  CC       src/audio/libttsmimic_la-au_alsa.lo
In file included from /usr/include/stdlib.h:24:0,
                 from src/audio/au_alsa.c:48:
/usr/include/features.h:148:3: error: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Werror=cpp]
 # warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
   ^~~~~~~
cc1: all warnings being treated as errors
Makefile:2985: recipe for target 'src/audio/libttsmimic_la-au_alsa.lo' failed
make[1]: *** [src/audio/libttsmimic_la-au_alsa.lo] Error 1
make[1]: Leaving directory 'mimic'
Makefile:4111: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1

Why does build take so long on Linux?

I am in the build process (make) and it seems to take too long. It is stuck on a line that reads:
CC lang/vid_gb_ap/libttsmimic_lang_vid_gb_ap_la-vid_gb_ap_cg_01_mcep_trees.lo

What is wrong? Is it an error? Should I wait?
I am running it on GalliumOS on an Acer Chromebook 14.

Request: Test on a raspberry pi unit needed (with faster compilation)

Hi,

One of the most common complaints about mimic is the long compile time it requires. The main cause is that the mycroft voice is compiled into the mimic binary, instead of being loaded at runtime from a file.

We don't load the voice from a file at runtime because it is too slow. However, if we were able to improve the voice loading functions then we could stop embedding it at compilation time. So far, @forslund has made some improvements in #85, but there is still room for improvement.

I need someone to test a command that loads the mycroft voice from file. Then that person needs to compile mimic with a patch that may improve voice loading performance slightly and then check if there is a significant improvement or not.

  1. Download and compile the development version

We will disable all the embedded voices (with --disable-voices-all) to make compilation much faster:

git clone https://github.com/MycroftAI/mimic.git
cd mimic
git checkout development
./configure --disable-voices-all
make
  2. Test the timings (copy the output of this command)
time ./mimic -voice voices/mycroft_voice_4.0.flitevox  -t "" a.wav
  3. Clean up:
make distclean
  4. Try the patched version:
git remote add zeehio https://github.com/zeehio/mimic.git
git fetch zeehio
git checkout zeehio/cg_maybe_faster_load
./configure --disable-voices-all
make
  5. Test the patched version (copy the output of this command)
time ./mimic -voice voices/mycroft_voice_4.0.flitevox  -t "" a.wav

Thanks to anyone who can help on this

HTS and vocoder alternatives

Introduction (to be expanded)

  • Waveform synthesis techniques: there are many ways to synthesize a waveform: diphone (e.g. cmu_us_kal16 in mimic), cluster units, clustergen (e.g. cmu_us_slt, vid_us_ap), HTS... Each technique has pros and cons.
  • hts voices.
    • hts advantages: small footprint, good quality.
    • hts issues: a bit more robotic than unit-selection-based engines; this can be improved with a better vocoder. It is not easy to find free vocoder implementations.
    • There is a version of Flite with hts engine support here
  • HTS support would be good for mimic due to the small voice footprint.
  • Finding and integrating a good vocoder in mimic could make a difference in voice quality while keeping a small footprint.
  • Some discussion was going on here #61.

(I will copy a summary of the discussion here)

Warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]

Some hours have passed since I ran ./dev_setup.sh from 'mycroft-core', and since it reached the part where it works on 'mimic' I have been getting tons of these warnings. On my Raspberry Pi 3, which is a single-board computer with 4GB of swap, it finished compiling in about 2 hours or maybe less. Now on my AMD64 Mini Barebone PC (Giada A51B: AMD Dual-Core T56N, 4GB RAM, 128GB-SSD) it is taking much longer, and I only see these warnings:

Warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]

My RPi3 has Raspbian with PIXEL (the latest and updated)
and my other PC has Linux Mint 18, updated with apt-get upgrade, etc...

Is that normal?
Thanks

Add Contributing section to README.md

A section should be added to the README.md detailing the development model. Should at least include

  • Always create pull requests against the Development branch
  • Coding style requirements

In addition commit log message style might be nice but may be pushing the bureaucracy one step too far =)

(Probably) frequent name clashes

This is not really an issue with mimic but probably still worth considering:
I found that the function names in src/audio/audio.c frequently clash with other libraries (e.g. Acapela TTS). Perhaps prefixing is an option here?

Also, it seems that when defining CST_AUDIO_NONE (and e.g. using another audio framework), only audio_stream_chunk in https://github.com/MycroftAI/mimic/blob/master/src/audio/au_streaming.c requires audio.c - and it seems to be an example function only.
So probably surrounding audio_stream_chunk with #ifndef CST_AUDIO_NONE would make sense? (or even move that function to an example file).

Fedora 25 mimic compile errors

Problem installing from git on Fedora 25, build_host_setup_fedora.sh completes without errors. ./dev_setup.sh fails when compiling mimic

checking for gcc... gcc
checking whether the C compiler works... no
configure: error: in `/home/james/mycroft-core/mimic':
configure: error: C compiler cannot create executables

config.log says error 77

-----------

confdefs.h.

-----------

/* confdefs.h */
#define PACKAGE_NAME "mimic"
#define PACKAGE_TARNAME "mimic"
#define PACKAGE_VERSION "1.2.0.1"
#define PACKAGE_STRING "mimic 1.2.0.1"
#define PACKAGE_BUGREPORT "https://github.com/MycroftAI/mimic/issues"
#define PACKAGE_URL ""
#define PACKAGE "mimic"
#define VERSION "1.2.0.1"

configure: exit 77

Any help would be appreciated :-)

[Build error] Libicu 58 not found

Hi! I was trying to install mycroft-core on my ubuntu 17.04 desktop (64-bit), and an error prompted saying:

/usr/bin/ld: warning: libicui18n.so.58, needed by ./.libs/libttsmimic.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libicuuc.so.58, needed by ./.libs/libttsmimic.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libicudata.so.58, needed by ./.libs/libttsmimic.so, not found (try using -rpath or -rpath-link)

I couldn't find the mentioned package on the package manager.
Is there anything I can do to make this work?
Thx!

Unable to create audio on Raspberry Pi3

I've tried quite a few times to create voices with Mimic on my rPi3.

I'm able to run ./bin/mimic -t "Hello. Doctor. Name. Continue. Yesterday. Tomorrow." -o hello.wav
but when I run aplay hello.wav I hear no audio, while any other wav file plays just fine.

I installed this with mycroft-core using ./dev_setup.sh

iOS Port

Currently the engine works on Linux, Android, Mac OS X, and Windows. If we can make it work on iOS in relatively short order that would be a big win.

Mimic doesn't read whole text

On the command line I tried running ./mimic -t with a string about 1800 characters long, but Mimic didn't read the whole thing. It got about one third of the way through the text and then just stopped mid-word at about character 460.
All characters were either letters, numbers or punctuation, combined into proper grammatical sentences.
I tried with other long texts and got same result.

I use updated Fedora 23 x86_64 and BASH as command line interpreter.

Sound output on Mac OS X

The engine works on Mac OS X but cannot generate sound output (only file output is possible). Many users would be grateful if also the Mac could speak.

build issues with aclocal

Getting home from vacation trying to build the mimic development branch I got the following error

CDPATH="${ZSH_VERSION+.}:" && cd .. && /bin/bash /home/ake/projects/c/mimic/config/missing aclocal-1.14 -I m4
/home/ake/projects/c/mimic/config/missing: line 81: aclocal-1.14: command not found
WARNING: 'aclocal-1.14' is missing on your system.
         You should only need it if you modified 'acinclude.m4' or
         'configure.ac' or m4 files included by 'configure.ac'.
         The 'aclocal' program is part of the GNU Automake package:
         <http://www.gnu.org/software/automake>
         It also requires GNU Autoconf, GNU m4 and Perl in order to run:
         <http://www.gnu.org/software/autoconf>
         <http://www.gnu.org/software/m4/>
         <http://www.perl.org/>
Makefile:1752: recipe for target '../aclocal.m4' failed
make: *** [../aclocal.m4] Error 127

This is because aclocal is version 1.15 on my Ubuntu system and not 1.14. On my Debian VM I get the same issue with a fresh checkout of the project unless I install automake.

The Makefile seems to include rules for building aclocal.m4, and its dependencies may be considered updated on checkout, which triggers the error in question.

I see two solutions:
1: remove the files generated by autogen.sh and the procedure for building shall include running ./autogen.sh (I've prepared a branch for this named autogen)

2: The rules should exclude aclocal.m4 (This I haven't quite figured out how to accomplish)

Create audio test

If you want to work on this just say it. If no one steps up I will do it whenever I have time (I would rather focus on multilingual support).

Mimic should have an audio test. I suggest using a pure tone so we have a "unit test". If creating the wave is an issue for you, you can synthesize text and adjust timings accordingly.

  • Print on the screen the compiled audio output module (alsa or pulseaudio or portaudio...)
  • Test that there is sound in the output
    • Create a cst_wave made of a pure frequency (sine at 440 Hz for 1 second)
    • Play the cst_wave
  • Test that it takes at least 40 seconds to run (audio not interrupted before the end)
    • Create a cst_wave made of the concatenation of:
      • Sine at 440 Hz for 1 second
      • Sine at 880 Hz for 1 second
      • Sine at 1760 Hz for 36 seconds
      • Sine at 880 Hz for 1 second
      • Sine at 440 Hz for 1 second
    • Play the cst_wave, measure the time it takes to be played.
  • Test audio_write does not block: Assert that there is no audible pause between the two waves:
    • Wave 1:
      • Sine at 440 Hz for 3 seconds
      • Sine at 880 Hz for 3 seconds
      • Sine at 1760 Hz for 6 seconds
    • Play wave 1
    • sleep 0.2 seconds (simulates that generating Wave 2 takes time, but less than the duration of Wave 1; this sleep runs while audio is still playing)
    • Wave 2:
      • Sine at 880 Hz for 3 seconds
      • Sine at 440 Hz for 3 seconds
    • Play wave 2
  • Test that audio can be interrupted. (Measure that time running < 12 seconds)
    • Create a cst_wave made of the concatenation of:
      • Sine at 440 Hz for 4 seconds
      • Sine at 880 Hz for 4 seconds
      • Sine at 1760 Hz for 4 seconds
    • Play the cst_wave, measure the time it takes to be played.
    • sleep 2 seconds
    • Interrupt audio

Wiki Documentation

First and foremost, I think it is important to have solid documentation around this project. I encourage those contributing to try and put as much as possible in the GitHub wiki in this repo.

Some things I'd like to have in the Wiki:

  1. How to easily install.

  2. How to use the software.

  3. How to adjust things like speed, cadence, and tone.

I think these are good questions to get us started. Once we develop an understanding of the software and document it, we can move on from there.

Configure step doesn't check for libicu

When building mimic on a system without libicu, the configuration step finishes without issues, but the make step fails when trying to build objects that require libicu.

Tests fail

I see the following test failures.

Test project...                 [ FAILED ]
  lex_test_main.c:69: Check strcmp(val_string(val_car(syl)), tok) == 0... failed
  lex_test_main.c:69: Check strcmp(val_string(val_car(syl)), tok) == 0... failed
  lex_test_main.c:69: Check strcmp(val_string(val_car(syl)), tok) == 0... failed
Test atypical...                [ FAILED ]
  lex_test_main.c:69: Check strcmp(val_string(val_car(syl)), tok) == 0... failed

Git revision: 0513650
Configuration: ./configure --prefix=/usr --with-audio=alsa
Test command: make -k test
Host system: x86_64 Arch Linux

Doesn't work with Jabra 510 usb speaker

Hi all,

Mimic through ALSA currently forces the audio output to be a single channel. I believe this device doesn't have support for single channel audio (yeah... odd). I get the following error:

$ ./bin/mimic -t "hello"
audio_open_alsa: failed to set number of channels to 1. Invalid argument.

Verbose mode doesn't seem to give any further information.

This is some information on the Jabra speaker according to aplay (happy to provide any other information):

$ aplay -L
null
    Discard all samples (playback) or generate zero samples (capture)
pulse
    PulseAudio Sound Server
sysdefault:CARD=USB
    Jabra SPEAK 510 USB, USB Audio
    Default Audio Device
front:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    Front speakers
surround21:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    2.1 Surround output to Front and Subwoofer speakers
surround40:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    4.0 Surround output to Front and Rear speakers
surround41:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    4.1 Surround output to Front, Rear and Subwoofer speakers
surround50:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    5.0 Surround output to Front, Center and Rear speakers
surround51:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    5.1 Surround output to Front, Center, Rear and Subwoofer speakers
surround71:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    7.1 Surround output to Front, Center, Side, Rear and Woofer speakers
iec958:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    IEC958 (S/PDIF) Digital Audio Output
dmix:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    Direct sample mixing device
dsnoop:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    Direct sample snooping device
hw:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    Direct hardware device without any conversions
plughw:CARD=USB,DEV=0
    Jabra SPEAK 510 USB, USB Audio
    Hardware device with all software conversions
$ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 1: USB [Jabra SPEAK 510 USB], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
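
If the device really does reject mono, one possible workaround (an assumption on my side, not something Mimic is confirmed to do) would be to request two channels and duplicate each mono sample into an interleaved left/right frame before writing. The mono_to_stereo helper below is a hypothetical name and only sketches the sample conversion; it does not touch ALSA.

```c
#include <assert.h>
#include <stddef.h>

/* Duplicate each mono sample into an interleaved left/right frame.
 * `stereo` must have room for 2 * num_frames samples. */
void mono_to_stereo(const short *mono, short *stereo, size_t num_frames)
{
    size_t i;

    assert(mono != NULL && stereo != NULL);
    for (i = 0; i < num_frames; i++) {
        stereo[2 * i]     = mono[i];  /* left channel */
        stereo[2 * i + 1] = mono[i];  /* right channel */
    }
}
```

The plughw entry in the listing above ("Hardware device with all software conversions") may be another route, since that layer is meant to convert formats for the hardware, but I have not verified that it accepts mono for this speaker.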

Some words are pronounced incorrectly.

The two that come to mind in my testing are
Atheism and Penis

Atheism is pronounced by Mimic

A thigh ism

Theism, Theist and Atheist are pronounced correctly, so it is a bit puzzling that Atheism is pronounced differently.

Penis is pronounced by Mimic like the words

pen is

it should be pronounced like

pee nis

I tested this with a few voices and got identical results.

CG flitevox file compatibility

Clustergen (CG) "flitevox" files are binary dumps created by cst_cg_dump_voice.
This restricts compatibility with voices trained on other systems.
Endianness is checked in cst_cg_read_header, but a mismatch just throws an error instead of byte-swapping.
There are a number of uses of e.g. sizeof(int), so different word sizes matter.
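
To illustrate what byte-swapping with fixed-width types could look like, here is a minimal sketch. It does not assume anything about the real flitevox header: BYTE_ORDER_MARK, needs_swap and read_i32 are hypothetical names, and the marker value is made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical marker value; the real flitevox header is not assumed here. */
#define BYTE_ORDER_MARK 0x01020304u

/* Reverse the byte order of a 32-bit value. */
uint32_t swap_u32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Returns 0 if the file matches host byte order, 1 if values must be
 * swapped, and -1 if the marker is not recognized at all. */
int needs_swap(uint32_t mark_from_file)
{
    if (mark_from_file == BYTE_ORDER_MARK)
        return 0;
    if (swap_u32(mark_from_file) == BYTE_ORDER_MARK)
        return 1;
    return -1;
}

/* Read a fixed-width 32-bit integer from a byte buffer, swapping if
 * requested. Using int32_t instead of int keeps the on-disk size
 * independent of the host's word size. */
int32_t read_i32(const unsigned char *buf, int swap)
{
    uint32_t v;

    memcpy(&v, buf, sizeof v);
    if (swap)
        v = swap_u32(v);
    return (int32_t)v;
}
```

The point of the design is that only the reader changes: a voice dumped on one system stays as it is, and a mismatched marker selects the swapping path instead of raising an error.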

Android

Can you provide an Android version via F-Droid? Seems there are no good FOSS TTS systems for Android.

Build fails with linker errors

/home/neikos/projects/mycroft-core/mimic/build/x86_64-linux-gnu/lib/libmimic.a(au_alsa.o): In function `audio_open_alsa':
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:92: undefined reference to `snd_pcm_hw_params_sizeof'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:92: undefined reference to `snd_pcm_hw_params_sizeof'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:95: undefined reference to `snd_pcm_open'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:99: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:104: undefined reference to `snd_pcm_hw_params_any'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:107: undefined reference to `snd_pcm_close'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:108: undefined reference to `snd_config_update_free_global'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:111: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:116: undefined reference to `snd_pcm_hw_params_set_access'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:119: undefined reference to `snd_pcm_close'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:120: undefined reference to `snd_config_update_free_global'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:122: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:144: undefined reference to `snd_pcm_close'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:145: undefined reference to `snd_config_update_free_global'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:152: undefined reference to `snd_pcm_hw_params_set_format'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:155: undefined reference to `snd_pcm_close'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:156: undefined reference to `snd_config_update_free_global'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:158: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:164: undefined reference to `snd_pcm_hw_params_set_rate'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:167: undefined reference to `snd_pcm_close'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:168: undefined reference to `snd_config_update_free_global'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:171: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:179: undefined reference to `snd_pcm_hw_params_set_channels'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:182: undefined reference to `snd_pcm_close'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:183: undefined reference to `snd_config_update_free_global'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:186: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:191: undefined reference to `snd_pcm_hw_params'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:194: undefined reference to `snd_pcm_close'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:195: undefined reference to `snd_config_update_free_global'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:197: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:202: undefined reference to `snd_pcm_state'
/home/neikos/projects/mycroft-core/mimic/build/x86_64-linux-gnu/lib/libmimic.a(au_alsa.o): In function `audio_close_alsa':
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:229: undefined reference to `snd_pcm_close'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:230: undefined reference to `snd_config_update_free_global'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:233: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/build/x86_64-linux-gnu/lib/libmimic.a(au_alsa.o): In function `audio_flush_alsa':
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:336: undefined reference to `snd_pcm_delay'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:342: undefined reference to `snd_pcm_drain'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:345: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:348: undefined reference to `snd_pcm_prepare'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:351: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/build/x86_64-linux-gnu/lib/libmimic.a(au_alsa.o): In function `audio_write_alsa':
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:299: undefined reference to `snd_pcm_writei'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:304: undefined reference to `snd_pcm_wait'
/home/neikos/projects/mycroft-core/mimic/build/x86_64-linux-gnu/lib/libmimic.a(au_alsa.o): In function `recover_from_error':
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:244: undefined reference to `snd_pcm_prepare'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:250: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:256: undefined reference to `snd_pcm_resume'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:258: undefined reference to `snd_pcm_wait'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:262: undefined reference to `snd_pcm_prepare'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:268: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:277: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/build/x86_64-linux-gnu/lib/libmimic.a(au_alsa.o): In function `audio_drain_alsa':
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:359: undefined reference to `snd_pcm_drop'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:362: undefined reference to `snd_strerror'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:365: undefined reference to `snd_pcm_prepare'
/home/neikos/projects/mycroft-core/mimic/src/audio/au_alsa.c:368: undefined reference to `snd_strerror'
clang-3.8: error: linker command failed with exit code 1 (use -v to see invocation)
Makefile:70: recipe for target '../bin/find_sts' failed
make[1]: *** [../bin/find_sts] Error 1
config/common_make_rules:133: recipe for target '/home/neikos/projects/mycroft-core/mimic/build/x86_64-linux-gnu/obj//.make_build_dirs' failed
make: *** [/home/neikos/projects/mycroft-core/mimic/build/x86_64-linux-gnu/obj//.make_build_dirs] Error 2

Kick Off a Release

Current version works pretty damn good, and it seems stable right now. If we kick off a stable release, we can use that version for Mycroft.

Optimize voice loading

To reduce startup time the voice loading should be optimized.

Possibilities

  • Reduce number of calls to cst_fread()
  • Increase read buffer using setvbuf()
  • Restructure flitevox-file to allow for more efficient reading
  • inline commonly used functions (?)
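
The first two possibilities could look roughly like the sketch below. It uses plain stdio rather than cst_fread(), and open_buffered and read_ints are hypothetical helper names; whether a larger buffer actually helps would need to be measured.

```c
#include <assert.h>
#include <stdio.h>

/* Open a file for binary reading with a larger, fully buffered stdio
 * buffer. Passing NULL lets stdio allocate the buffer itself. */
FILE *open_buffered(const char *path, size_t buf_size)
{
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return NULL;
    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, NULL, _IOFBF, buf_size) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}

/* Read `count` ints with a single fread call instead of one call per
 * value. Returns the number of ints actually read. */
size_t read_ints(FILE *fp, int *out, size_t count)
{
    assert(fp != NULL && out != NULL);
    return fread(out, sizeof *out, count, fp);
}
```

A call such as open_buffered(path, 1 << 16) would request a 64 KiB buffer; the size is only an example value.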

Cleanup README

I started doing some work on cleaning up the README, but we really need an installation and usage blurb in there that isn't overwhelming.

Cannot quit when in loop mode

When starting mimic with -l for looping, Ctrl-C causes the "Shutdown requested!" message to be printed endlessly, and mimic never quits.
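
A common pattern for this kind of problem is to have the signal handler only set a flag and let the loop check it on every pass. The sketch below shows that pattern in isolation; it is not Mimic's code, and install_stop_handler, stop_was_requested and loop_until_stopped are hypothetical names.

```c
#include <assert.h>
#include <signal.h>

/* Set from the signal handler. volatile sig_atomic_t is the type the C
 * standard allows a handler to write safely. */
static volatile sig_atomic_t stop_requested = 0;

static void on_sigint(int signum)
{
    (void)signum;
    stop_requested = 1;
}

/* Install the Ctrl-C handler and clear the flag. Returns 0 on success. */
int install_stop_handler(void)
{
    stop_requested = 0;
    return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}

int stop_was_requested(void)
{
    return stop_requested != 0;
}

/* Loop skeleton: runs until Ctrl-C was seen or the limit is reached.
 * Returns the number of completed iterations. */
int loop_until_stopped(int max_iterations)
{
    int done = 0;

    while (!stop_was_requested() && done < max_iterations) {
        /* One synthesis/playback pass would go here. */
        done++;
    }
    return done;
}
```

The important part is that the loop condition itself looks at the flag, so a shutdown request ends the loop instead of only being reported on each pass.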
