This project forked from hunspell/hunspell

Home Page: http://hunspell.github.io/

License: GNU Lesser General Public License v3.0

About Hunspell

NOTICE: Version 2 is in the works. To contribute, see the version 2 specification and the folder src/hunspell2.

Hunspell is a spell checker and morphological analyzer library and program designed for languages with rich morphology and complex word compounding or character encoding. It offers four interfaces: an Ispell-like terminal interface using the Curses library, an Ispell pipe interface, a C++ class, and C functions.

Hunspell's code base comes from the OpenOffice.org MySpell (http://lingucomponent.openoffice.org/MySpell-3.zip). See the README.MYSPELL, AUTHORS.MYSPELL and license.myspell files. Hunspell is designed to eventually replace MySpell in OpenOffice.org.

Main features of Hunspell spell checker and morphological analyzer:

  • Unicode support (affix rules work only with the first 65535 Unicode characters)
  • Morphological analysis (in custom item and arrangement style) and stemming
  • Max. 65535 affix classes and twofold affix stripping (for agglutinative languages, like Azeri, Basque, Estonian, Finnish, Hungarian, Turkish, etc.)
  • Support complex compoundings (for example, Hungarian and German)
  • Support language specific features (for example, special casing of Azeri and Turkish dotted i, or German sharp s)
  • Handle conditional affixes, circumfixes, fogemorphemes, forbidden words, pseudoroots and homonyms.
  • Free software. Versions 1.x are licenced under LGPL, GPL, MPL tri-license. Version 2 is licenced only under GNU LGPL.
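The "twofold affix stripping" item in the list above can be pictured with a toy sketch. This is not Hunspell's implementation: real affix rules carry flags, strip/add strings and conditions, all of which are ignored here. The function name accepts and the plain suffix list are assumptions made up for this illustration. It only shows the idea that a word is accepted if removing at most two suffixes leaves a known stem.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns true if `s` ends with `suffix`.
static bool ends_with(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Toy illustration only (hypothetical helper, not part of Hunspell):
// accept `word` if it is a known stem, or if removing at most `depth`
// suffixes from its end leaves a known stem.
bool accepts(const std::string& word,
             const std::set<std::string>& stems,
             const std::vector<std::string>& suffixes,
             int depth = 2) {
    if (stems.count(word) > 0) return true;
    if (depth == 0) return false;
    for (const std::string& suf : suffixes) {
        // Skip empty suffixes and never strip the whole word.
        if (suf.empty() || word.size() <= suf.size()) continue;
        if (!ends_with(word, suf)) continue;
        const std::string shorter = word.substr(0, word.size() - suf.size());
        if (accepts(shorter, stems, suffixes, depth - 1)) return true;
    }
    return false;
}
```

With the stem "help" and the suffixes "less", "ness" and "es", this sketch accepts "helplessness" (two suffixes removed) but rejects "helplessnesses", which would need a third stripping step.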

Compiling on Unix/Linux and others

autoreconf -vfi
./configure
make
make install    # prefix with sudo if necessary
ldconfig        # not needed on Windows; on Linux, sudo may be needed

For dictionary development, use the --with-warnings option of configure.

For interactive user interface of Hunspell executable, use the --with-ui option.

The developer packages you need to compile Hunspell's interface:

autoconf automake autopoint libtool g++

Optional developer packages:

  • ncurses (needed for --with-ui), e.g. libncursesw5 for UTF-8
  • readline (for fancy input line editing, configure parameter: --with-readline)
  • locale and gettext (but you can also use the --with-included-gettext configure parameter)

Compiling on Windows

  1. Compiling with Mingw64 and MSYS2

Download Msys2, update everything and install the following packages:

pacman -S base-devel mingw-w64-x86_64-toolchain mingw-w64-x86_64-libtool

Open the Mingw-w64 Win64 prompt and compile the same way as on Linux (see above).

  2. Compiling in Cygwin environment

Download and install the Cygwin environment for Windows with the following extra packages:

  • make
  • automake
  • autoconf
  • gcc-g++ development package
  • ncurses, readline (for user interface)
  • iconv (character conversion)

Cygwin1.dll dependent compiling

Same as on Linux.

Testing

Testing Hunspell (see tests in tests/ subdirectory):

make check

or with Valgrind debugger:

make check
VALGRIND=[Valgrind_tool] make check

For example:

make check
VALGRIND=memcheck make check

Documentation

Features and dictionary format:

man 5 hunspell
man hunspell
hunspell -h

http://hunspell.github.io/

Usage

The src/tools directory contains eleven executables after compiling:

  • affixcompress: dictionary generation from large (millions of words) vocabularies
  • analyze: example of spell checking, stemming and morphological analysis
  • chmorph: example of automatic morphological generation and conversion
  • example: example of spell checking and suggestion
  • hunspell: main program for spell checking and others (see manual)
  • hunzip: decompressor of hzip format
  • hzip: compressor of hzip format
  • makealias: alias compression (Hunspell only, not back compatible with MySpell)
  • munch: dictionary generation from vocabularies (it needs an affix file, too).
  • unmunch: list all recognized words of a MySpell dictionary
  • wordforms: word generation (Hunspell version of unmunch)

After compiling and installing (see INSTALL), you can run the Hunspell spell checker (compiled with the user interface) with a Hunspell or MySpell dictionary:

hunspell -d en_US text.txt

or without interface:

hunspell
hunspell -d en_GB -l <text.txt

Each dictionary consists of an affix file (.aff) and a dictionary file (.dic); see tests/ or http://wiki.services.openoffice.org/wiki/Dictionaries.
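As a rough illustration of the dictionary file side of that pair: after a first line holding the approximate word count, each line of a .dic file holds a word, optionally followed by a slash and the affix flags that apply to it (for example work/AB). The sketch below splits one such line. The names DicEntry and parse_dic_line are invented for this example, and it assumes the simplest case only: no escaped slashes and no morphological fields after the word. See man 5 hunspell for the full format.

```cpp
#include <string>

// Hypothetical helper type for this sketch: one line of a .dic file.
struct DicEntry {
    std::string word;   // the dictionary word, e.g. "work"
    std::string flags;  // raw affix flags after the slash, e.g. "AB" (may be empty)
};

// Splits a simple "word/FLAGS" line at the first slash.
// Assumption: no escaped slashes and no trailing morphological fields.
DicEntry parse_dic_line(const std::string& line) {
    DicEntry entry;
    const std::string::size_type slash = line.find('/');
    if (slash == std::string::npos) {
        entry.word = line;  // a word without flags
    } else {
        entry.word = line.substr(0, slash);
        entry.flags = line.substr(slash + 1);
    }
    return entry;
}
```

For instance, parse_dic_line("work/AB") yields the word "work" with flags "AB", while parse_dic_line("hello") yields "hello" with empty flags.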

Using Hunspell library with GCC

Including in your program:

#include <hunspell.hxx>

Linking with Hunspell static library:

g++ example.cxx -lhunspell
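A minimal sketch of how a program might use the library, loosely mirroring what hunspell -l does (listing misspelled words). The helper misspelled_words is invented for this example and is written as a template, so it compiles without the library; the assumption is that the installed Hunspell version provides a C++ class whose spell() member accepts a std::string, which may not hold for older releases. The dictionary paths in the comment are placeholders.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collects the whitespace-separated words of `text` that `checker` rejects.
// `Checker` can be any type with a member callable as spell(std::string)
// returning something convertible to bool; the Hunspell class is assumed
// to fit this shape.
template <typename Checker>
std::vector<std::string> misspelled_words(Checker& checker,
                                          const std::string& text) {
    std::vector<std::string> bad;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        if (!checker.spell(word)) bad.push_back(word);
    }
    return bad;
}

// Hypothetical usage with the real library (file names are placeholders):
//
//   #include <hunspell.hxx>
//   Hunspell dict("en_US.aff", "en_US.dic");
//   std::vector<std::string> bad = misspelled_words(dict, "some text here");
```

Keeping the checker as a template parameter is a design choice for this sketch only: it lets the function be exercised with a stand-in checker in tests, without dictionary files on disk. Note that it splits on whitespace only, so punctuation stays attached to words.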

Dictionaries

Myspell & Hunspell dictionaries:

Aspell dictionaries (need some conversion):

  • ftp://ftp.gnu.org/gnu/aspell/dict

Conversion steps: see the relevant feature request at http://hunspell.github.io/.

László Németh nemeth at numbertext org
