Detect the language of text.
What’s so cool about franc?
- franc can support more languages(†) than any other library
- franc is packaged with support for 75, 175, or 375 languages
- franc has a CLI
† - Based on the UDHR, the most translated document in the world.
What’s not so cool about franc?
Because franc supports so many languages, short input is easily mistaken for a closely related language. Pass it longer documents to get reliable results.
Installation
npm:
npm install franc
This installs the franc package, with support for 175 languages (languages which have 1 million or more speakers). franc-min (75 languages, 8 million or more speakers) and franc-all (all 375 possible languages) are also available. Finally, use franc-cli to install the CLI.
Browser builds for franc-min, franc, and franc-all are available on GitHub Releases.
Usage
var franc = require('franc');
franc('Alle menslike wesens word vry'); //=> 'afr'
franc('এটি একটি ভাষা একক IBM স্ক্রিপ্ট'); //=> 'ben'
franc('Alle mennesker er født frie og'); //=> 'nno'
franc(''); //=> 'und'
franc('the'); //=> 'und'
/* You can change what’s too short (default: 10): */
franc('the', {minLength: 3}); //=> 'sco'
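Since franc returns 'und' for empty or too-short input, callers often want a fallback. The following is a minimal sketch, not part of franc: `detectOrDefault` and `stubDetect` are hypothetical names, and the stub only imitates the "too short yields 'und'" behaviour so the snippet runs without franc installed.

```javascript
// Hypothetical helper, not part of franc: returns `fallback` when the
// detector reports 'und' (undetermined), otherwise the detected code.
function detectOrDefault(detect, text, fallback) {
  var code = detect(text);
  return code === 'und' ? fallback : code;
}

// Stand-in detector (an assumption for illustration only). It mimics the
// default minimum length of 10 and always answers 'afr' for longer input.
function stubDetect(text) {
  return text.length < 10 ? 'und' : 'afr';
}

console.log(detectOrDefault(stubDetect, 'the', 'eng')); // 'eng'
console.log(detectOrDefault(stubDetect, 'Alle menslike wesens word vry', 'eng')); // 'afr'
```

In real use, the function returned by `require('franc')` would presumably be passed in place of `stubDetect`.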
.all
franc.all('O Brasil caiu 26 posições');
Yields:
[ [ 'por', 1 ],
[ 'src', 0.8797557538750587 ],
[ 'glg', 0.8708313762329732 ],
[ 'snn', 0.8633161108501644 ],
[ 'bos', 0.8172851103804604 ],
... 116 more items ]
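The list of `[code, score]` pairs can be post-processed with ordinary array methods. The sketch below keeps only guesses at or above a chosen score; `atLeast` and `codes` are hypothetical helpers, the cutoff of 0.87 is arbitrary, and the sample data is the first five entries of the output above rather than a live call.

```javascript
// First five entries copied from the franc.all output shown above.
var guesses = [
  ['por', 1],
  ['src', 0.8797557538750587],
  ['glg', 0.8708313762329732],
  ['snn', 0.8633161108501644],
  ['bos', 0.8172851103804604]
];

// Hypothetical helper: keep the guesses scoring at least `threshold`.
function atLeast(results, threshold) {
  return results.filter(function (pair) {
    return pair[1] >= threshold;
  });
}

// Hypothetical helper: reduce [code, score] pairs to their codes.
function codes(results) {
  return results.map(function (pair) {
    return pair[0];
  });
}

console.log(codes(atLeast(guesses, 0.87))); // [ 'por', 'src', 'glg' ]
```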
whitelist
franc.all('O Brasil caiu 26 posições', {whitelist: ['por', 'spa']});
Yields:
[ [ 'por', 1 ], [ 'spa', 0.799906059182715 ] ]
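With a short candidate list, the gap between the two best scores gives a rough sense of how clear-cut the guess is. This is only a sketch: `isAmbiguous` is a hypothetical helper, the margins are arbitrary, and it assumes the list is sorted best-first, as in the examples here.

```javascript
// Hypothetical helper: true when the best guess leads the runner-up by
// less than `margin`. Assumes `results` is sorted by descending score.
// A list with fewer than two entries is never considered ambiguous.
function isAmbiguous(results, margin) {
  if (results.length < 2) {
    return false;
  }
  return results[0][1] - results[1][1] < margin;
}

// Sample data copied from the whitelist output shown above.
var restricted = [['por', 1], ['spa', 0.799906059182715]];

console.log(isAmbiguous(restricted, 0.1)); // false
console.log(isAmbiguous(restricted, 0.25)); // true
```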
blacklist
franc.all('O Brasil caiu 26 posições', {blacklist: ['src', 'glg']});
Yields:
[ [ 'por', 1 ],
[ 'snn', 0.8633161108501644 ],
[ 'bos', 0.8172851103804604 ],
[ 'hrv', 0.8107092531705026 ],
[ 'lav', 0.810239549084077 ],
... 114 more items ]
CLI
Install:
npm install franc-cli --global
Use:
CLI to detect the language of text
Usage: franc [options] <string>
Options:
-h, --help output usage information
-v, --version output version number
-m, --min-length <number> minimum length to accept
-w, --whitelist <string> allow languages
-b, --blacklist <string> disallow languages
-a, --all display all guesses
Usage:
# output language
$ franc "Alle menslike wesens word vry"
# afr
# output language from stdin (expects utf8)
$ echo "এটি একটি ভাষা একক IBM স্ক্রিপ্ট" | franc
# ben
# blacklist certain languages
$ franc --blacklist por,glg "O Brasil caiu 26 posições"
# src
# output language from stdin with whitelist
$ echo "Alle mennesker er født frie og" | franc --whitelist nob,dan
# nob
Supported languages
Package | Languages | Speakers |
---|---|---|
franc-min | 75 | 8M or more |
franc | 175 | 1M or more |
franc-all | 375 | - |
Derivation
Franc is a derivative work of guess-language (Python, LGPL), guesslanguage (C++, LGPL), and Language::Guess (Perl, GPL). Their creators, respectively Kent S. Johnson, Jacob R. Rideout, and Maciej Ceglowski, granted me the rights to distribute franc under the MIT license.