unicode-bidirectional

A Javascript implementation of the Unicode 9.0.0 Bidirectional Algorithm

This is an implementation of the Unicode Bidirectional Algorithm (UAX #9) that works in both Browser and Node.js environments. The implementation is conformant as per definition UAX#9-C1.

Installation

npm install unicode-bidirectional --save

Usage

unicode-bidirectional is declared as a Universal Module (UMD), meaning it can be used with all conventional Javascript module systems:

1. ES6

import { resolve, reorder } from 'unicode-bidirectional';

const codepoints = [0x28, 0x29, 0x2A, 0x05D0, 0x05D1, 0x05D2]
const levels = resolve(codepoints, 0);  // [0, 0, 0, 1, 1, 1]
const reordering = reorder(codepoints, levels); // [0x28, 0x29, 0x2A, 0x05D2, 0x05D1, 0x05D0]

2. CommonJS

var UnicodeBidirectional = require('unicode-bidirectional/dist/unicode.bidirectional');
var resolve = UnicodeBidirectional.resolve;
var reorder = UnicodeBidirectional.reorder;

var codepoints = [0x28, 0x29, 0x2A, 0x05D0, 0x05D1, 0x05D2]
var levels = resolve(codepoints, 0);  // [0, 0, 0, 1, 1, 1]
var reordering = reorder(codepoints, levels); // [0x28, 0x29, 0x2A, 0x05D2, 0x05D1, 0x05D0]

3. RequireJS

require(['UnicodeBidirectional'], function (UnicodeBidirectional) {
  var resolve = UnicodeBidirectional.resolve;
  var reorder = UnicodeBidirectional.reorder;

  var codepoints = [0x28, 0x29, 0x2A, 0x05D0, 0x05D1, 0x05D2]
  var levels = resolve(codepoints, 0);  // [0, 0, 0, 1, 1, 1]
  var reordering = reorder(codepoints, levels); // [0x28, 0x29, 0x2A, 0x05D2, 0x05D1, 0x05D0]
});

4. HTML5 <script> tag

<script src="unicode.bidirectional.js"></script> <!-- exposes window.UnicodeBidirectional -->
<script>
  var resolve = UnicodeBidirectional.resolve;
  var reorder = UnicodeBidirectional.reorder;

  var codepoints = [0x28, 0x29, 0x2A, 0x05D0, 0x05D1, 0x05D2];
  var levels = resolve(codepoints, 0);  // [0, 0, 0, 1, 1, 1]
  var reordering = reorder(codepoints, levels); // [0x28, 0x29, 0x2A, 0x05D2, 0x05D1, 0x05D0]
</script>

You can download unicode.bidirectional.js from Releases. Using this file with a <script> tag will expose UnicodeBidirectional as a global variable on the window object.

API

resolve(codepoints, paragraphLevel[, automaticLevel = false])

Returns the resolved levels associated with each codepoint in codepoints[1]. This levels array determines: (i) the relative nesting of LTR and RTL characters, and hence (ii) how characters should be reversed when displayed on the screen.

The input codepoints are assumed to all be in one paragraph that has a base direction of paragraphLevel. This is a Number that is either 0 or 1 and represents whether the paragraph is left-to-right (0) or right-to-left (1). automaticLevel is an optional Boolean flag that, when present and set to true, causes this function to ignore the paragraphLevel argument and instead attempt to deduce the paragraph level from the codepoints. [2]
The input array is not mutated.
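
To give a feel for what deducing the paragraph level involves, here is a rough, self-contained sketch in the spirit of rules P2 and P3: scan for the first strong character and pick the level from its direction. This is not the library's code. The function name guessParagraphLevel is hypothetical, and the sketch makes two loud simplifications: it treats only Hebrew and Arabic script letters as right-to-left (the real rules use the Bidi_Class property), and it does not skip isolate sequences.

```javascript
// ASSUMPTION: a letter in the Hebrew or Arabic script stands in for the
// strong right-to-left classes (R and AL); any other letter stands in for L.
const ANY_LETTER = /^\p{L}$/u;
const RTL_SCRIPT = /^[\p{Script=Hebrew}\p{Script=Arabic}]$/u;

// Hypothetical helper: returns 1 if the first letter found is right-to-left,
// otherwise 0. Non-letters (digits, punctuation, marks) are skipped.
function guessParagraphLevel(codepoints) {
  for (const cp of codepoints) {
    const ch = String.fromCodePoint(cp);
    if (!ANY_LETTER.test(ch)) continue; // not treated as strong in this sketch
    return RTL_SCRIPT.test(ch) ? 1 : 0;
  }
  return 0; // no strong character found: fall back to left-to-right
}
```

With this approximation, a paragraph that starts with punctuation followed by Hebrew letters is guessed to be right-to-left, while one that starts with a Latin letter is guessed to be left-to-right.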

reorder(codepoints, levels)

Returns the codepoints in codepoints reordered (i.e. permuted) according to the levels array. [3]
Neither of the two input arrays is mutated.
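
The reversal that rule L2 describes can be sketched in a few lines: working from the highest level down to the lowest odd level, reverse every contiguous run of characters at that level or higher. The function below is an illustrative sketch, not the library's implementation. The name reorderSketch is hypothetical, and it assumes every entry of levels is a number (it does not handle the 'x' entries described in the notes).

```javascript
// Hypothetical sketch of rule L2. ASSUMPTION: all levels are numbers.
function reorderSketch(codepoints, levels) {
  const result = codepoints.slice(); // copy so the input is not mutated
  const oddLevels = levels.filter((level) => level % 2 === 1);
  if (oddLevels.length === 0) return result; // nothing is right-to-left

  const highest = Math.max(...levels);
  const lowestOdd = Math.min(...oddLevels);

  for (let level = highest; level >= lowestOdd; level--) {
    let start = 0;
    while (start < levels.length) {
      if (levels[start] < level) {
        start++;
        continue;
      }
      // Find the end of the run whose levels are all >= the current level.
      let end = start;
      while (end < levels.length && levels[end] >= level) end++;
      // Reverse that run in place within the working copy.
      const reversed = result.slice(start, end).reverse();
      result.splice(start, end - start, ...reversed);
      start = end;
    }
  }
  return result;
}
```

Applied to the codepoints and levels from the Usage section, the sketch reverses only the trailing run of level-1 characters, which matches the reordering shown there.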

reorderPermutation(levels[, IGNORE_INVISIBLE = false])

Returns the reordering that levels represents as a permutation array. When this array has an element at index i with value j, it denotes that the codepoint previously positioned at index i is now positioned at index j. [4]
The input array is not mutated. The IGNORE_INVISIBLE parameter controls whether or not invisible characters (characters with a level of 'x' [5]) are to be included in the permutation array. By default, they are included in the permutation (they are not ignored, hence IGNORE_INVISIBLE is false).
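
To make the "index i moves to index j" convention concrete, the sketch below applies such a permutation array to a list of items. The helper name applyPermutation is hypothetical and is not part of the library, and the permutation in the usage line is written by hand for illustration rather than captured from reorderPermutation.

```javascript
// Hypothetical helper: permutation[i] === j means the item that was at
// index i ends up at index j in the result.
function applyPermutation(items, permutation) {
  const result = new Array(items.length);
  permutation.forEach((target, source) => {
    result[target] = items[source];
  });
  return result;
}

// Hand-written permutation that reverses the last three positions.
const codepoints = [0x28, 0x29, 0x2A, 0x05D0, 0x05D1, 0x05D2];
const moved = applyPermutation(codepoints, [0, 1, 2, 5, 4, 3]);
```

Here the codepoint at index 3 moves to index 5 and the one at index 5 moves to index 3, so moved holds the same reordering as the Usage example.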

mirror(codepoints, levels)

Replaces each codepoint in codepoints with its mirrored glyph according to rule L4 and the levels array.
Neither of the two input arrays is mutated.
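
Rule L4 says a character is shown with its mirrored glyph only when its resolved direction is right-to-left (an odd level) and it has a mirrored counterpart. The sketch below illustrates that idea under stated assumptions: mirrorSketch and SAMPLE_MIRROR_MAP are hypothetical names, and the map is a tiny hand-picked stand-in for the full mirroring data, not the library's mirrorMap.

```javascript
// ASSUMPTION: a tiny illustrative table; the real data has many more pairs.
const SAMPLE_MIRROR_MAP = new Map([
  [0x28, 0x29], [0x29, 0x28], // ( and )
  [0x3C, 0x3E], [0x3E, 0x3C], // < and >
  [0x5B, 0x5D], [0x5D, 0x5B], // [ and ]
]);

// Hypothetical sketch: swap a codepoint for its mirrored counterpart only
// when its level is an odd number. Entries such as 'x' are left untouched.
function mirrorSketch(codepoints, levels, mirrorMap = SAMPLE_MIRROR_MAP) {
  return codepoints.map((cp, i) => {
    const isRightToLeft = typeof levels[i] === 'number' && levels[i] % 2 === 1;
    return isRightToLeft && mirrorMap.has(cp) ? mirrorMap.get(cp) : cp;
  });
}
```

Because map returns a new array, neither input is modified, which mirrors the non-mutating behaviour described above.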

constants

An object containing metadata used by the bidirectional algorithm. This object includes the following keys:

  • mirrorMap: a map mapping a codepoint to its mirrored counterpart, e.g. looking up "<" gives ">". If a codepoint does not have a mirrored counterpart, then there is no key-value pair in the map and so a lookup will give undefined. [6]
  • oppositeBracket: a map mapping a codepoint to its bracket pair counterpart, e.g. looking up "(" gives ")". If a codepoint does not have a bracket pair counterpart, then there is no key-value pair in the map and so a lookup will give undefined. [7]
  • openingBrackets: a set containing all brackets that are opening brackets. [7]
  • closingBrackets: a set containing all brackets that are closing brackets. [7]

Additional Notes:

For all the above functions, codepoints are represented by an Array of Numbers where each Number denotes the Unicode codepoint of the character, that is, an integer between 0x0 and 0x10FFFF inclusive. levels are represented by an Array of Numbers where each Number is an integer between 0 and 127 inclusive. One or more entries of levels may be the string 'x'. This denotes a character that does not have a level [5].
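
Since the functions work on arrays of codepoints rather than strings, a pair of conversion helpers is often handy. The sketch below uses only standard JavaScript; the names toCodepoints and fromCodepoints are hypothetical and are not exported by the library.

```javascript
// Hypothetical helper: split a string into an array of Unicode codepoints.
// Array.from iterates by codepoint, so characters outside the Basic
// Multilingual Plane come out as one number instead of two surrogates.
function toCodepoints(text) {
  return Array.from(text, (ch) => ch.codePointAt(0));
}

// Hypothetical helper: join an array of codepoints back into a string.
function fromCodepoints(codepoints) {
  return String.fromCodePoint(...codepoints);
}

// The sample input from the Usage section, written as a string.
const sample = toCodepoints('()*\u05D0\u05D1\u05D2');
```

The resulting array can be passed wherever the API expects codepoints, and fromCodepoints turns a reordered array back into a displayable string.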

[1]: Codepoints are automatically converted to NFC normal form if they are not already in that form.
[2]: This function deduces the paragraph level according to UAX#9-P1, UAX#9-P2 and UAX#9-P3.
[3]: This is an implementation of UAX#9-L2.
[4]: More formally known as the one-line notation for permutations. See Wikipedia.
[5]: Some characters have a level of x, meaning the levels array holds the string 'x' instead of a number. This is expected behaviour: by rule X9, the Unicode Bidirectional Algorithm does not assign a level to certain invisible characters and control characters. They are ignored by the algorithm, and because they are invisible they have no impact on the visual RTL/LTR ordering of characters. Most of the invisible characters that fall into this category are in this list.
[6]: This is taken from BidiMirroring.txt.
[7]: This is taken from BidiBrackets.txt.

Polyfills

unicode-bidirectional uses ECMAScript 2015 (ES6) features that are not fully supported by Internet Explorer and older versions of other browsers.

If you are targeting these browsers, you'll need to add one or more Polyfill libraries to fill in these features (for example, es6-shim and unorm).

More Info

For other Javascript Unicode Implementations see:

License

MIT.
Copyright (c) 2017 British Broadcasting Corporation
