eloraiby / arabtype

A small and simple implementation that transforms isolated Arabic UTF-8 character strings into contextual forms.

C 71.30% C++ 25.41% QMake 3.29%

Contributors

banx, eloraiby

arabtype's Issues

Farsi Support & Presentation Forms A?

I am successfully using your library. It works great - Thanks :)

I wonder, would it be possible to add support for Farsi? I tried to render Farsi text, but the glyphs don't appear correctly connected. I believe some of the Presentation Forms-A characters would need to be used. How difficult would that be to add? Is it possible?
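
As a rough illustration of what such an extension could involve: the Farsi-specific letters have their contextual forms in the Arabic Presentation Forms-A block, in the same isolated/final/initial/medial order as the Forms-B letters. The sketch below is not part of arabtype; the class `FarsiForms` and its methods are hypothetical names, and the index convention (bit 1 = joins to next, bit 0 = joined from previous) is assumed from the Java adaptation quoted in a later issue on this page.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical supplementary lookup for Farsi letters; not part of arabtype. */
final class FarsiForms {
    // Each row: {isolated, final, initial, medial}; 0 means the form does not exist.
    private static final Map<Character, int[]> FORMS_A = new HashMap<>();
    static {
        FORMS_A.put('\u067E', new int[] {0xFB56, 0xFB57, 0xFB58, 0xFB59}); // peh
        FORMS_A.put('\u0686', new int[] {0xFB7A, 0xFB7B, 0xFB7C, 0xFB7D}); // tcheh
        FORMS_A.put('\u0698', new int[] {0xFB8A, 0xFB8B,      0,      0}); // jeh
        FORMS_A.put('\u06A9', new int[] {0xFB8E, 0xFB8F, 0xFB90, 0xFB91}); // keheh
        FORMS_A.put('\u06AF', new int[] {0xFB92, 0xFB93, 0xFB94, 0xFB95}); // gaf
        FORMS_A.put('\u06CC', new int[] {0xFBFC, 0xFBFD, 0xFBFE, 0xFBFF}); // farsi yeh
    }

    static boolean isFarsiLetter(char cp) {
        return FORMS_A.containsKey(cp);
    }

    /**
     * joinedFromPrev: the previous letter links forward to this one.
     * joinsToNext: a joinable letter follows this one.
     */
    static char form(char cp, boolean joinedFromPrev, boolean joinsToNext) {
        int[] row = FORMS_A.get(cp);
        if (row == null) {
            return cp; // not a letter covered by this table
        }
        int index = ((joinsToNext ? 1 : 0) << 1) | (joinedFromPrev ? 1 : 0);
        int shaped = row[index];
        if (shaped == 0) {
            // letter never links forward (e.g. jeh): fall back to isolated/final
            shaped = row[index & 1];
        }
        return (char) shaped;
    }
}
```

A shaper would also have to widen its "is this an Arabic letter" range check, since these code points lie outside U+0621..U+064A.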

Wrong Output for بذيء

The following text بذيء appears to produce the wrong output by connecting the characters. Is there any way to fix it?
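
One possible cause, assuming the shaping rule matches the Java adaptation quoted in a later issue on this page (a letter takes a forward-joining form whenever the next character is any Arabic letter): hamza (U+0621) is non-joining in Unicode, so the ي before it should stay isolated rather than take its initial form. A minimal sketch of a stricter check follows; `JoinCheck`, `joinsToPrevious`, `linksForward`, and `formIndex` are hypothetical names, not arabtype functions.

```java
/** Hypothetical joining check that treats hamza as non-joining; not part of arabtype. */
final class JoinCheck {
    private static final char HAMZA = '\u0621';

    static boolean isArabicLetter(char cp) {
        return cp >= 0x0621 && cp <= 0x064A;
    }

    /** True if cp can connect to the letter before it; hamza never does. */
    static boolean joinsToPrevious(char cp) {
        return isArabicLetter(cp) && cp != HAMZA;
    }

    /** True if cp can connect to the letter after it (it has initial/medial forms). */
    static boolean linksForward(char cp) {
        if (!isArabicLetter(cp)) {
            return false;
        }
        switch (cp) {
            // hamza, the alef family, ta marbouta, dal, dhal, ra, zayn, waw, alef maksoura
            case 0x0621: case 0x0622: case 0x0623: case 0x0624: case 0x0625:
            case 0x0627: case 0x0629: case 0x062F: case 0x0630: case 0x0631:
            case 0x0632: case 0x0648: case 0x0649:
                return false;
            default:
                return true;
        }
    }

    /** 0 = isolated, 1 = final, 2 = initial, 3 = medial. */
    static int formIndex(char prev, char cp, char next) {
        int joinsNext = (linksForward(cp) && joinsToPrevious(next)) ? 1 : 0;
        int joinsPrev = (linksForward(prev) && joinsToPrevious(cp)) ? 1 : 0;
        return (joinsNext << 1) | joinsPrev;
    }

    /** Form index for every character of text, using '\0' beyond either end. */
    static int[] formIndices(String text) {
        int[] out = new int[text.length()];
        for (int i = 0; i < text.length(); i++) {
            char prev = i > 0 ? text.charAt(i - 1) : '\0';
            char next = i + 1 < text.length() ? text.charAt(i + 1) : '\0';
            out[i] = formIndex(prev, text.charAt(i), next);
        }
        return out;
    }
}
```

For بذيء this yields initial ب, final ذ, isolated ي, and isolated ء, whereas a rule that only tests whether the next character is an Arabic letter would select the initial form of ي.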

Correct Rendering of all kinds of text?

I adapted your algorithm to Java. It works great, but some letters that don't need to be transformed are not rendered:

This is the text:
مستوى صعوبة الحاسوب

This is how it is rendered:
مستو صعوبة الحاسوب

Basically, the 'ى' is missing!

Is that a bug in my code?

public static char correct(char prev, char next, char ch) {
  if ((ch >= ARABIC_LETTER_START) && (ch <= ARABIC_LETTER_FINAL)) {
    // convert Arabic letter - https://github.com/eloraiby/arabtype/blob/master/arabtype.c
    boolean isLa  = isLamAlef(ch, next);
    boolean isApl = isAlefPrevLam(prev, ch);

    boolean isLapl = isLa | isApl;

    // determine char to return
    if (isLa) {
      // lam followed by alef: pick the lam-alef ligature from the alef's row (next)
      int index = ((isLinkingType(ch) ? 1 : 0) << 1) | (isLinkingType(prev) ? 1 : 0);
      return (char)ARABIC_FORMS_B[next - ARABIC_LETTER_START][1][index];
    }
    else if (isApl) {
      return ch;  // skip previously processed lam alef
    }
    else {
      // ordinary letter: bit 1 = joins to next, bit 0 = joined from prev
      int index   = (((isArabicLetter(next) ? 1 : 0) & (isLinkingType(ch) ? 1 : 0)) << 1) | (isLinkingType(prev) ? 1 : 0);
      return (char)ARABIC_FORMS_B[ch - ARABIC_LETTER_START][0][index];
    }

    // NOTE: compact form of the above...
    // int index = ((((isLapl | isArabicLetter(next)) & isLinkingType(ch)) ? 1 : 0) << 1) | (isLinkingType(prev) ? 1 : 0);
    // int ref = (next * (isLa ? 1 : 0)) + (ch * (isLa ? 0 : 1)) - ARABIC_LETTER_START;
    // return (char)ARABIC_FORMS_B[ref][isLapl ? 1 : 0][index];
  }
  else {
    // not an Arabic letter to be converted!
    return ch;
  }
}

private static final char ARABIC_LETTER_START = 0x0621;
private static final char ARABIC_LETTER_FINAL = 0x064A;

// private static final int ENDING = 1;
private static final int INITIAL = 2;
private static final int MEDIAL = 3;

private static final int UNICODE_LAM = 0x644;

private static boolean isArabicLetter(char cp)    {
  return ( cp >= ARABIC_LETTER_START && cp <=  ARABIC_LETTER_FINAL );
}

private static boolean isLamAlef(char cp, char next)  {
  return cp == UNICODE_LAM && isArabicLetter(next) && ARABIC_FORMS_B[next - ARABIC_LETTER_START][1][INITIAL] != 0;
}

private static boolean isAlefPrevLam(char prev, char cp)  {
  return prev == UNICODE_LAM && isArabicLetter(cp) && ARABIC_FORMS_B[cp - ARABIC_LETTER_START][1][INITIAL] != 0;
}

private static boolean isLinkingType(char cp) {
  return isArabicLetter(cp) && ARABIC_FORMS_B[cp - ARABIC_LETTER_START][0][MEDIAL] != 0;
}

/** Table to convert to Arabic presentation form B. */
private static final int[][][] ARABIC_FORMS_B = {
    { {0xFE80, 0xFE80,      0,      0}, {-1, -1, 0, 0} },    // hamza  (0)
    { {0xFE81, 0xFE82,      0,      0}, {-1, -1, 0xFEF5, 0xFEF6} },  // 2alif madda  (1)
    { {0xFE83, 0xFE84,      0,      0}, {-1, -1, 0xFEF7, 0xFEF8} },  // 2alif hamza  (2)
    { {0xFE85, 0xFE86,      0,      0}, {-1, -1, 0, 0} },    // waw hamza  (3)
    { {0xFE87, 0xFE88,      0,      0}, {-1, -1, 0xFEF9, 0xFEFA} },  // 2alif hamza maksoura  (4)
    { {0xFE89, 0xFE8A, 0xFE8B, 0xFE8C}, {-1, -1, 0, 0} },    // 2alif maqsoura hamza  (5)
    { {0xFE8D, 0xFE8E,      0,      0}, {-1, -1, 0xFEFB, 0xFEFC} },  // 2alif  (6)
    { {0xFE8F, 0xFE90, 0xFE91, 0xFE92}, {-1, -1, 0, 0} },    // ba2    (7)
    { {0xFE93, 0xFE94,      0,      0}, {-1, -1, 0, 0} },    // ta2 marbouta  (8)
    { {0xFE95, 0xFE96, 0xFE97, 0xFE98}, {-1, -1, 0, 0} },    // ta2    (9)
    { {0xFE99, 0xFE9A, 0xFE9B, 0xFE9C}, {-1, -1, 0, 0} },    // tha2    (10)
    { {0xFE9D, 0xFE9E, 0xFE9F, 0xFEA0}, {-1, -1, 0, 0} },    // jim    (11)
    { {0xFEA1, 0xFEA2, 0xFEA3, 0xFEA4}, {-1, -1, 0, 0} },    // 7a2    (12)
    { {0xFEA5, 0xFEA6, 0xFEA7, 0xFEA8}, {-1, -1, 0, 0} },    // kha2    (13)
    { {0xFEA9, 0xFEAA,      0,      0}, {-1, -1, 0, 0} },    // dal    (14)
    { {0xFEAB, 0xFEAC,      0,      0}, {-1, -1, 0, 0} },    // dhal    (15)
    { {0xFEAD, 0xFEAE,      0,      0}, {-1, -1, 0, 0} },    // ra2    (16)
    { {0xFEAF, 0xFEB0,      0,      0}, {-1, -1, 0, 0} },    // zayn    (17)
    { {0xFEB1, 0xFEB2, 0xFEB3, 0xFEB4}, {-1, -1, 0, 0} },    // syn    (18)
    { {0xFEB5, 0xFEB6, 0xFEB7, 0xFEB8}, {-1, -1, 0, 0} },    // shin    (19)
    { {0xFEB9, 0xFEBA, 0xFEBB, 0xFEBC}, {-1, -1, 0, 0} },    // sad    (20)
    { {0xFEBD, 0xFEBE, 0xFEBF, 0xFEC0}, {-1, -1, 0, 0} },    // dad    (21)
    { {0xFEC1, 0xFEC2, 0xFEC3, 0xFEC4}, {-1, -1, 0, 0} },    // tah    (22)
    { {0xFEC5, 0xFEC6, 0xFEC7, 0xFEC8}, {-1, -1, 0, 0} },    // thah    (23)
    { {0xFEC9, 0xFECA, 0xFECB, 0xFECC}, {-1, -1, 0, 0} },    // 3ayn    (24)
    { {0xFECD, 0xFECE, 0xFECF, 0xFED0}, {-1, -1, 0, 0} },    // ghayn  (25)
    { {     0,      0,      0,      0}, {-1, -1, 0, 0} },    //    (26)
    { {     0,      0,      0,      0}, {-1, -1, 0, 0} },    //    (27)
    { {     0,      0,      0,      0}, {-1, -1, 0, 0} },    //    (28)
    { {     0,      0,      0,      0}, {-1, -1, 0, 0} },    //    (29)
    { {     0,      0,      0,      0}, {-1, -1, 0, 0} },    //    (30)
    { {0x0640, 0x0640, 0x0640, 0x0640}, {-1, -1, 0, 0} },    // wasla  (31)
    { {0xFED1, 0xFED2, 0xFED3, 0xFED4}, {-1, -1, 0, 0} },    // fa2    (32)
    { {0xFED5, 0xFED6, 0xFED7, 0xFED8}, {-1, -1, 0, 0} },    // qaf    (33)
    { {0xFED9, 0xFEDA, 0xFEDB, 0xFEDC}, {-1, -1, 0, 0} },    // kaf    (34)
    { {0xFEDD, 0xFEDE, 0xFEDF, 0xFEE0}, {-1, -1, 0, 0} },    // lam    (35)
    { {0xFEE1, 0xFEE2, 0xFEE3, 0xFEE4}, {-1, -1, 0, 0} },    // mim    (36)
    { {0xFEE5, 0xFEE6, 0xFEE7, 0xFEE8}, {-1, -1, 0, 0} },    // noon    (37)
    { {0xFEE9, 0xFEEA, 0xFEEB, 0xFEEC}, {-1, -1, 0, 0} },    // ha2    (38)
    { {0xFEED, 0xFEEE,      0,      0}, {-1, -1, 0, 0} },    // waw    (39)
    { {0xFEEF, 0xFEF0,      0,      0}, {-1, -1, 0, 0} },    // 2alif maksoura  (40), isolated form is U+FEEF (U+FEFF is the zero width no-break space)
    { {0xFEF1, 0xFEF2, 0xFEF3, 0xFEF4}, {-1, -1, 0, 0} },    // ya2    (41)
};
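
A table like this can be sanity-checked mechanically, which helps with symptoms such as a letter silently disappearing. The sketch below assumes that every non-zero entry in the first sub-array of a row should be a letter whose NFKC normalization is the base letter of that row; `FormCheck` and `isFormOf` are hypothetical names. The check does not apply to the lam-alef ligature sub-array, since those entries normalize to two letters.

```java
import java.text.Normalizer;

/** Hypothetical checker for presentation-form table entries. */
final class FormCheck {
    /** True if shaped is a letter whose NFKC normalization is exactly the base letter. */
    static boolean isFormOf(int shaped, char base) {
        if (Character.getType(shaped) != Character.OTHER_LETTER) {
            return false; // e.g. format characters render nothing
        }
        String s = new String(Character.toChars(shaped));
        String normalized = Normalizer.normalize(s, Normalizer.Form.NFKC);
        return normalized.equals(String.valueOf(base));
    }

    public static void main(String[] args) {
        char alefMaksura = '\u0649';
        System.out.println(isFormOf(0xFEEF, alefMaksura)); // alef maksura isolated form
        System.out.println(isFormOf(0xFEF0, alefMaksura)); // alef maksura final form
        System.out.println(isFormOf(0xFEFF, alefMaksura)); // zero width no-break space
    }
}
```

U+FEEF and U+FEF0 pass the check for ى, while U+FEFF fails it: that code point is the zero width no-break space (byte order mark), which produces no visible glyph. In مستوى the ى follows a waw, which does not link forward, so the isolated entry of the alef maksoura row is the one that gets used.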
