
Line breaking rules (alreq issue, open)

xfq commented on June 10, 2024
Line breaking rules

from alreq.

Comments (4)

asmusf commented on June 10, 2024

Those requirements can be generalized to:

  1. No line should begin with sentence, clause or phrase-ending punctuation.
  2. No line should end with sentence, clause or phrase-starting punctuation. (*)
  3. No paired punctuation should appear on a line that does not contain some of the contents enclosed by the pair.

(*) This rule applies, for example, to the Spanish use of inverted question and exclamation marks. It is easier to treat them as an anti-parallel case to sentence-ending punctuation than to give the regular question and exclamation marks a dual nature by sometimes treating them as part of a pair.

These rules can then be combined with those that govern whether and how words themselves can be broken.
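
The three generalized rules above can be sketched as a simple check over already-broken lines. This is a minimal illustration, not a proposal for alreq: the character sets, the `violations` helper, and the reading of rule 3 (an opening mark stranded at the end of a line, or a closing mark stranded at the start) are all assumptions made for the example, and which marks belong in each set is exactly the open question in this thread.

```python
# Illustrative character sets only (assumptions, not a reviewed list).
# Arabic comma, semicolon and question mark are written as escapes.
CLAUSE_END = set(".,;:!?" + "\u060C\u061B\u061F")   # rule 1: must not start a line
CLAUSE_START = set("\u00BF\u00A1")                   # rule 2: inverted ? and !
PAIRS = {"(": ")", "[": "]", "{": "}", "\u00AB": "\u00BB"}  # rule 3: opener -> closer
CLOSERS = set(PAIRS.values())


def violations(lines):
    """Return (line_index, rule_number) for each rule a line breaks."""
    found = []
    for index, line in enumerate(lines):
        text = line.strip()
        if not text:
            continue
        if text[0] in CLAUSE_END:
            found.append((index, 1))
        if text[-1] in CLAUSE_START:
            found.append((index, 2))
        # Simplified rule 3: an opener at the very end of a line, or a closer
        # at the very start, shares the line with none of the enclosed content.
        if text[-1] in PAIRS or text[0] in CLOSERS:
            found.append((index, 3))
    return found


sample = ["Hello (", "world) again", ", and more", "\u00BF", "Fine line."]
print(violations(sample))
```

A check like this only reports bad breaks; a line breaker would instead use the same sets to reject candidate break positions before choosing one.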


xfq commented on June 10, 2024

A generalized summary would help, but I think we also need to write the requirements clearly; otherwise implementers won't know which punctuation marks are starting or ending punctuation (I don't know Arabic, but for Chinese, I don't know whether connector marks or interpuncts count as "ending punctuation"). Although there is some data in UAX #14 and CLDR/ICU, it is not necessarily accurate, and we can make it clearer in the requirements.


r12a commented on June 10, 2024

I wonder if Arabic/Persian has something similar. If so, I think we should document it (perhaps in § 4.1 Line breaking, see similar sections in clreq and jlreq).

Indeed, that's one of the more obvious sections for which the task force didn't yet provide detail.

By the way, should we document requirements in other languages using the Arabic script? For example, Arabic-derived Uyghur/Uighur requires marking of all vowels and uses hyphenation, which is different from Arabic and Persian.

Certainly, but not in this document, whose scope was limited in the group charter to Arabic and Persian because they were similar and the group participants were not familiar with Uighur.

I'd certainly be interested in getting hold of a copy (in English) of the standard you mentioned, so that we can apply the information it contains to our language enablement program.


r12a commented on June 10, 2024

Actually, what needs to be said here is a little more complicated than listing characters that should or shouldn't appear at one particular end of a line. Fwiw, at https://r12a.github.io/scripts/arabic/#linebreak_props you can find the default Unicode line-break properties for the (non-ASCII) characters that I think are needed for Arabic (not Persian) language support. That list differs slightly from the one in alreq, which was more closely tied to CLDR. It's possible that tailoring needs to be applied to the list for Arabic language text.
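
As a rough sketch of what "default properties plus tailoring" could look like in code, the snippet below parses data written in the `code..code;CLASS # comment` layout used by the Unicode `LineBreak.txt` file and then overlays per-language overrides. The helper names `parse_line_break_data` and `apply_tailoring` are hypothetical, and the `SAMPLE` values are a hand-written excerpt that should be checked against the published Unicode data before being relied on; the override shown is purely illustrative and is not a recommendation for Arabic.

```python
def parse_line_break_data(text):
    """Map code point -> line-break class from LineBreak.txt-style lines."""
    classes = {}
    for raw in text.splitlines():
        entry = raw.split("#", 1)[0].strip()      # drop comments
        if not entry:
            continue
        span, lb_class = (part.strip() for part in entry.split(";", 1))
        first, _, last = span.partition("..")     # single code point or range
        start = int(first, 16)
        end = int(last, 16) if last else start
        for code_point in range(start, end + 1):
            classes[code_point] = lb_class
    return classes


def apply_tailoring(defaults, overrides):
    """Return a new table where language-specific overrides win."""
    return {**defaults, **overrides}


# Hand-written excerpt (assumption): verify against the current LineBreak.txt.
SAMPLE = """\
060C..060D;IS # ARABIC COMMA..ARABIC DATE SEPARATOR
061B;EX # ARABIC SEMICOLON
061F;EX # ARABIC QUESTION MARK
06D4;EX # ARABIC FULL STOP
"""

defaults = parse_line_break_data(SAMPLE)
# Hypothetical tailoring, for illustration only.
tailored = apply_tailoring(defaults, {0x060C: "CL"})
print(defaults[0x060C], tailored[0x060C])
```

Keeping the defaults and the overrides in separate tables makes it easy to see, and to document, exactly where a language's requirements depart from the Unicode defaults.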

