Grammaticus

Grammaticus is a grammar engine that allows users to rename nouns while keeping content grammatically correct.

Why did we build Grammaticus?

At Salesforce, we have a feature called "Rename Tabs & Labels" that lets administrators change the names of standard parts of our product (like "Account"). However, the application often needs to display such a name as part of a phrase, like Open an Account. If you renamed Account to Client, the result would look both strange and grammatically incorrect: Open an Client. To make these renames integrate naturally into an application, we developed a custom label file format. To ease the burden on translators (and to limit the memory used for translations), the label file format is XML, split into sections and keys. We use XML tags to represent the nouns, adjectives, and articles, such as Open <a/> <Account/> for the label above.
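To make the problem concrete, here is a minimal sketch of what resolving Open <a/> <Account/> involves for English. It is a toy model, not the Grammaticus API: the class name, the render method, and the startsWithVowelSound flag are all hypothetical, and the real engine handles far more than a/an.

```java
/** Toy model of article agreement for one English label; not the Grammaticus API. */
public class ArticleSketch {

    /**
     * Fills the <a/> and <Account/> placeholders of a label.
     * startsWithVowelSound is a hypothetical per-noun flag supplied by whoever renames the noun.
     */
    static String render(String label, String nounName, boolean startsWithVowelSound) {
        String article = startsWithVowelSound ? "an" : "a";
        return label.replace("<a/>", article).replace("<Account/>", nounName);
    }

    public static void main(String[] args) {
        String label = "Open <a/> <Account/>";
        System.out.println(render(label, "Account", true));  // Open an Account
        System.out.println(render(label, "Client", false));  // Open a Client
    }
}
```

The point of the sketch is that the article is chosen from a property of the noun rather than hard-coded in the label, so renaming the noun cannot produce Open an Client.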

Grammaticus prevents your application from feeling foreign, and lets your application extend to nouns defined by your customers. Salesforce uses this feature extensively with Custom Objects, allowing standard screens to say All My Puppies through the label <All/> <My/> <Entity entity="0"/>. It also supports plural rules to correctly handle Create 1 house vs. Create 2 houses, as in this example:

<param name="num_records_entity"><plural num="0"><when val="one">There is {0} <entity entity="0"/></when>There are {0} <entities entity="0"/></plural></param>
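The selection logic in the plural label above can be sketched in plain Java. This is an illustration only: the category function is an English-only stand-in for a real plural-rules lookup (the library uses icu4j for this), and the class and method names are hypothetical.

```java
/** Toy model of the <plural>/<when val="one"> selection; not the Grammaticus API. */
public class PluralSketch {

    /** English-only stand-in for a plural-rules lookup: "one" for exactly 1, otherwise "other". */
    static String category(long count) {
        return count == 1 ? "one" : "other";
    }

    /** Picks the "one" branch or the default branch, then fills {0} and the noun form. */
    static String render(long count, String singular, String plural) {
        boolean one = category(count).equals("one");
        String template = one ? "There is {0} {1}" : "There are {0} {1}";
        return template
                .replace("{0}", Long.toString(count))
                .replace("{1}", one ? singular : plural);
    }

    public static void main(String[] args) {
        System.out.println(render(1, "house", "houses")); // There is 1 house
        System.out.println(render(2, "house", "houses")); // There are 2 houses
    }
}
```

Other languages have more categories than "one" and "other", which is why a real implementation delegates the category decision to locale-aware plural rules instead of a count == 1 check.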

Grammaticus encodes the article, noun, and adjective declensions for over 30 languages, and supports programmatic use of nouns through the Renameable interface. The default label files included in /src/test provide a set of adjectives and articles already translated by Salesforce, along with some sample nouns.

For use in a browser or Node.js, there is a beta offline engine that runs in JavaScript. The grammaticus.js file contains a module with base grammar rules that each declension then overrides. The label files can be transferred to the client as JSON and cached there. A future version will have automatic integration with web components. See OfflineProcessingTest for examples.

Disclaimer: This library requires developers and localizers to ensure that names of “renameable nouns” aren’t hard-coded, and that string concatenation isn’t used for renameable objects. It also requires your users to provide information about the nouns they are renaming, including gender and various language-specific grammar rules.


The files for translation are split into three different types:

  • names.xml: The dictionary of all the nouns that your customers are allowed to rename, listing each grammatical form of the noun in a given language.
  • adjectives.xml: The dictionary of all of the adjectives and articles you may need to inflect for your users.
  • labels.xml (and imports): The labels themselves.

You can load labels from a file system, a jar file, or a known URL. Some helper classes for managing IniFiles are included as well; they help with managing sensitive information and censoring it from log files.

Some default behaviors, such as the list of supported languages, can be overridden by specifying an i18n.properties file in /com/force/i18n of your jar file. Specifically, you should override the LanguageProvider to return only the set of languages supported by your application.


How to build:

Grammaticus uses Maven for its build lifecycle. Run the following commands to pull the source code and build the jar package in your development environment:

git clone https://github.com/salesforce/grammaticus
cd grammaticus
mvn package

Known Limitations:

  • Verbs are not part of the grammar engine. Semitic languages inflect verbs based on the gender of the subject, so a label may become grammatically incorrect when a renamed noun changes gender. You can fix these issues by using a gender tag, such as <gender><when val="m">MaleVerb</when>FemaleVerb</gender>.
  • Bantu language support (Swahili, Xhosa & Zulu) is in beta.
  • Offline JavaScript support is in beta.
  • Partitive articles are not available.
  • Many incomplete or unsupported declensions are provided for certain languages, because Salesforce doesn't translate into them. See UnsupportedLanguageDeclension.java.
  • US English is the base language. Specifying a different base language is supported, but hasn't been tested.
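The gender tag shown in the first limitation above behaves like a lookup with a default: a when branch applies if its val matches the noun's gender, and the trailing text applies otherwise. A minimal sketch of that selection, with hypothetical class and method names and assuming single-letter gender codes such as "m" and "f":

```java
import java.util.Map;

/** Toy model of <gender><when val="m">...</when>default</gender>; not the Grammaticus API. */
public class GenderSketch {

    /**
     * Returns the text of the when-branch whose val matches genderCode,
     * or defaultText when no branch matches.
     */
    static String select(Map<String, String> whenBranches, String defaultText, String genderCode) {
        return whenBranches.getOrDefault(genderCode, defaultText);
    }

    public static void main(String[] args) {
        Map<String, String> branches = Map.of("m", "MaleVerb");
        System.out.println(select(branches, "FemaleVerb", "m")); // MaleVerb
        System.out.println(select(branches, "FemaleVerb", "f")); // FemaleVerb
    }
}
```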

1.1 Improvements

  • New Languages: Afrikaans, Burmese, Gujarati, Kannada, Malayalam, Maori, Marathi, Swahili, Telugu, Xhosa, Zulu.
  • Beta offline JavaScript label rendering support using grammaticus.js.
  • PluralRules support through the <plural/> tag (now depends on icu4j).
  • Semitic verb support through the <gender/> tag.
  • Classifier/counting word support on nouns with the <counter/> tag.
  • Support for Korean postpositions through the endsWith tag, along with some defaults.
  • Dual number support for Arabic and Slovenian.

1.2 Improvements

  • New Languages: Amharic, Khmer, Samoan, Hawaiian, Kazakh, Haitian Creole.
  • Support for ICU in BaseLocalizer for date and number formatting as an optional dependency.
    Include icu4j-localespi as a dependency and call BaseLocalizer.setLocaleFormatFixer(loc->BaseLocalizer.getICUFormatFixer())
  • Support for graal.js in JavaScript testing.
  • Reduced logging for invalid labels.
