
Comments (9)

nomeata commented on July 30, 2024

To get GitHub checklists, write:

* [ ]
* [x]

from motoko-base.

ehoogerbeets commented on July 30, 2024

toLower and toUpper are locale dependent and cannot be performed on a single character in isolation. For example, in Greek, the capital sigma Σ lowercases to σ in the middle of a word, but to a different Unicode character, ς, at the end of a word. In Serbian, the nj digraph character ǌ uppercases to Ǌ (NJ), except in title case, where it becomes ǋ (Nj). In German, the eszett character ß uppercases to the two characters SS. In English, the small letter i (with the dot) uppercases to the capital letter I (without the dot), but in Turkish, the small letter i uppercases to the capital letter İ (with a dot!) and the small letter ı (without a dot) uppercases to I (also without a dot).

I would recommend explicitly NOT putting a toLower and toUpper method on characters or even on text. Instead make a case folding class that takes a locale, and has methods for doing toUpper and toLower on whole texts and returns whole texts.
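To make the examples above concrete, here is a small sketch in Python (not Motoko, chosen only because its string methods are easy to try). It assumes Python's built-in `str.upper`, `str.lower`, and `str.title`, which apply the locale-independent Unicode default case mappings, so it shows the length and context effects but cannot show the Turkish rules.

```python
# Illustration only: these are Python's locale-independent default mappings,
# not a proposal for the Motoko API.

# One character can map to several: German eszett uppercases to "SS".
eszett_upper = "ß".upper()

# The result can depend on position: a capital sigma at the end of a word
# lowercases to the final form "ς", elsewhere to "σ".
odos_lower = "ΟΔΟΣ".lower()
lone_sigma_lower = "Σ".lower()

# Upper case and title case can differ: the digraph ǌ (U+01CC) has the
# uppercase form Ǌ (U+01CA) and a separate titlecase form ǋ (U+01CB).
nj_upper = "\u01cc".upper()
nj_title = "\u01cc".title()

# Without a locale, "i" uppercases to the dotless "I"; Turkish needs "İ".
i_upper = "i".upper()

print(len(eszett_upper), odos_lower, lone_sigma_lower, nj_upper, nj_title, i_upper)
```

A function of type Char -> Char cannot express the first two cases at all, which is the argument for operating on whole texts.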


MayurSMahajan commented on July 30, 2024

Hey @crusso! I would like to contribute to this issue. I am a beginner here, but I have experience in other programming languages, so I have an idea of how to code the six methods you set as tasks. I am a little confused by the term "needs a prim". I think that by "table driven" you mean a method that looks up the matching value in a table and adds only a little logic around it, rather than implementing the complete logic. Can you help me get started?


MayurSMahajan commented on July 30, 2024

@crusso @nomeata @ehoogerbeets please at least give me a reply; I have been waiting for 11 days now!


chenyan-dfinity commented on July 30, 2024

Hi Mayur. The Unicode tables for these operations are huge and evolve with the Unicode spec. Putting these tables directly in a library would dramatically increase code size and memory usage. A better option is to store the tables in the runtime or in a dynamically linked module, which requires compiler and system-level support. That can be a difficult task if you are not familiar with the internals of the compiler and the IC execution environment.

On the other hand, if you want to implement these functions for ASCII characters only, that can be pure library code.


MayurSMahajan commented on July 30, 2024

Okay @chenyan-dfinity, got it. I will work on ASCII characters only. Thank you for responding!


MayurSMahajan commented on July 30, 2024

I checked the Char.mo library, and the following tasks are actually already completed:

* isNumeric() (implemented as isDigit())
* toLower
* toUpper

I ask @crusso to mark them as done in the original issue.


crusso commented on July 30, 2024

@MayurSMahajan Both toLower and toUpper are implemented, but they are private and not exposed, because a proper Unicode implementation would produce multiple chars, not a single char.

isNumeric() is true for more than just 0..9, so it is really a separate function that, again, needs Unicode tables.

On the other hand, isDigit() only returns true for 0..9. Interestingly, Rust's char::is_digit takes a radix and, for radices above 10, also accepts 'a'-'z' and 'A'-'Z' as digits. Perhaps we should too.
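The three notions of "digit" in this comment can be sketched side by side. The following is a minimal illustration in Python, not Motoko; the function names `is_ascii_digit`, `is_numeric`, and `is_digit_radix` are hypothetical, and the table lookup relies on Python's standard `unicodedata` module in place of the Unicode tables that Motoko would need. Each function assumes its argument is a single character.

```python
import unicodedata


def is_ascii_digit(c: str) -> bool:
    """True only for '0'..'9', like the isDigit() described above."""
    return "0" <= c <= "9"


def is_numeric(c: str) -> bool:
    """True for any character with a Unicode numeric value (a table lookup)."""
    return unicodedata.numeric(c, None) is not None


def is_digit_radix(c: str, radix: int) -> bool:
    """Radix-aware test, similar in spirit to Rust's char::is_digit(radix)."""
    if not 2 <= radix <= 36:
        raise ValueError("radix must be in 2..36")
    if "0" <= c <= "9":
        value = ord(c) - ord("0")
    elif "a" <= c <= "z":
        value = ord(c) - ord("a") + 10
    elif "A" <= c <= "Z":
        value = ord(c) - ord("A") + 10
    else:
        return False
    # A character is a digit in this radix only if its value fits below it.
    return value < radix
```

Only `is_numeric` depends on Unicode data; the other two are pure arithmetic on code points, which is why they can live in a library without table support.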


chenyan-dfinity commented on July 30, 2024

We could expose toLowerAscii and toUpperAscii to support only ASCII characters.
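What such ASCII-only functions might do can be sketched as follows. This is a hedged illustration in Python rather than Motoko; `to_lower_ascii` and `to_upper_ascii` are hypothetical stand-ins for the proposed names, and the assumed behavior is that every character outside A-Z or a-z is returned unchanged.

```python
# Distance between an ASCII lowercase letter and its uppercase form (32).
ASCII_CASE_OFFSET = ord("a") - ord("A")


def to_lower_ascii(c: str) -> str:
    """Lowercase a single ASCII letter; leave every other character as-is."""
    if "A" <= c <= "Z":
        return chr(ord(c) + ASCII_CASE_OFFSET)
    return c


def to_upper_ascii(c: str) -> str:
    """Uppercase a single ASCII letter; leave every other character as-is."""
    if "a" <= c <= "z":
        return chr(ord(c) - ASCII_CASE_OFFSET)
    return c
```

Because the result is always exactly one character, this variant fits a Char -> Char signature and needs no tables. Non-ASCII letters such as ß or Σ pass through untouched.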

