janlelis / uniscribe

Know your Unicode ✀

Home Page: https://character.construction

License: MIT License

Ruby 100.00%
unicode codepoints characters glyphs cli-command ruby-cli debugging-tool

uniscribe's Introduction

uniscribe | Describe the Unicode

Describes Unicode characters by name and shows compositions. Supports Unicode 15.1*

  • Helps you understand how glyphs and codepoints are structured within the data
  • Gives you the names of glyphs and codepoints, which can be used for further research
  • Highlights invalid/special/blank codepoints

Uses color coding similar to that of its lower-level companion tool, unibits.

Setup

Make sure Ruby is installed and that installing gems works. Then run:

$ gem install uniscribe

Usage

Pass the string you want to inspect to uniscribe:

From CLI

$ uniscribe "test strı̈ng"

From Ruby

require "uniscribe/kernel_method"
uniscribe "test strı̈ng"

Output


0074 ├─ t		├─ LATIN SMALL LETTER T
0065 ├─ e		├─ LATIN SMALL LETTER E
0073 ├─ s		├─ LATIN SMALL LETTER S
0074 ├─ t		├─ LATIN SMALL LETTER T
0020 ├─ ] [		├─ SPACE
0073 ├─ s		├─ LATIN SMALL LETTER S
0074 ├─ t		├─ LATIN SMALL LETTER T
0072 ├─ r		├─ LATIN SMALL LETTER R
---- ├┬ ı̈		├┬ Composition
0131 │├─ ı		│├─ LATIN SMALL LETTER DOTLESS I
0308 │└─ ◌̈		│└─ COMBINING DIAERESIS
006E ├─ n		├─ LATIN SMALL LETTER N
0067 ├─ g		├─ LATIN SMALL LETTER G
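
The left half of this listing can be approximated with plain Ruby. The following is only a sketch and not uniscribe's actual implementation: `describe_rows` is a hypothetical helper, it assumes that any grapheme cluster with more than one codepoint counts as a composition, and it leaves out character names because Ruby's standard library has no name lookup.

```ruby
# Sketch only: plain Ruby, no uniscribe internals and no character names.
# describe_rows is a hypothetical helper made up for this illustration.
def describe_rows(string)
  rows = []
  string.each_grapheme_cluster do |cluster|
    codepoints = cluster.codepoints
    if codepoints.size == 1
      # A single codepoint gets one row: hex value, then the character.
      rows << format("%04X ├─ %s", codepoints.first, cluster)
    else
      # Assumption: more than one codepoint in a cluster means "composition".
      rows << "---- ├┬ #{cluster}"
      codepoints.each_with_index do |codepoint, index|
        branch = index == codepoints.size - 1 ? "└─" : "├─"
        rows << format("%04X │%s %s", codepoint, branch, [codepoint].pack("U"))
      end
    end
  end
  rows
end

puts describe_rows("str\u0131\u0308ng")
```

Unlike the real tool, this sketch prints combining marks as they are instead of placing them on a dotted circle, and it applies no color coding.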

Examples

Tamil

>> uniscribe "நகரத்தில்"

Screenshot Tamil

Thai

>> uniscribe "ม้าลายหกตัว"

Screenshot Thai

Ideographic Variations

>> uniscribe "辻󠄀㚑󠄁"

Screenshot Ideographic Variations

(the variation is not visible in the screenshot, because my system does not render it correctly)

Emoji Sequences

>> uniscribe "3️⃣🤸‍♀"

Screenshot Emoji

Lots of Combining Marks

>> uniscribe "̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍"

Screenshot Marks

Random Sequences of some Special Unicode Codepoints

>> uniscribe "\0A\u{E01D7}\x7F\r\n\u{D0000}\u{81}\u{FFF9}B\u{FFFB}🏴\u{E0061}\u{E007F}\u{10FFFF}"

Screenshot Strange

Some Blanks

>> uniscribe "­ᅠ 𝅸"

Screenshot Blanks

*Notes

Although the gem is generally up to date with Unicode 15.1, correct detection of compositions (graphemes, combined characters) depends on your Ruby version.

Run uniscribe -v to check which Unicode version your installed uniscribe supports.
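
To see what the Ruby interpreter itself brings along, the build configuration can be queried. This is a small sketch that assumes `RbConfig::CONFIG["UNICODE_VERSION"]` is populated on your Ruby; `ruby_unicode_version` and `cluster_count` are hypothetical helper names, and the probe string is the cartwheel sequence from the emoji example above.

```ruby
require "rbconfig"

# Unicode version whose data the running Ruby interpreter was built with.
# Assumption: the UNICODE_VERSION key is present in this Ruby's RbConfig.
def ruby_unicode_version
  RbConfig::CONFIG["UNICODE_VERSION"]
end

# Number of grapheme clusters Ruby's own segmentation finds in a string.
def cluster_count(string)
  string.grapheme_clusters.size
end

# Probe: PERSON DOING CARTWHEEL, ZERO WIDTH JOINER, FEMALE SIGN.
# How many clusters this splits into depends on the Ruby in use.
probe = "\u{1F938}\u200D\u2640"

puts "Ruby #{RUBY_VERSION} uses Unicode #{ruby_unicode_version}"
puts "Clusters in the cartwheel sequence: #{cluster_count(probe)}"
```

If the probe reports more than one cluster, the running Ruby probably predates the segmentation rules for that sequence, and compositions of that kind may show up as separate characters.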

Also see

Copyright (C) 2017-2023 Jan Lelis https://janlelis.com. Released under the MIT license.

uniscribe's People

Contributors

ctrlcctrlv, janlelis, mathiasbynens

uniscribe's Issues

How are compositions detected?

Hey!

This is an amazing tool! I'm wondering how I could detect in my own code whether a character is a composition, so that I can treat it as a single character rather than as several. I found a few rules, but new ones keep turning up. Is there a general-purpose way of finding out?
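
One general-purpose option in Ruby, offered as a sketch and not as a description of how uniscribe works internally, is to rely on grapheme cluster segmentation instead of collecting rules by hand. `String#grapheme_clusters` and the regex escape `\X` both split a string into user-perceived characters. The helper names `perceived_characters` and `composed?` are made up for this example.

```ruby
# Sketch only: uses Ruby's built-in grapheme cluster segmentation, whose
# rules follow the Unicode version of the running Ruby.
def perceived_characters(string)
  string.grapheme_clusters
end

# Assumption: a cluster made of several codepoints counts as "composed".
def composed?(cluster)
  cluster.codepoints.size > 1
end

text = "str\u0131\u0308ng" # dotless i followed by a combining diaeresis
chars = perceived_characters(text)

puts text.length                      # number of codepoints
puts chars.length                     # number of user-perceived characters
puts chars.count { |c| composed?(c) } # how many of them are composed
puts chars.reverse.join               # reversing keeps the mark on its base
puts text.scan(/\X/) == chars         # \X matches one cluster at a time
```

One caveat: a "\r\n" pair also forms a single cluster of two codepoints, so `composed?` as written would flag it too. Filter such cases if they matter for your use.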
