janlelis / unibits

Visualize different Unicode encodings in the terminal

Home Page: https://character.construction

License: MIT License

Ruby 100.00%
unicode ascii codepoints terminal debugging-tool utf-8 utf-16 utf-32 cli-command ruby-cli

unibits | Reveal the Unicode

Ruby library and CLI command that visualizes various Unicode and ASCII/single byte encodings in the terminal:

  • Makes analyzing encodings easier
  • Helps you with debugging strings
  • Highlights invalid/special/blank bytes/characters/codepoints
  • Supports UTF-8, UTF-16LE/UTF-16BE, UTF-32LE/UTF-32BE, ISO-8859-X, Windows-125X, IBMX, CP85X, macX, TIS-620/Windows-874, KOI8-R/KOI8-U, 7-Bit ASCII/GB1988, and arbitrary BINARY data

Color Coding

Each byte of the given string is highlighted using the following mechanism (characters -> codepoints):

  • Red for invalid bytes
  • Light blue for blanks
  • Blue for control characters
  • Pink for non-control formatting characters
  • Green for marks (Unicode only)
  • Orange for unassigned codepoints
  • Lighter orange for unassigned codepoints that are also ignorable
  • Random color for all other codepoints

The same colors are used in the higher-level companion tool uniscribe.
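To make the categories above more concrete, here is a minimal sketch of how a single UTF-8 character could be sorted into them using Ruby's Unicode property regexes. This is not the implementation unibits uses: the method name `classify`, the `COLORS` table, and the order of the checks are assumptions made for illustration, and the "ignorable" distinction is left out.

```ruby
# Hypothetical helper, not part of unibits: maps one UTF-8 character
# to a rough category using Unicode general-category properties.
COLORS = {
  invalid:    'red',
  blank:      'light blue',
  control:    'blue',
  format:     'pink',
  mark:       'green',
  unassigned: 'orange',
  other:      'random'
}.freeze

def classify(char)
  # Invalid byte sequences cannot be matched against a regex, so check first.
  return :invalid unless char.valid_encoding?

  case char
  when /\p{Zs}/ then :blank       # space separators
  when /\p{Cc}/ then :control     # control characters
  when /\p{Cf}/ then :format      # non-control formatting characters
  when /\p{M}/  then :mark        # combining marks
  when /\p{Cn}/ then :unassigned  # unassigned codepoints
  else :other
  end
end

"a \u0301\u200D".each_char do |char|
  category = classify(char)
  puts format('U+%04X %-10s %s', char.ord, category, COLORS[category])
end
```

The sketch assumes UTF-8 input; strings in other encodings would need to be converted before matching.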

Setup

Make sure Ruby is installed and that installing gems works properly. Then run:

$ gem install unibits

Usage

Pass the string you want to debug to unibits:

From CLI

$ unibits "🌫 Idiosyncrätic ℜսᖯʏ"

From Ruby

require 'unibits/kernel_method'
unibits "🌫 Idiosyncrätic ℜսᖯʏ"

Advanced Options

unibits accepts the following options:

  • encoding (e): The encoding of the given string (uses the string's default encoding if none given)
  • convert (c): An encoding the string should be converted to before visualizing it
  • stats: Whether to show a short stats header (default: true); deactivate it on the CLI with --no-stats
  • wide-ambiguous: Treat characters of ambiguous width as 2 spaces instead of 1
  • width (w): Set a custom column width; if not set, unibits retrieves it from the terminal or falls back to 80
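The difference between encoding and convert can be sketched with plain Ruby, under the assumption that encoding only relabels the existing bytes (like `String#force_encoding`) while convert transcodes them (like `String#encode`). The helper `prepare` below is hypothetical and only illustrates that distinction; it is not unibits's API.

```ruby
# Hypothetical helper, not part of unibits: shows how "encoding" and
# "convert" could map onto Ruby's core String methods.
def prepare(string, encoding: nil, convert: nil)
  result = string.dup
  # "encoding": reinterpret the same bytes under another encoding label.
  result.force_encoding(encoding) if encoding
  # "convert": transcode, which generally changes the bytes.
  result = result.encode(convert) if convert
  result
end

original = "\u00E4" # "ä"

relabeled = prepare(original, encoding: 'binary')
puts relabeled.bytes.inspect   # bytes are unchanged, only the label differs

transcoded = prepare(original, encoding: 'utf-8', convert: 'utf-16le')
puts transcoded.bytes.inspect  # bytes now follow UTF-16LE
```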

Examples of Valid Encodings

UTF-8

CLI: $ unibits -e utf-8 -c utf-8 "🌫 Idiosyncrätic ℜսᖯʏ"

Ruby: unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'utf-8', convert: 'utf-8'

Screenshot UTF-8

UTF-16LE

CLI: $ unibits -e utf-8 -c utf-16le "🌫 Idiosyncrätic ℜսᖯʏ"

Ruby: unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'utf-8', convert: 'utf-16le'

Screenshot UTF-16LE

UTF-32BE

CLI: $ unibits -e utf-8 -c utf-32be "🌫 Idiosyncrätic ℜսᖯʏ"

Ruby: unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'utf-8', convert: 'utf-32be'

Screenshot UTF-32BE
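To see why the same string looks so different across these encodings, the following sketch prints the raw bytes of a single character, the fog emoji U+1F32B from the example string, in each of them. It uses only Ruby core methods; `hex_bytes` is a hypothetical helper name and not something unibits provides.

```ruby
# Hypothetical helper: transcode a string and list its bytes as hex.
def hex_bytes(string, target_encoding)
  string.encode(target_encoding).bytes.map { |byte| format('%02X', byte) }.join(' ')
end

fog = "\u{1F32B}" # the first character of the example string

%w[UTF-8 UTF-16LE UTF-32BE].each do |name|
  puts "#{name.ljust(8)} #{hex_bytes(fog, name)}"
end
```

UTF-8 needs four bytes for this codepoint, UTF-16LE stores it as a surrogate pair with the low byte of each unit first, and UTF-32BE always uses four bytes in big-endian order.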

BINARY

CLI: $ unibits -e binary "🌫 Idiosyncrätic ℜսᖯʏ"

Ruby: unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'binary'

Screenshot BINARY

ASCII

CLI: $ unibits -e utf-8 -c ascii "ascii"

Ruby: unibits "ascii", encoding: 'utf-8', convert: 'ascii'

Screenshot ASCII

Examples of Invalid Encodings

UTF-8

Example in Ruby: unibits "unexpected \x80 | not enough \xF0\x9F\x8C | overlong \xE0\x81\x81 | surrogate \xED\xA0\x80 | too large \xF5\x8F\xBF\xBF"

Screenshot invalid UTF-8
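If you want to locate such invalid bytes programmatically rather than visually, a minimal sketch with Ruby core methods could look like this. The method name `invalid_byte_offsets` is hypothetical, and the approach (walking the string character by character and checking `valid_encoding?`) is one possible technique, not necessarily how unibits detects them.

```ruby
# Hypothetical helper: returns the byte offsets that belong to
# invalid sequences in the string's current encoding.
def invalid_byte_offsets(string)
  offsets = []
  position = 0
  string.each_char do |char|
    # each_char yields malformed bytes as separate invalid "characters".
    unless char.valid_encoding?
      offsets.concat((position...(position + char.bytesize)).to_a)
    end
    position += char.bytesize
  end
  offsets
end

broken = "unexpected \x80 here"
puts broken.valid_encoding?              # false
puts invalid_byte_offsets(broken).inspect
puts broken.scrub('?')                   # replaces invalid bytes with "?"
```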

ASCII

Example in Ruby: unibits "🌫 Idiosyncrätic ℜսᖯʏ", encoding: 'ascii'

Screenshot invalid ASCII
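The reason this example is invalid is that 7-bit ASCII only allows byte values up to 0x7F, while the UTF-8 bytes of non-ASCII characters are all above that. A short sketch using Ruby core methods, with the hypothetical helper name `non_ascii_bytes`:

```ruby
# Hypothetical helper: lists the bytes that cannot be valid 7-bit ASCII.
def non_ascii_bytes(string)
  string.b.bytes.select { |byte| byte > 0x7F }
end

text = "Idiosyncr\u00E4tic"

# Relabel the UTF-8 bytes as US-ASCII without changing them.
as_ascii = text.dup.force_encoding('US-ASCII')

puts as_ascii.valid_encoding?
puts non_ascii_bytes(text).map { |byte| format('%02X', byte) }.join(' ')
```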

Lots of thanks to @damienklinnert for the motivation and inspiration required to build this! 🎆

Copyright (C) 2017-2023 Jan Lelis https://janlelis.com. Released under the MIT license.
