wrap-segments

Wrap lines at Unicode word boundaries, using Intl.Segmenter.

Existing wrapping libraries tend to work very well on plain 7-bit ASCII text. However, much of the world's text is not ASCII, and it needs to be wrapped too.

You might want to turn this:

f̵̩̣̺ö̶̧̧̢o̶̥̩̗̹ ̶̨̢͔̳b̷̧̥͍̥a̷̛̦͓̜r̴̡͕̳̪

into this:

f̵̩̣̺ö̶̧̧̢o̶̥̩̗̹ ̶̨̢͔̳
b̷̧̥͍̥a̷̛̦͓̜r̴̡͕̳̪

by wrapping every 4 grapheme clusters.
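
A width measured in grapheme clusters differs from a width measured in string length as soon as combining marks are involved. The following is a minimal sketch of counting grapheme clusters with the standard Intl.Segmenter API; the helper name countGraphemes is hypothetical and is not part of this library.

```javascript
// Illustrative sketch only: countGraphemes is a hypothetical helper,
// not an export of wrap-segments.
const graphemeSegmenter = new Intl.Segmenter(undefined, {granularity: 'grapheme'})

// Count user-perceived characters (grapheme clusters) instead of
// UTF-16 code units.
function countGraphemes(text) {
  let count = 0
  for (const _segment of graphemeSegmenter.segment(text)) {
    count++
  }
  return count
}

const sample = 'cafe\u0301' // "café" written with a combining acute accent
console.log(sample.length) // 5 (UTF-16 code units)
console.log(countGraphemes(sample)) // 4 (grapheme clusters)
```

Each letter in the heavily decorated example above is followed by several combining marks, so its string length is far larger than its width in grapheme clusters.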

Installation

npm install wrap-segments

API

None of the options are required, and you can omit the options object entirely to accept all of the defaults. The example below shows the default options:

import {SegmentWrapper} from 'wrap-segments'

const w = new SegmentWrapper({
  escape: identityTransform, // Escape inputs before processing
  indent: '', // Can be a string or number
  indentChar: ' ', // If indent is a number, repeat this string that many times
  indentEmpty: false, // If the input is empty, still indent?
  indentFirst: true, // Indent the first line?
  isEmpty: /^\s*$/u, // Is a given text segment empty?  Only applies to non-wordLike segments.
  isNewline: /((?![\r\n\v\f\x85\u2028\u2029])\s)*[\r\n\v\f\x85\u2028\u2029]+(\s*)/gu, // Replace newlines matching this with newlineReplacement
  locale: DEFAULT_LOCALE, // Default is calculated by the JS runtime
  newline: '\n', // Insert this at the end of every line
  newlineReplacement: ' ', // What to replace isNewline with
  trim: true, // Trim whitespace from the end of the input
  width: 80, // In grapheme clusters, *including* indent
})

const wrapped = w.wrap('Lorem Ipsum...')
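
To make two of these options more concrete, here is a rough sketch of what the comments above describe: replacing matches of isNewline with newlineReplacement, and turning a numeric indent into a string by repeating indentChar. The regular expression is copied from the defaults shown above; the helper names normalizeNewlines and resolveIndent are hypothetical, and the library's internals may differ.

```javascript
// Illustrative sketch only: both helpers are hypothetical and are not
// the library's actual implementation.

// Default isNewline pattern, copied from the options above.
const isNewline = /((?![\r\n\v\f\x85\u2028\u2029])\s)*[\r\n\v\f\x85\u2028\u2029]+(\s*)/gu
const newlineReplacement = ' '

// Collapse each run of newlines, together with the whitespace around
// it, into the replacement string.
function normalizeNewlines(text) {
  return text.replace(isNewline, newlineReplacement)
}

// Accept either a ready-made indent string or a repeat count.
function resolveIndent(indent, indentChar = ' ') {
  return typeof indent === 'number' ? indentChar.repeat(indent) : indent
}

console.log(normalizeNewlines('first line  \n   second line')) // "first line second line"
console.log(normalizeNewlines('one\r\n\r\ntwo')) // "one two"
console.log(JSON.stringify(resolveIndent(4))) // "    "
console.log(JSON.stringify(resolveIndent('> '))) // "> "
```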

Generated API documentation is available.

Command line

A CLI is available as a separate package.

Caveats

  • This hasn't been tested with enough languages. Please submit an issue or PR if you speak Korean, a language that uses the Devanagari script, a language that uses a right-to-left script such as Arabic or Hebrew, etc.
  • This does not implement the full line breaking algorithm from Unicode TR14. I'm hoping that the Intl.Segmenter word boundaries are "close enough" for most cases. It's hard to get access to all of the needed properties from the JS runtime without including version-specific Unicode data, which I don't want to do. However, there are some rules in that algorithm that would be worth adding, with some careful thought.
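
For readers who want to see what the word boundaries mentioned above look like, the sketch below lists the segments that Intl.Segmenter reports at word granularity, along with the isWordLike flag. The helper name describeSegments is hypothetical, and this only inspects the runtime's segmentation; it does not reproduce how this library chooses break points.

```javascript
// Illustrative sketch only: describeSegments is a hypothetical helper,
// not an export of wrap-segments.
const wordSegmenter = new Intl.Segmenter('en', {granularity: 'word'})

// Return each segment with its start offset and whether the runtime
// considers it word-like (as opposed to punctuation or whitespace).
function describeSegments(text) {
  return Array.from(
    wordSegmenter.segment(text),
    ({segment, index, isWordLike}) => ({segment, index, isWordLike})
  )
}

const segments = describeSegments('Hello, world!')
const words = segments.filter(s => s.isWordLike).map(s => s.segment)
console.log(words) // [ 'Hello', 'world' ]
```

The segments always concatenate back to the original input, so every boundary between two segments is a candidate position for a line break.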
