
tricinel / highlight-words


Split a piece of text into multiple chunks based on a search query, allowing you to highlight the matches afterwards.

License: MIT License

JavaScript 5.92% TypeScript 92.70% Shell 1.38%
javascript typescript

highlight-words's Introduction

Hey there!

Ready to ship accessible websites without long-term contracts, proposals or payroll expenses, for a fixed monthly price?

Book 1:1 👉 Start shipping

I'm Bogdan!

I'm the accessibility consultant who helps product owners ship accessible websites without blocking ongoing work.

I have over ten years of experience working in the education and healthcare sectors, with expertise in inclusive design, technology and accessibility on the web.

Imagine you didn't have to worry about:

  • ✅ critical accessibility issues in production,
  • ✅ cryptic issues in your backlog,
  • ✅ how long it takes your team to fix the accessibility issues in it,
  • ✅ new accessibility issues a release introduces,
  • ✅ costly redesigns and rework, or
  • ✅ what all the fancy acronyms mean.

I do this through a monthly subscription that allows us to work together based on the level of support your team needs. I believe this is the most effective way to make progress quickly. By eliminating the friction of changing priorities, debating scope and exchanging proposals, contracts and statements of work, I can help you consistently ship an accessible product.

Book 1:1 👉 Start shipping

Grab a free accessibility resource

  • Accessibility Checklists. A set of free checklists as PDFs for product owners, designers, developers and testers to nudge you towards a more accessible outcome by weaving accessibility into your software development lifecycle.
  • Six Days to an Accessible Website. A free six-day email course to teach you how to get the upper hand on 1 million websites by fixing the most common accessibility issues on yours.

Or use one of my open source packages:

  • Accessibility CSS Reset. A style reset specifically aimed at accessibility that embraces modern CSS features to help start your project without accessibility errors.
  • WCAG2.2 Search Alfred Workflow. Alfred workflow to search the Web Content Accessibility Guidelines (WCAG).
  • Highlight Words. Split a piece of text into chunks given a search query, by separating matches from non-matches, allowing you to highlight the matches, visually or otherwise, in your app.

Languages and Tools

TypeScript, JavaScript, HTML, CSS, React, Svelte, Node.js, Remix, Tailwind


highlight-words's People

Contributors

bertdeblock, dependabot[bot], tricinel


highlight-words's Issues

ESM Support

Hi @tricinel !

Just upgraded to Node.js ESM ("type": "module" in package.json) and highlight-words is breaking 😨

import highlightWords from 'highlight-words';

// 💥 Error on line below: highlightWords is not a function
highlightWords({
  text: 'The quick brown fox jumped over the lazy dog',
  query: 'over'
});

Workaround

import highlightWords from 'highlight-words';

highlightWords.default({
  text: 'The quick brown fox jumped over the lazy dog',
  query: 'over'
});
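
Until the package ships a proper ESM entry point, the workaround above could be wrapped in a small interop helper so that call sites stay the same whichever shape the import takes. This is only a sketch: `unwrapDefault`, `HighlightFn` and `MaybeWrapped` are hypothetical names, and a stub stands in for the imported module so the snippet runs on its own.

```typescript
// Shape assumed from the calls in this issue: an options object with text and query.
type HighlightFn = (options: { text: string; query: string }) => unknown;

// What `import highlightWords from 'highlight-words'` may evaluate to:
// the function itself, or an object holding it under `default`.
type MaybeWrapped = HighlightFn | { default: HighlightFn };

function unwrapDefault(mod: MaybeWrapped): HighlightFn {
  return typeof mod === 'function' ? mod : mod.default;
}

// Stub standing in for the real import (hypothetical, for illustration only).
const stub: HighlightFn = ({ text, query }) => text.indexOf(query) !== -1;
const importedModule: MaybeWrapped = { default: stub };

const highlightWords = unwrapDefault(importedModule);
console.log(
  highlightWords({
    text: 'The quick brown fox jumped over the lazy dog',
    query: 'over'
  })
);
```

With the real package, `importedModule` would be the default import itself, and the helper could be dropped once the import resolves to a function directly.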

New Option: `clipByLength`

Ah I was just thinking more about my use case (more info here: #6), and I was thinking maybe it would help to have an option clipByLength in addition to clipBy, which would clip with an ellipsis not by number of words but by number of characters?

Alternative: keep clipBy for the actual number value, and add an option clipByType with possible values of "words" and "characters" (defaults to "words")

Motivation: With only the clipBy option, users may run into problems with super long "words" such as URLs that don't get clipped reasonably.

Example

const text = 'My dog is a very good boy and is always eating his lunch.';
const chunks = highlightWords({
  text,
  query: 'is',
  clipByLength: 7
});

Value for chunks:

[
  {
    "text": "My dog ",
    "match": false
  },
  {
    "text": "is",
    "match": true
  },
  {
    "text": " a very ... oy and ",
    "match": false
  },
  {
    "text": "is",
    "match": true
  },
  {
    "text": " always ... ating h",
    "match": false
  },
  {
    "text": "is",
    "match": true
  },
  {
    "text": " lunch.",
    "match": false
  }
]
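
One way to read the proposal: a non-matching chunk longer than twice `clipByLength` keeps its first and last `clipByLength` characters, joined by an ellipsis. The sketch below reproduces the two clipped chunks from the example under that reading. `clipMiddle` is a hypothetical helper, not part of the library, and how the first and last chunks of the text should be treated is left open.

```typescript
// Keep `length` characters at each end of a non-matching chunk and
// replace the middle with " ... ". Short chunks are returned unchanged.
function clipMiddle(text: string, length: number): string {
  // Assumption: a non-positive length means "do not clip".
  if (length <= 0 || text.length <= length * 2) return text;
  const head = text.slice(0, length);
  const tail = text.slice(text.length - length);
  return `${head} ... ${tail}`;
}

console.log(JSON.stringify(clipMiddle(' a very good boy and ', 7)));
console.log(JSON.stringify(clipMiddle(' always eating h', 7)));
```

A real implementation would probably also skip clipping when the clipped result is not shorter than the input, which can happen for chunks only slightly longer than twice the limit.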

highlight-words library is flagged as high by a Math.random vulnerability scan

A vulnerability scan flags the highlight-words library as high severity because of Math.random. Can we please fix this soon and get a new version?

That probably means you (or a plugin you are using) are using the native JS Math.random(), which has been deemed insecure. You can replace those calls with the newer JS crypto API described here: https://developer.mozilla.org/en-US/docs/Web/API/Crypto/getRandomValues

We are using Material-React-Table, and this is the only high-severity warning we got, because Material-React-Table uses highlight-words. No other library used by Material-React-Table gave any warning. You wrote, as a recommendation: "That probably means you (or a plugin you are using) is using the JS native Math.random() which has been deemed insecure. You can replace those functions with the latest JS crypto stuff here". What you commented makes sense, but if you can replace Math.random(), it will make our scan clean. We mention it only because this is the only warning we got when using Material-React-Table, and companies sometimes refuse to use a library just because of such a warning, even though it's not anything significant. See the attached screenshot.

(Screenshot: 2024_01_19_20_33_54_Settings)
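
For reference, the replacement suggested in the reply could look roughly like the following. This is a sketch under assumptions: the thread does not show where, or whether, the library calls Math.random(), and `secureRandom` is a hypothetical name. It uses Node's `webcrypto` export; in browsers the same `getRandomValues` method is available on the global `crypto` object.

```typescript
import { webcrypto } from 'node:crypto';

// Returns a float in [0, 1), like Math.random(), but drawn from the
// platform's cryptographically secure random number generator.
function secureRandom(): number {
  const buffer = new Uint32Array(1);
  webcrypto.getRandomValues(buffer);
  // A Uint32 is at most 2^32 - 1, so dividing by 2^32 stays below 1.
  return buffer[0] / 0x100000000;
}

console.log(secureRandom());
```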

Investigate potential performance issues

If we have a lot of content to parse, we might incur a performance penalty.

Questions to answer:

  1. Is there a break point after which we start to notice any performance issues with the splitter?
  2. Are there performance issues in the browser only, i.e. when constructing the CSSOM or during paint?

Relates to #6
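
A starting point for the first question could be a small timing harness that feeds the splitter inputs of growing size. The sketch below is an assumption-laden illustration: `measure`, `Sample` and `standInSplitter` are hypothetical names, and the stand-in is a plain case-sensitive substring splitter used only so the snippet runs on its own. It says nothing about the second question (CSSOM and paint), which needs measuring in a browser.

```typescript
interface Chunk { text: string; match: boolean }
type Splitter = (options: { text: string; query: string }) => Chunk[];

// Hypothetical stand-in; pass the library's function here to measure the real thing.
const standInSplitter: Splitter = ({ text, query }) => {
  const chunks: Chunk[] = [];
  let cursor = 0;
  let hit = query ? text.indexOf(query) : -1;
  while (hit !== -1) {
    if (hit > cursor) chunks.push({ text: text.slice(cursor, hit), match: false });
    chunks.push({ text: query, match: true });
    cursor = hit + query.length;
    hit = text.indexOf(query, cursor);
  }
  if (cursor < text.length) chunks.push({ text: text.slice(cursor), match: false });
  return chunks;
};

interface Sample { size: number; ms: number; chunks: number }

// Time one run of the splitter for each input size (in characters).
function measure(splitter: Splitter, sizes: number[], query: string): Sample[] {
  const sentence = 'The quick brown fox jumped over the lazy dog. ';
  return sizes.map((size) => {
    const text = sentence.repeat(Math.ceil(size / sentence.length)).slice(0, size);
    const start = performance.now();
    const chunks = splitter({ text, query });
    return { size, ms: performance.now() - start, chunks: chunks.length };
  });
}

console.table(measure(standInSplitter, [1000, 10000, 100000, 1000000], 'over'));
```

Single runs are noisy, so repeating each size several times and comparing medians would give a more trustworthy picture of where the curve bends.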

Option to ignore diacritics

Hello, I use the library for a search field and the matching algorithm ignores diacritics (for example apple matches ápplè), but the highlighting generated from this library does not produce a match. An option to ignore diacritics when generating the chunks would be nice, since the input cannot just be stripped as the outputted chunks of course will be stripped as well.
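
A possible approach, sketched under assumptions: search in a folded copy of the text (decomposed, combining marks removed, lowercased) while keeping a map from each folded character back to its position in the original, so that the returned chunks still carry the original characters. `splitIgnoringDiacritics` and `fold` are hypothetical names, the chunk shape is taken from the examples on this page, and only the U+0300 to U+036F combining range is stripped.

```typescript
interface Chunk { text: string; match: boolean }

// Decompose, drop combining diacritical marks, lowercase.
const fold = (s: string): string =>
  s.normalize('NFD').replace(/[\u0300-\u036f]/g, '').toLowerCase();

function splitIgnoringDiacritics(text: string, query: string): Chunk[] {
  const needle = fold(query);
  if (needle === '') return text ? [{ text, match: false }] : [];

  // origin[i] is the index in `text` that produced folded[i];
  // the final entry is a sentinel pointing past the end of `text`.
  let folded = '';
  const origin: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const piece = fold(text[i]);
    for (let k = 0; k < piece.length; k++) origin.push(i);
    folded += piece;
  }
  origin.push(text.length);

  const chunks: Chunk[] = [];
  let cursor = 0; // position in the original text
  let hit = folded.indexOf(needle);
  while (hit !== -1) {
    const start = origin[hit];
    const end = origin[hit + needle.length];
    if (start > cursor) chunks.push({ text: text.slice(cursor, start), match: false });
    chunks.push({ text: text.slice(start, end), match: true });
    cursor = end;
    hit = folded.indexOf(needle, hit + needle.length);
  }
  if (cursor < text.length) chunks.push({ text: text.slice(cursor), match: false });
  return chunks;
}

console.log(splitIgnoringDiacritics('An ápplè a day', 'apple'));
```

Mapping the end of a match to the origin of the next folded character means that trailing combining marks in already-decomposed input stay inside the matched chunk.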

Option Proposal: `maxLength`

Hi there!

First of all, I want to thank you for this library - really nice API and options for splitting text into chunks! Super helpful for a lot of different use cases 💯

I wanted to test the waters about a potential option maxLength, allowing for aborting the matching process early, improving performance when matching small parts of large text content.

It could work as follows:

const chunks = highlightWords({
  text: "General Kenobi, years ago you served my father in the Clone Wars. Now he begs you to help him in his struggle against the Empire. I regret that I am unable to present my father's request to you in person, but my ship has fallen under attack and I'm afraid my mission to bring you to Alderaan has failed. I have placed information vital to the survival of the Rebellion into the memory systems of this R2 unit. My father will know how to retrieve it. You must see this droid safely delivered to him on Alderaan. This is our most desperate hour. Help me, Obi-Wan Kenobi, you're my only hope.",
  query: 'o',
  clipBy: 3,
  // Return an array of chunks, with a total string length of 70 characters or less
  maxLength: 70,
});

I would suggest that the algorithm always end on a non-matching chunk, so that there is always context around the matches on either side.


Workaround: I have a workaround that filters after the matching has taken place, but at that point, the performance impact has already been made:

let resultLength = 0;
let resultLengthWithinLimit = true;

const chunks = highlightWords({
  text: content,
  query: searchQuery,
  clipBy: 3,
}).filter((chunk) => {
  if (!resultLengthWithinLimit) return false;

  resultLength += chunk.text.length;

  // Start filtering out chunks at 90 characters (approximately
  // 2 lines at 320px width and 14px font size), after the first
  // non-matching chunk is encountered.
  if (resultLength > 90 && !chunk.match) {
    resultLengthWithinLimit = false;
  }

  return true;
});
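
To show what aborting early might look like, here is a sketch that applies the same rule as the workaround (stop at the first non-matching chunk once the limit is reached) but during scanning, so the rest of the text is never searched. It makes several simplifying assumptions: `splitUntil` is a hypothetical name, matching is a plain case-sensitive substring search, there is no `clipBy`, and the final non-matching chunk is not trimmed, so the total can overshoot the limit.

```typescript
interface Chunk { text: string; match: boolean }

function splitUntil(text: string, query: string, maxLength: number): Chunk[] {
  if (query === '') return text ? [{ text, match: false }] : [];

  const chunks: Chunk[] = [];
  let cursor = 0; // next unread position in `text`
  let total = 0;  // characters emitted so far

  while (cursor < text.length) {
    const hit = text.indexOf(query, cursor);
    const end = hit === -1 ? text.length : hit;

    if (end > cursor) {
      chunks.push({ text: text.slice(cursor, end), match: false });
      total += end - cursor;
      // Stop only after a non-matching chunk, so the last match keeps trailing context.
      if (total >= maxLength) break;
    }
    if (hit === -1) break;

    chunks.push({ text: query, match: true });
    total += query.length;
    cursor = hit + query.length;
  }
  return chunks;
}

console.log(splitUntil('a-b-c-d', '-', 3));
```

Combining this with clipping would also bound the size of the last chunk, which the sketch leaves whole.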

Return Text Fragments for linking to highlights?

Hey @tricinel 👋 hope things are good with you!

I wanted to see whether you'd be open to also returning a Text Fragment (blog post) in addition to the chunks matched in a string.

E.g.

  1. Returning the Text Fragment `Modules in Web` for usage in the URL below, for matching and scrolling to that part of the page:
    https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=Modules%20in%20Web

  2. Returning the Text Fragment `details%20about%20the-,API,-%2C%20see%20Experimenting%20with` for usage in the URL below, for matching a specific instance of API in the page:
    https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=details%20about%20the-,API,-%2C%20see%20Experimenting%20with

Ideas for implementation can be inspired by the doGenerateFragment function in fragment-generation-utils.js from the text-fragments-polyfill package:

https://github.com/GoogleChromeLabs/text-fragments-polyfill//blob/e5252cb6eba768cdc4166b27cbe8080fc67f37c7/src/fragment-generation-utils.js#L145-L243
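
As a much simpler illustration than the polyfill's generator, the two fragments in the examples above can be assembled from a matched string plus optional surrounding context. This is a sketch, not a proposal for the library's API: `toTextFragment` and `encodeFragmentPart` are hypothetical names, and choosing a prefix and suffix that make the match unique on the page is left to the caller.

```typescript
// Percent-encode one part of a text directive. encodeURIComponent leaves "-"
// alone, but a dash is syntax inside the directive, so encode it as well.
function encodeFragmentPart(part: string): string {
  return encodeURIComponent(part).replace(/-/g, '%2D');
}

// Build "#:~:text=[prefix-,]match[,-suffix]".
function toTextFragment(match: string, prefix?: string, suffix?: string): string {
  const parts: string[] = [];
  if (prefix) parts.push(`${encodeFragmentPart(prefix)}-`);
  parts.push(encodeFragmentPart(match));
  if (suffix) parts.push(`-${encodeFragmentPart(suffix)}`);
  return `#:~:text=${parts.join(',')}`;
}

console.log(toTextFragment('Modules in Web'));
console.log(toTextFragment('API', 'details about the', ', see Experimenting with'));
```

In terms of chunks, the prefix and suffix would come from the non-matching chunks on either side of the matching one.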

Declaration file not found with `Node16` module + moduleResolution

Hi @tricinel 👋 Happy new year! Hope you are well.

I have been experimenting with the "module": "Node16" option in tsconfig.json recently and found the declaration file for highlight-words cannot be found when using these options:

Demo on StackBlitz (run yarn tsc in the terminal to get the error):

https://stackblitz.com/edit/node-gwywi8?file=package.json,tsconfig.json,yarn.lock


Error

The error message:

$ yarn tsc
index.ts:1:28 - error TS7016: Could not find a declaration file for module 'highlight-words'. '/home/projects/node-gwywi8/node_modules/highlight-words/dist/highlight-words.mjs' implicitly has an 'any' type.
  Try `npm i --save-dev @types/highlight-words` if it exists or add a new declaration (.d.ts) file containing `declare module 'highlight-words';`

1 import highlightWords from 'highlight-words';
                             ~~~~~~~~~~~~~~~~~


Found 1 error in index.ts:1

error Command failed with exit code 2.

Code

tsconfig.json

{
  "compilerOptions": {
    "module": "node16",
    "moduleResolution": "node16",
    "noEmit": true,
    "strict": true
  },
  "include": ["index.ts"]
}

package.json

{
  "name": "node-starter",
  "version": "0.0.0",
  "type": "module",
  "dependencies": {
    "highlight-words": "^1.2.1",
    "typescript": "^4.9.4"
  }
}

index.ts

import highlightWords from 'highlight-words';

console.log(highlightWords);
