rehypejs / rehype-slug
plugin to add `id` attributes to headings
Home Page: https://unifiedjs.com
License: MIT License
Currently, there's no way to manually specify a header's ID. This means that, while drafting an article in MD (to then be passed into rehype), if I want to update a header's text, I must then go and update many of the links into that section.
Ideally, we can support the same `Title {#id}` syntax as gatsby-remark-autolink-headers and others. This will allow us to fix the problems from above.
Luckily, that plugin uses remark under the hood, and porting the code is fairly trivial (I've already done so, with tests, and am holding off on a PR until I know whether this idea is well received).
If we port that code, we should also get support for these options out of the box:
`maintainCase`: Boolean. Maintains the case of the Markdown header (optional)
`removeAccents`: Boolean. Removes accents from generated heading IDs (optional)
`enableCustomId`: Boolean. Enables custom header IDs with `{#id}` (optional)
This idea may be out of scope for this project. We may want to defer to the ecosystem to solve this problem, which I understand entirely.
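To make the `Title {#id}` idea concrete, here is a minimal sketch of splitting a trailing marker off a heading's text. `extractCustomId` is a hypothetical helper (it is not taken from rehype-slug or gatsby-remark-autolink-headers), and the set of characters accepted in the id is an assumption.

```javascript
// Hypothetical helper: split a trailing `{#id}` marker off heading text.
// The characters allowed in the id are an assumption made for this sketch.
function extractCustomId(text) {
  const match = /^(.*?)\s*\{#([A-Za-z0-9_-]+)\}\s*$/.exec(text)
  if (!match) return {text, id: undefined}
  return {text: match[1], id: match[2]}
}

console.log(extractCustomId('Getting started {#start}')) // text 'Getting started', id 'start'
console.log(extractCustomId('No custom id here')) // text unchanged, id undefined
```

A plugin could use the extracted id when present and fall back to the generated slug otherwise, so links survive edits to the heading text.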
There is no way to control which headings get an id.
Add an option to customize which headings are included (defaulting to h1 through h6), similar to how toc handles it:
rehype()
  .data('settings', {fragment: true})
  .use(slug, {
    // Proposed option: list only the headings that should get an id (here <h1> is excluded).
    headings: ["h2", "h3", "h4", "h5", "h6"]
  })
  .process(buf)
  .then((file) => {
    console.log(String(file))
  })
The option could also be added to rehype-autolink-headings, but if it is added here it also takes effect there, because rehype-autolink-headings does not link headings that have no id.
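A rough sketch of what such an allow-list could do, written against plain hast-like objects so it runs without any packages. `addIds`, `textOf`, and `simpleSlug` are hypothetical names; `simpleSlug` is a simplified stand-in and not github-slugger's real algorithm, and duplicate ids are not handled here.

```javascript
// Simplified stand-in for a slug function (not github-slugger's algorithm).
const simpleSlug = (value) =>
  value.toLowerCase().trim().replace(/[^\p{L}\p{N} _-]/gu, '').replace(/ /g, '-')

// Collect the text of a node and all of its descendants.
function textOf(node) {
  if (node.type === 'text') return node.value
  return (node.children || []).map(textOf).join('')
}

// Add an id only to elements whose tag name is in the `headings` allow-list.
function addIds(node, headings = ['h1', 'h2', 'h3', 'h4', 'h5', 'h6']) {
  if (node.type === 'element' && headings.includes(node.tagName)) {
    node.properties = node.properties || {}
    if (!node.properties.id) node.properties.id = simpleSlug(textOf(node))
  }
  for (const child of node.children || []) addIds(child, headings)
  return node
}

const tree = {
  type: 'root',
  children: [
    {type: 'element', tagName: 'h1', properties: {}, children: [{type: 'text', value: 'Page title'}]},
    {type: 'element', tagName: 'h2', properties: {}, children: [{type: 'text', value: 'First section'}]}
  ]
}

addIds(tree, ['h2', 'h3', 'h4', 'h5', 'h6'])
console.log(tree.children[0].properties.id) // undefined, because <h1> is skipped
console.log(tree.children[1].properties.id) // first-section
```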
I have an issue with German characters such as ü, ö, ä, and ẞ in slugs: the ids added to headings include those characters. To reduce problems with browser support, they should be transliterated to ue, oe, ae, and ss respectively.
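One way to get that result today is to transliterate the heading text before it is slugged. The sketch below is only an illustration: `transliterateGerman` and `germanMap` are hypothetical names, and the mapping covers just the characters mentioned above.

```javascript
// Hypothetical pre-processing step: transliterate German characters before slugging.
const germanMap = {ä: 'ae', ö: 'oe', ü: 'ue', Ä: 'Ae', Ö: 'Oe', Ü: 'Ue', ß: 'ss', ẞ: 'SS'}

function transliterateGerman(value) {
  // Normalize to NFC first so that decomposed input (letter + combining mark) also matches.
  return value.normalize('NFC').replace(/[äöüÄÖÜßẞ]/g, (char) => germanMap[char])
}

console.log(transliterateGerman('Über die Größe')) // Ueber die Groesse
```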
I thought it would be useful to be able to add a prefix to the generated slugs.
I had a case where the same page contained multiple instances of rehype, and I needed each generated slug to be unique across the page. Prefixing the slugs with an id related to the instance solved the problem.
Since I needed this feature, I've made a fork: 82d837c
I'm not sure there are any alternatives; maybe adding some hidden content to the headings? Correct me if I'm wrong.
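To illustrate why a prefix helps, here is a small sketch in which each instance keeps its own duplicate counter. `createSlugger` is a hypothetical stand-in with a simplified slug rule, not the code from the fork.

```javascript
// Hypothetical sketch: each instance tracks duplicates on its own, so two
// instances on one page can emit the same id unless their output is prefixed.
function createSlugger(prefix = '') {
  const seen = new Map()
  return (value) => {
    // Simplified stand-in for a real slug function.
    const base = value.toLowerCase().trim().replace(/\s+/g, '-')
    const count = seen.get(base) || 0
    seen.set(base, count + 1)
    return prefix + (count ? base + '-' + count : base)
  }
}

const first = createSlugger('a-')
const second = createSlugger('b-')

console.log(first('Intro'), first('Intro')) // a-intro a-intro-1
console.log(second('Intro')) // b-intro
```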
5.0.1
https://codesandbox.io/embed/infallible-andras-d4td0g?fontsize=14&hidenavigation=1&theme=dark
import {rehype} from 'rehype'
import rehypeSlug from 'rehype-slug'
import toc from '@jsdevtools/rehype-toc'

const sourceHtml = `
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>sandbox document</title>
</head>
<body>
<h1>title efé</h1>
<h2>title efé</h2>
<p>lorem ipsum</p>
</body>
</html>
`

async function main() {
  document.querySelector('#source').textContent = sourceHtml
  // Await the pipeline so that processing errors reach the catch handler below.
  const file = await rehype()
    .use(rehypeSlug)
    .use(toc)
    .process(sourceHtml)
  document.querySelector('#result').textContent = String(file)
}

main().catch((error) => {
  document.querySelector('#error').textContent = error
})
Normalize the extracted heading text so that the id contains no accents (which cannot be handled properly).
The id is not normalized, which causes broken anchors when used together with rehype-toc.
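For reference, the kind of normalization meant here could look like the sketch below, which decomposes the text and drops the combining marks. `removeAccents` is a hypothetical helper, and this is not something rehype-slug does by default.

```javascript
// Sketch: decompose characters (NFD) and strip combining marks,
// so that 'efé' produces an ASCII-only id.
function removeAccents(value) {
  return value.normalize('NFD').replace(/\p{M}/gu, '')
}

console.log(removeAccents('title efé')) // title efe
```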
Node v17, Node v16
npm 8, npm 7
Linux
Next.js
Unlike many other plugins in the unified/remark/rehype ecosystem, this one unfortunately doesn't provide TypeScript typings. It would be nice if they could be provided, even though they would be pretty short with only one exported function.
Sometimes we use no-break spaces (https://www.fileformat.info/info/unicode/char/00a0/index.htm), `\u00A0`, to prevent widow/orphan words.
Currently, GitHub's heading slug algorithm does not treat a "raw" no-break space as whitespace (proof), and this package matches GitHub's algorithm.
This is not ideal.
What if we introduced a new option (disabled by default) that instructs rehype-slug to treat "raw" no-break spaces (`\u00A0`) as whitespace, yielding dashes in the output?
Unit test:
rehype()
  .data('settings', {fragment: true})
  .use(rehypeSlug, {
    prefix: 'test-',
    // Proposed option: treat U+00A0 as whitespace when slugging.
    treatNbspAsSpace: true
  })
  .process(
    [
      '<section>',
      ' <h1>Lorem Ipsum Dolor Sit\u00A0Amet</h1>',
      ' <h2>dolor—sit—amet</h2>',
      ' <h3>consectetur &amp; adipisicing</h3>',
      ' <h4>elit</h4>',
      ' <h5>elit</h5>',
      ' <p>sed</p>',
      '</section>'
    ].join('\n'),
    (error, file) => {
      t.ifErr(error, 'shouldn’t throw')
      t.equal(
        String(file),
        [
          '<section>',
          ' <h1 id="test-lorem-ipsum-dolor-sit-amet">Lorem Ipsum Dolor Sit\u00A0Amet</h1>',
          ' <h2 id="test-dolorsitamet">dolor—sit—amet</h2>',
          ' <h3 id="test-consectetur--adipisicing">consectetur &#x26; adipisicing</h3>',
          ' <h4 id="test-elit">elit</h4>',
          ' <h5 id="test-elit-1">elit</h5>',
          ' <p>sed</p>',
          '</section>'
        ].join('\n'),
        'should match'
      )
    }
  )
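Independent of the test above, the core of the proposed behaviour can be sketched in a few lines. `slugWithNbsp` is a hypothetical wrapper, and `simpleSlug` is a simplified stand-in rather than github-slugger itself; in this stand-in a no-break space is simply dropped unless the option is on.

```javascript
// Simplified stand-in for a slug function (not github-slugger's algorithm).
const simpleSlug = (value) =>
  value.toLowerCase().replace(/[^\p{L}\p{N} _-]/gu, '').replace(/ /g, '-')

// Hypothetical `treatNbspAsSpace` behaviour: turn U+00A0 into a regular
// space before slugging, so it ends up as a dash.
function slugWithNbsp(value, treatNbspAsSpace = false) {
  const input = treatNbspAsSpace ? value.replace(/\u00A0/g, ' ') : value
  return simpleSlug(input)
}

console.log(slugWithNbsp('Dolor Sit\u00A0Amet')) // dolor-sitamet
console.log(slugWithNbsp('Dolor Sit\u00A0Amet', true)) // dolor-sit-amet
```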
Not really. We can keep the status quo, but that makes life harder when trying to control orphan words in MDX.
While github-slugger is one of the most popular (if not the most popular) slugging functions, there may be cases where someone wants to use an alternative implementation such as slugify (or even their own custom slug generator).
Add a new property to the options object to optionally support passing in your own slug function. If no slug option is passed in, default to the current use of github-slugger.
I'm certain there are a number of other approaches here (even creating an alternate plugin), but happy to contribute a PR to add in this option!
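As a sketch of the option described above: the resolver below uses a caller-supplied function when there is one and otherwise falls back to a default. `resolveSlug`, `defaultSlug`, and `upperSnake` are hypothetical names, and `defaultSlug` is a simplified stand-in for github-slugger.

```javascript
// Simplified stand-in for the default slugger (not github-slugger's algorithm).
const defaultSlug = (value) => value.toLowerCase().trim().replace(/\s+/g, '-')

// Hypothetical `slug` option: use it when it is a function, else the default.
function resolveSlug(options = {}) {
  return typeof options.slug === 'function' ? options.slug : defaultSlug
}

// Example of a custom generator a user might pass in.
const upperSnake = (value) => value.toUpperCase().trim().replace(/\s+/g, '_')

console.log(resolveSlug()('Hello World')) // hello-world
console.log(resolveSlug({slug: upperSnake})('Hello World')) // HELLO_WORLD
```

Keeping the default in place means existing users see no change, while anyone who needs a different scheme can pass their own function.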
Currently the only option provided by the plugin is a prefix. It would be nice to add further options so slugs can be written in capital letters. It would also be cool if a transformer could be defined so that you can modify the generated slugs yourself.
Adding a "capitalize" option, or adding a callback where developers can define their own slug customizations.
Only by writing your own slug plugin.
In #5 I forgot to register the types in the package.json, which went unnoticed until I tried to install the newly published package. I will be creating a PR to fix this issue in a few moments.
Note: Not sure if this is better solved here or in github-slugger. I decided to open it here since I think a solution to this problem would make that package stray from its intention of being as close to GitHub's slugger as possible. Happy to move over to that repo if you think it's a better fit there, though.
I have a scenario where an author of some content has started the header with a number. This results in an id that starts with a number, which isn't valid according to the CSS spec (the HTML spec was a little less clear on whether it was valid). Regardless, when I tried to reference the header using document.querySelector, it threw an error saying I was using an invalid selector.
A solution I could take on my end would be to prepend something to these IDs and update the links, but I thought it'd be better done upstream (either in rehype-slug or github-slugger).
I created a quick solution for github-slugger where it just prepends an underscore (_) to any id that would start with a number. I figured I should reach out first before submitting a PR, though. GitHub prepends user-content- to all of the ids in the rendered markdown. You can see this when inspecting the README of this project.
This is where I'm unsure where this solution should be implemented. GitHub doesn't include the user-content- prefix in the URL produced for each header, but instead handles that client side. So github-slugger only slugifies what comes after the user-content- prefix. If we start prepending something like an underscore, that package may deviate too much from its goal.
For those reasons, I think it may be the responsibility of this package.
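For illustration, the underscore approach described above could be as small as the sketch below. `ensureNonDigitStart` is a hypothetical helper, not the actual patch, and the marker character is an assumption.

```javascript
// Hypothetical post-processing step: make sure an id does not start with a
// digit, so it can be used unescaped in a selector such as `#id`.
function ensureNonDigitStart(id, marker = '_') {
  return /^\d/.test(id) ? marker + id : id
}

console.log(ensureNonDigitStart('3-ways-to-slug')) // _3-ways-to-slug
console.log(ensureNonDigitStart('ways-to-slug')) // ways-to-slug
```

On the consuming side, browsers also offer `CSS.escape(id)` for building a valid selector from such an id, and `document.getElementById` accepts an id that starts with a digit without any escaping.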
I touch on the alternatives in the Solutions section above, but I'll reiterate and summarize here:
rehype-slug, as I think this could be a problem for many people (not just me)
github-slugger, if it doesn't force that project to deviate from its goal of "emulate the way GitHub handles generating markdown heading anchors as close as possible."