Comments (15)
For those who are looking for a way to use HTML tags in markdown with remark-parse, I leave this recipe here. Thanks to @ChristianMurphy for his suggestion. I've made just a couple of improvements:
- The module rehype-dom-parse leads to the error "document is not defined", so I replaced it with rehype-parse.
- Extracted rehypeParser from the handler, so it's created only once.
- Also note the sanitize: false option.
You can try this snippet in the console:
var unified = require('unified')
var remark = require('remark-parse')
var remark2react = require('remark-react');
var ReactDOMServer = require('react-dom/server');
var rehype = require('rehype-parse')

const sample = `
markdown is here
<div style="color:gray;">
text <a href="#">link</a>
</div>
`

const rehypeParser = unified().use(rehype, { fragment: true });

const parser = unified()
  .use(remark)
  .use(remark2react, {
    toHast: {
      handlers: {
        html: (h, node) =>
          // process raw HTML text into HAST so react remark can process it
          rehypeParser.parse(node.value).children
      }
    },
    sanitize: false
  });

const result = parser.processSync(sample)
const html = ReactDOMServer.renderToStaticMarkup(result.contents)
console.log(html)
output:
<p>markdown is here</p>
<div style="color:gray">
text <a href="#">link</a>
</div>
from remark-react.
@Hamms what I've been using is the toHast option: https://github.com/remarkjs/remark-react#optionstohast
With:
// allow inline html to be rendered as React VDOM
import rehype from 'rehype-dom-parse';
// ....
.use(remarkReact, {
  toHast: {
    handlers: {
      html: (h, node) =>
        // process raw HTML text into HAST so react remark can process it
        unified()
          .use(rehype, { fragment: true })
          .parse(node.value).children
    }
  }
})
from remark-react.
I guess I'm still confused about the intent of this library, then.
The readme implies that the purpose is to be able to render markdown safely, with the features of hast-util-sanitize. But as it's built, the only nodes that end up in the tree that gets sent to hast-util-sanitize are either those specifically generated by remark, which are already safe, or raw nodes, which may contain something unsafe but which are entirely removed by hast-util-sanitize (and also by hast-to-hyperscript), making the existence of sanitization in this project both redundant and misleading.
Is the point of this to provide a MDAST renderer that doesn't allow raw html at all, or is it to provide one that only renders sanitized HTML?
from remark-react.
Hmm, I believe you could also do what @ChristianMurphy suggests, but with a dangerouslySetInnerHTML: {__html: '...'} prop?
Maybe we need a top-level allowDangerousHTML option to do this by default?
from remark-react.
It's not clear to me why the allowDangerousHTML option I'm currently passing via toHast isn't working. Is it supposed to?
from remark-react.
Looking closer, this doesn't appear to be an issue with toHast at all, but rather with both the sanitization step and the hast-to-hyperscript step, both of which strip out the raw nodes generated by mdast-util-to-hast.
Taking a step back, it's not clear to me what the intent of this library is. For my purposes, I'd like to be able to render markdown with raw html inside, but also to sanitize that content against "dangerous" html like https://github.com/syntax-tree/hast-util-sanitize does. So, given an input of
_some_ <strong>raw</strong> html
I get
<p><em>some</em> <strong>raw</strong> html</p>
But given an input of
_some_ <a onclick="alert('hello')">strong</a> html
I get
<p><em>some</em> <a>strong</a> html</p>
I had assumed that was the point of this library, but most of the functionality of hast-util-sanitize only works on actual hast nodes like script and a; when it encounters the raw nodes generated by mdast-util-to-hast it simply removes them rather than sanitizing them.
Am I misunderstanding something, or is it not possible to achieve what I want with this tool?
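The distinction described in this comment can be illustrated with a small, dependency-free sketch. This is a simplified model and not the actual hast-util-sanitize implementation: the allowlist `ALLOWED` and the helpers `sanitizeNode` and `sanitizeChildren` are hypothetical names made up for this illustration, and the real library handles disallowed elements with more nuance. It only shows why an allowlist-based sanitizer can clean a parsed `a` element but has nothing to work with when the same HTML arrives as an opaque raw node.

```javascript
// Simplified model only: NOT the real hast-util-sanitize.
// Hypothetical allowlist: tag name -> property names that may be kept.
const ALLOWED = new Map([
  ['p', []],
  ['em', []],
  ['strong', []],
  ['a', ['href']]
])

function sanitizeChildren(children) {
  // Sanitize each child and discard the ones that were rejected.
  return children.map(sanitizeNode).filter((child) => child !== null)
}

function sanitizeNode(node) {
  if (node.type === 'text') {
    return {type: 'text', value: node.value}
  }
  if (node.type === 'root') {
    return {type: 'root', children: sanitizeChildren(node.children)}
  }
  if (node.type === 'element' && ALLOWED.has(node.tagName)) {
    // Copy only the allowlisted properties; everything else is dropped.
    const properties = {}
    const source = node.properties || {}
    for (const name of ALLOWED.get(node.tagName)) {
      if (Object.prototype.hasOwnProperty.call(source, name)) {
        properties[name] = source[name]
      }
    }
    return {
      type: 'element',
      tagName: node.tagName,
      properties,
      children: sanitizeChildren(node.children || [])
    }
  }
  // Any other node type, including `raw`, is an opaque value to this
  // sanitizer, so the only safe option in this model is to drop it.
  return null
}

// The same HTML, once as a parsed element and once as an unparsed raw node.
const parsed = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'a',
      properties: {href: '#', onClick: "alert('hello')"},
      children: [{type: 'text', value: 'strong'}]
    }
  ]
}
const unparsed = {
  type: 'root',
  children: [{type: 'raw', value: '<a onclick="alert(\'hello\')">strong</a>'}]
}

const cleanParsed = sanitizeNode(parsed)
const cleanUnparsed = sanitizeNode(unparsed)
console.log(JSON.stringify(cleanParsed))
console.log(JSON.stringify(cleanUnparsed))
```

In this model the parsed tree keeps its `a` element with only `href` left, while the tree holding the raw node comes back with no children at all.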
from remark-react.
@Hamms the sanitization can be configured with https://github.com/remarkjs/remark-react#optionssanitize
from remark-react.
Yes, I'm aware of that. But the problem remains that hast-util-sanitize does not appear to be capable of sanitizing raw nodes at all except by eliminating them, meaning that it's not actually capable of sanitizing the input from mdast-util-to-hast.
from remark-react.
Ahhh, it seems like what I actually want to do is to incorporate https://github.com/syntax-tree/hast-util-raw into my process
from remark-react.
I've confirmed that adding hast-util-raw to the remark-react process makes this work:
diff --git a/index.js b/index.js
index c85e599..6ee8ff8 100644
--- a/index.js
+++ b/index.js
@@ -7,6 +7,8 @@ var sanitize = require('hast-util-sanitize')
 var toH = require('hast-to-hyperscript')
 var tableCellStyle = require('@mapbox/hast-util-table-cell-style')
 
+var raw = require('hast-util-raw')
+
 var globalReact
 var globalCreateElement
 var globalFragment
@@ -46,6 +48,10 @@ function react(options) {
     var tree = toHAST(node, toHastOptions)
     var root
 
+    if (toHastOptions.allowDangerousHTML) {
+      tree = raw(tree)
+    }
+
     if (clean) {
       tree = sanitize(tree, scheme)
     }
diff --git a/package.json b/package.json
index b6d5f8c..36fbe02 100644
--- a/package.json
+++ b/package.json
@@ -28,6 +28,7 @@
   "dependencies": {
     "@mapbox/hast-util-table-cell-style": "^0.1.3",
     "hast-to-hyperscript": "^6.0.0",
+    "hast-util-raw": "^5.0.0",
     "hast-util-sanitize": "^1.0.0",
     "mdast-util-to-hast": "^4.0.0"
   },
diff --git a/test/index.js b/test/index.js
index d6cb228..0e83a8c 100644
--- a/test/index.js
+++ b/test/index.js
@@ -121,6 +121,33 @@ versions.forEach(function(reactVersion) {
       'passes toHast options to inner toHAST() function'
     )
 
+    t.equal(
+      React.renderToStaticMarkup(
+        remark()
+          .use(reactRenderer, {
+            createElement: React.createElement,
+            toHast: {allowDangerousHTML: true}
+          })
+          .processSync('<strong>raw</strong> html').contents
+      ),
+      '<p><strong>raw</strong> html</p>',
+      'renders raw html when specified'
+    )
+
+    t.equal(
+      React.renderToStaticMarkup(
+        remark()
+          .use(reactRenderer, {
+            createElement: React.createElement,
+            toHast: {allowDangerousHTML: true}
+          })
+          .processSync('<a onclick="alert("charlie")">delta</a>')
+          .contents
+      ),
+      '<p><a>delta</a></p>',
+      'raw html is sanitized'
+    )
+
     fixtures.forEach(function(name) {
       var base = path.join(root, name)
       var input = fs.readFileSync(path.join(base, 'input.md'))
Any objections to me opening a PR with the above change?
from remark-react.
Yes, I do object!
Because including hast-util-raw (or rehype-raw) includes a full blown HTML parser. And that’s really heavy on the browser.
I would suggest people that want that to go remark -> remark-rehype -> rehype-raw -> rehype-sanitize -> rehype-react instead.
We could add a note here in the readme, similar to the note in the intro here, though?
from remark-react.
> I guess I'm still confused about the intent of this library, then.
Definitely something we should fix!
> The readme implies that the purpose is to be able to render markdown safely
True! But also that it doesn’t use .dangerouslySetInnerHTML, and including raw nodes kinda defeats that purpose.
> But as it's built, the only nodes that end up in the tree [...]
And also anything from hName, hProperties, hChildren, which could be anything, so I disagree with “making the existence of sanitization in this project both redundant and misleading”. (Although we should fix the “misleading” part)
> Is the point of this to provide a MDAST renderer that doesn't allow raw html at all, or is it to provide one that only renders sanitized HTML?
The point here is to allow a simple markdown to react renderer that is safe. That includes not rendering unsafe HTML. Not sanitising HTML at all would be super unsafe. Not being safe by default would be really bad for XSS and the like.
Raw HTML is an escape hatch for markdown that is inherently unsafe. Except if you really know what you’re doing. In which case you need to include an HTML parser. And then the route is remark -> rehype -> rehype-raw -> rehype-react, which we should definitely document better!
from remark-react.
> Definitely something we should fix!
I appreciate it! :)
Can you give me an example of markdown input without raw html that would result in unsafe HTML output? I think that would help me understand the concerns this library is intended to protect against.
from remark-react.
The XSS problems stem from plugin use, for example, this tree: https://github.com/syntax-tree/hast-util-sanitize#usage
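As a rough illustration of how plugin use can produce unsafe output from markdown that contains no raw HTML, here is a dependency-free sketch. It is an assumption-laden model, not the real mdast-util-to-hast: `toHastSketch` and `DEFAULT_TAGS` are hypothetical names, and the "plugin" is just a direct mutation of the tree. The only behaviour it borrows from the thread is that the hName and hProperties data fields can override what a node turns into.

```javascript
// Simplified stand-in for an mdast-to-hast conversion. It is NOT the real
// mdast-util-to-hast; it only models how `data.hName` and `data.hProperties`
// override the default element for a node.
const DEFAULT_TAGS = new Map([
  ['paragraph', 'p'],
  ['emphasis', 'em']
])

function toHastSketch(node) {
  if (node.type === 'text') {
    return {type: 'text', value: node.value}
  }
  const children = (node.children || []).map(toHastSketch)
  if (node.type === 'root') {
    return {type: 'root', children}
  }
  const data = node.data || {}
  return {
    type: 'element',
    // A plugin-supplied hName wins over the default tag for this node type.
    tagName: data.hName || DEFAULT_TAGS.get(node.type) || 'div',
    // Plugin-supplied hProperties are copied through untouched.
    properties: Object.assign({}, data.hProperties),
    children
  }
}

// mdast for the markdown `*hi*`, which contains no raw HTML.
const emphasis = {type: 'emphasis', children: [{type: 'text', value: 'hi'}]}
const mdast = {
  type: 'root',
  children: [{type: 'paragraph', children: [emphasis]}]
}

// Hypothetical plugin step: annotate the emphasis node, as a buggy or
// malicious plugin might.
emphasis.data = {hName: 'a', hProperties: {onClick: "alert('hello')"}}

const hast = toHastSketch(mdast)
const link = hast.children[0].children[0]
console.log(link.tagName, link.properties)
```

Under these assumptions the emphasis node ends up as an `a` element carrying an `onClick` handler, which is the kind of node a sanitization step after the conversion is there to catch.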
from remark-react.
(Or, if the user could write that HTML themselves in Markdown, which would be possible with allowDangerousHTML)
from remark-react.