Comments (27)
Yes, 100% agree. That's why we have the HTTPLeaks project to first enumerate all leaks and in a later stage (probably via hook) act on them.
My major concern is style. Namely, to allow styles but prevent leaks. Any ideas on that are very welcome. The CSSOM in MSIE<=10 is a mess and makes it really hard.
from dompurify.
How about just disabling styles if this option is set?
from dompurify.
That could be done in a hook :) Happy to accept a PR with a demo hook that covers that!
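A minimal sketch of what such a demo hook could look like. It assumes DOMPurify's `uponSanitizeElement` and `uponSanitizeAttribute` hooks and the `data.tagName`, `data.attrName`, and `data.keepAttr` fields they receive; the function names `dropStyleElement` and `dropStyleAttribute` are made up for illustration and are not part of the library.

```javascript
// Hypothetical helper names; only addHook and the hook names come from DOMPurify.
function dropStyleElement(node, data) {
  // Remove <style> elements from the tree entirely.
  if (data.tagName === 'style' && node.parentNode) {
    node.parentNode.removeChild(node);
  }
}

function dropStyleAttribute(node, data) {
  // Tell the sanitizer not to keep inline style attributes.
  if (data.attrName === 'style') {
    data.keepAttr = false;
  }
}

// Register only when DOMPurify is actually loaded (e.g. in a browser page).
if (typeof DOMPurify !== 'undefined') {
  DOMPurify.addHook('uponSanitizeElement', dropStyleElement);
  DOMPurify.addHook('uponSanitizeAttribute', dropStyleAttribute);
}
```

If no per-node logic is needed, the `FORBID_TAGS` and `FORBID_ATTR` configuration options should give a similar result without any hook.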
from dompurify.
Putting the idea out there: I'd love to be able to proxy all requests that the page will make (to hide the user's IP address, and to ensure that everything is served over https:).
from dompurify.
That should be doable using the data from HTTPLeaks and by building the rules derived from that project into a hook.
We can even skip many of those leaks, as they only work in older browsers or legacy document modes (MSIE). I'm happy to accept a pull request for a hook that kicks off that project. Is anyone here interested in starting this? I'd be happy to help and contribute.
from dompurify.
OK, I'd love to get your opinion on API design and "how much should DOMPurify do to help?"
There are 5 classes of problems (see below), of which we only have to worry about the first 3: fetches, navigation, and styles.
For 1 and 2, I'd love to be able to use the existing hooks, and extend them with some meta-data that allows me to make sane decisions as a programmer:
DOMPurify.addHook("uponSanitizeAttribute", function (node, data) {
    if (data.attrCanBeFetched) {
        // drop attributes that would trigger a fetch
        data.keepAttr = false;
    } else if (data.attrCanNavigate) {
        // route navigation through a redirect endpoint
        data.attrValue = "https://example.com/redirect/" + encodeURIComponent(data.attrValue);
        node.setAttribute("target", "_blank");
    }
});
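The `attrCanBeFetched` and `attrCanNavigate` fields above are a proposal, not something the hook provides. As a rough approximation with the fields the hook does receive (`data.attrName`, `data.attrValue`, `data.keepAttr`), the classification could live in the caller's own lookup tables. The sets below are illustrative and far from exhaustive, and `proxyOrDropAttribute` is a hypothetical name.

```javascript
// Illustrative lookup tables, NOT a complete list of leaky attributes.
const FETCH_ATTRS = new Set(['src', 'background', 'ping', 'poster']);
const NAVIGATE_ATTRS = new Set(['href']);
// Placeholder redirect endpoint taken from the proposal above.
const REDIRECT_PREFIX = 'https://example.com/redirect/';

function proxyOrDropAttribute(node, data) {
  if (FETCH_ATTRS.has(data.attrName)) {
    // Attributes that trigger a fetch are dropped.
    data.keepAttr = false;
  } else if (NAVIGATE_ATTRS.has(data.attrName)) {
    // Attributes that navigate are rewritten to go through the redirect.
    data.attrValue = REDIRECT_PREFIX + encodeURIComponent(data.attrValue);
  }
}

// Register only when DOMPurify is actually loaded.
if (typeof DOMPurify !== 'undefined') {
  DOMPurify.addHook('uponSanitizeAttribute', proxyOrDropAttribute);
}
```

A name-based table like this ignores the element the attribute sits on, which is exactly the kind of metadata the proposal asks DOMPurify to supply.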
For style/fill/mask, it's significantly more complicated, as someone has to parse out the URL. I'd love if that was DOMPurify, as I'm scared of parser-differential attacks.
DOMPurify.addHook("uponSanitizeUrl", function (node, data) {
    if (data.isStylesheet) { // @import external stylesheet.
        data.keepUrl = false;
    } else { // fonts, images, masks, fills, etc.
        data.url = "https://example.com/proxy/" + encodeURIComponent(data.url);
    }
});
The other question I had for you is what external resources can make further requests once they've been loaded:
stylesheets
... ?
The 5 categories of leaks:
1. URL attributes of HTML elements that cause fetches.
   - a ping=
   - img src=
   - body background=
2. URL attributes of HTML elements that cause navigation.
   - a href=
   - area href=
3. Attributes/contents of HTML elements that may contain URLs that cause fetches.
   - svg mask=
   - div style=
   - style
4. Things that DOMPurify already gets rid of.
   - link tags
   - meta tags
   - base
   - iframe
5. Things that don't work in an HTML5 document in modern browsers (we can ignore this category?).
   - xml entities
   - xslt
   - processing directives
from dompurify.
I added a new branch, HTTPLeaks, and created a small smoke test. The network traffic tab shows what still goes through after a default-config sanitization.
https://github.com/cure53/DOMPurify/blob/HTTPLeaks/demos/hooks-proxy-demo.html
Thanks for collecting those categories, nothing to add from my side for now. My biggest concern is still CSS. And SVG. Both have fairly different DOM implementations, hard to find a way that suits all browsers.
from dompurify.
This revision is actually working quite well with inline CSS, so it allows style attributes. Sadly, on MSIE, there's still a bug based on the CSSOM: CSS property values are double-proxied (specifically background-image).
from dompurify.
I think I managed to fix the IE problem
from dompurify.
Now that we seem to have inline style under control (not hardened yet, of course), we can think about the next steps. On the list would be:
- SVG leaks via attributes
- SVG leaks via CSS
- Style elements, the big nemesis
For style elements, I was thinking the following:
If the user-controlled HTML contains a style element, would it make sense to first permit it, then get each element's CSS using getComputedStyle(), then inline the styles and sanitize them, and finally remove the style element? Or would that be "too much magic" for common use cases?
from dompurify.
Nice work!
I think with styles, we'll want to avoid inlining; because that breaks hover selectors.
I've sent a pull request with an approach that tries to sanitize the stylesheet directly. It works in Chrome/Firefox/Safari, and I'm downloading some IEs now.
That approach still needs some work to support media-queries and font-face declarations.
Conrad
from dompurify.
Nice - thanks!
I took it and rebuilt it a bit in here: bad350d
The changes:
- completely removed support for @import because of XSS risks
- No support yet for other @-rules (XSS, leakage risks, we need to evaluate first)
- added support for all rules that have a selector
- made it work on FF, Chrome and IE10, IE11 as well as Edge
- added CSS sanitization in a sense that invalid rules are omitted
- empty style elements/attributes are deleted
- in essence, all CSS is rewritten and re-added
The magic is currently happening here and here:
// `regex` and `proxy` are defined elsewhere in the demo hook
if (data.tagName === 'style') {
    var output = [];
    for (var index = node.sheet.cssRules.length - 1; index >= 0; index--) {
        var rule = node.sheet.cssRules[index];
        // we only accept rules with selectors, no @-rules (yet)
        if (rule.selectorText) {
            // we re-write each CSS rule to remove invalid CSS
            output.push(rule.selectorText + '{');
            if (rule.style && rule.type === 1) {
                var styles = rule.style;
                for (var prop = styles.length - 1; prop >= 0; prop--) {
                    if (styles[styles[prop]]) {
                        var url = styles[styles[prop]].replace(regex, '$1' + proxy);
                        styles[styles[prop]] = url;
                    }
                    output.push(styles[prop] + ':' + styles[styles[prop]] + ';');
                }
            }
            output.push('}');
        }
    }
    // re-add CSS if any is left
    if (output.length) {
        node.textContent = output.join("\n");
    } else {
        node.parentNode.removeChild(node);
    }
}

// check all style attribute values and proxy them
if (node.hasAttribute('style')) {
    var styles = node.style;
    var output = [];
    for (var prop = styles.length - 1; prop >= 0; prop--) {
        // we re-write each property-value pair to remove invalid CSS
        if (node.style[styles[prop]] && regex.test(node.style[styles[prop]])) {
            var url = node.style[styles[prop]].replace(regex, '$1' + proxy);
            node.style[styles[prop]] = url;
        }
        output.push(styles[prop] + ':' + node.style[styles[prop]] + ';');
    }
    // re-add styles in case any are left
    if (output.length) {
        node.setAttribute('style', output.join(""));
    } else {
        node.removeAttribute('style');
    }
}
The code essentially iterates over the existing rules and checks their type. If the type is valid and safe, we go over all properties and re-write the risky/leaky ones. The same happens for elements as well as for attributes.
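The thread does not show how `regex` and `proxy` are defined, so the following is only a guess at the shape of that step: a standalone function that rewrites every `url(...)` in a CSS property value to go through a proxy. `PROXY`, `CSS_URL`, and `proxyCssValue` are hypothetical names, and the pattern assumes it runs on values already serialized by the CSSOM, not on raw author CSS.

```javascript
// Hypothetical proxy endpoint and pattern; not taken from the demo file.
const PROXY = 'https://example.com/proxy/?url=';
// Matches url(...), with optional matching quotes around the address.
const CSS_URL = /url\(\s*(["']?)([^"')]*)\1\s*\)/gi;

function proxyCssValue(value) {
  return value.replace(CSS_URL, function (match, quote, url) {
    // Leave already-proxied URLs alone so a second pass does not double-proxy.
    if (url.indexOf(PROXY) === 0) {
      return match;
    }
    return 'url("' + PROXY + encodeURIComponent(url) + '")';
  });
}

console.log(proxyCssValue('url(//evil.com/a.png)'));
```

The prefix check makes the rewrite idempotent, which matters when a browser hands the same value to the hook more than once. A regex like this is only as trustworthy as the serialization it runs on.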
from dompurify.
Sadly, this is a working bypass on MSIE and Spartan/Edge:
<p style="font-family: 'test\27\22\3b background:url(//evil.com)\3b\2f\2f'"></p>
<style>p {
font-family: 'test\27\22\3b background:url(//evil.com)\3b\2f\2f'
}</style>
Using SCT on MSIE10 and 11, we might be able to turn this into a full-blown XSS. Working on a solution now.
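To see what the payload contains, it helps to resolve the CSS hex escapes by hand. The sketch below is a simplified decoder (hypothetical name `decodeCssEscapes`) that handles only backslash-hex escapes with an optional single trailing whitespace character; it is not a full CSS tokenizer.

```javascript
// Simplified: resolves only \HHHHHH escapes, nothing else from CSS syntax.
function decodeCssEscapes(text) {
  return text.replace(/\\([0-9a-fA-F]{1,6})[ \t\n\r\f]?/g, function (match, hex) {
    const codePoint = parseInt(hex, 16);
    // Null, surrogates, and out-of-range values map to U+FFFD.
    if (codePoint === 0 || codePoint > 0x10FFFF ||
        (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
      return '\uFFFD';
    }
    return String.fromCodePoint(codePoint);
  });
}

const payload = 'test\\27\\22\\3b background:url(//evil.com)\\3b\\2f\\2f';
console.log(decodeCssEscapes(payload));
// prints: test'";background:url(//evil.com);//
```

Once the escapes are resolved, the value carries its own quote characters and a semicolon. If a browser serializes it without re-escaping them, the string presumably ends early and `background:url(//evil.com)` becomes a separate declaration when the CSS is re-added.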
from dompurify.
And more things happened: IE bugs fixed, empty styles fixed, and the bypass using attr is fixed.
I think we can now have a look at fonts and other @-rules. And then beautify the code :)
from dompurify.
Awesome work! I wasn't looking forward to IE :).
Will try to take a stab at the @-rules later, unless you beat me to it. We also need to tidy up handling of attributes containing data: URLs and #fragments (which should work) and relative URLs (which should probably be stripped or made absolute).
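One possible shape for that URL handling, as a sketch only: keep `#fragment` and `data:` values as they are and resolve everything else against a known base with the standard `URL` constructor. `normalizeUrl` and the example base are hypothetical, and the function does no scheme filtering (for instance of `javascript:`), which would still be needed elsewhere.

```javascript
// Hypothetical helper: returns a normalized URL string, or null if unparsable.
function normalizeUrl(raw, base) {
  const url = raw.trim();
  // Same-document fragments and data: URLs never leave the page; keep them.
  if (url.charAt(0) === '#' || /^data:/i.test(url)) {
    return url;
  }
  try {
    // Relative and protocol-relative URLs are resolved against the base.
    return new URL(url, base).href;
  } catch (e) {
    // Unparsable input (or an invalid base) is stripped by the caller.
    return null;
  }
}

console.log(normalizeUrl('/a.png', 'https://example.com/'));
// prints: https://example.com/a.png
```

Making URLs absolute before proxying avoids depending on whatever base URI the sanitized markup later ends up in.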
from dompurify.
@ConradIrwin Since there was not too much activity on this thread for a bunch of days - should we check if the current level of implementation fits the requirements? Or is there anything else left to do?
from dompurify.
@ConradIrwin Thanks for the PR, I just merged it and did some additional edits. I also added some documentation for the demo hooks, feedback is welcome!
from dompurify.
Nice, looks great!
I think this is pretty much "done" for now, unless we can find other bypasses or missing features. I'm hoping to write an email client in HTML5, so I'd like to pull out the code in that demo file into its own library at some point, would you like that to be a cure53 project? If not I'll just make my own public repo.
from dompurify.
Nah, completely fine to have it in your project. We can always fork if we get greedy :)
What I would want to do as a spin-off is another hook for CSS sanitization. Then I would like to publish a demo for people to break. Then I guess our part is done.
from dompurify.
Anyone against closing this issue? I think we reached the goal with the current set of hooks.
from dompurify.
fine with closing it off, although I still want to vote for making it a default option ;)
from dompurify.
So, is the hook option not working for you? If so - why? What can we do to make it work?
from dompurify.
ohh it works for me fine. But I knew enough to care about the problem. Having it in the default options makes it more likely to be used by more developers, IMO.
from dompurify.
Aye, if you have a hook to share - I'll be happy to accept that PR and add the necessary documentation for it.
from dompurify.
OK, let's close this out in the meantime! Thanks!
from dompurify.
@cure53 I don't think this is actually a problem with the current implementation, but just a curiosity I noticed and thought you'd like to know. If you can think of a way to sneak something past the sanitizer with this, I'd love to hear it :)
If you set a relative background image URL, it is ignored by Chrome (v45) [other browsers are untested], unless the document has a baseURI at the time the style attribute is parsed:
d = document.implementation.createHTMLDocument('')
d.body.setAttribute('style', 'background-image: url(/a.png)')
d.body.style.backgroundImage
//= "url()"
base = d.createElement('base')
base.href = "https://example.com/"
d.documentElement.appendChild(base)
d.body.style.backgroundImage
//= "url()"
d.body.setAttribute('style', 'background-image: url(/a.png)')
d.body.style.backgroundImage
//= "url(https://example.com/a.png)"
d.documentElement.removeChild(base)
d.body.style.backgroundImage
//= "url(https://example.com/a.png)"
d.body.setAttribute('style', 'background-image: url(/a.png)')
d.body.style.backgroundImage
//= "url()"
So, if you purify a document that has no <base> tag, you may not be able to see relative url()s at sanitization time, but they will re-appear when you use the DOM or HTML in the context of a real document.
from dompurify.
Hehe, as a matter of fact we know that and defend against that already in the CSS sanitizer demo hook :D
Check here:
It took me some time to figure out how to solve it, but the base-element does the trick here.
from dompurify.