
Should use template instead of document.implementation.createHTMLDocument to avoid the web-component registry of the owner doc being reused (dompurify issue, closed)

cure53 avatar cure53 commented on July 24, 2024
Should use template instead of document.implementation.createHTMLDocument to avoid the web-component registry of the owner doc being reused

from dompurify.

Comments (17)

cure53 avatar cure53 commented on July 24, 2024

Hi Andrew,

Whoa, that is an extremely interesting and valuable concern, thanks a lot for raising it! I think moving away from the result of calling document.implementation.createHTMLDocument being "fresh" is a bad idea (at least from our perspective), and honestly I didn't see that coming.

So, in essence, we'd have to check whether the template element is supported by the DOM (it isn't in IE10/11 and likely never will be) and then decide whether to use the classic or the novel way to generate a fresh document? Your implication about security issues is 100% correct. If element behavior can be tweaked beforehand and the document carrying those tweaks is then used for sanitization, bugs and bypasses will happen.

I will have a closer look at this tomorrow, thanks again for filing that bug!

Cheers,
.mario

from dompurify.

asutherland avatar asutherland commented on July 24, 2024

Yes, this violated my mental model of document.implementation.createHTMLDocument too. (We used to use it in our HTML sanitizer https://github.com/mozilla-b2g/bleach.js for the Firefox OS email app, but when we changed to doing the sanitization on the worker thread we lost access to any type of DOM.)

I agree you will need to do some type of dynamic check and that it sucks to have to do that.

from dompurify.

cure53 avatar cure53 commented on July 24, 2024

I am not entirely sure yet - should we go and populate the document resulting from calling this code?

var doc = document.createElement('template').content.ownerDocument;

E.g. by giving it a body or similar. Or should we rather use this document as a foundation to call doc.implementation.createHTMLDocument on? I would feel more comfortable with the latter, though I am not sure whether this could backfire and revive the element registry again. It shouldn't, from what I understand.

from dompurify.

asutherland avatar asutherland commented on July 24, 2024

I don't think there's any harm in creating a second document, and indeed, there could be some benefits, at least based on how Gecko implements this.

Specifically, I see:

And in GetTemplateContentsOwner() I'm not really seeing anything that would protect you against other code doing dumb things inside that document. If there was super-aggressive dumb code registering custom elements in there (maybe there's protection?), I could see creating a fresh document with createHTMLDocument() to get a fresh TemplateContentsOwner document, then use that created document. I guess you could create another document inside that too for extra paranoia.

Note that if the cached template document owner allows custom elements to be registered against it, that's probably a bug in Gecko that would be fixed.

from dompurify.

cure53 avatar cure53 commented on July 24, 2024

Interesting! So, in theory, the fix to make sure DOMPurify is "future-safe" would look like this, correct? We feature-test against the template element and then make a decision as to how we create the freshdom.

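var freshdom;
if (typeof HTMLTemplateElement === 'function') {
    // Template support: create the document via the template content's owner document.
    freshdom = document.createElement('template').content.ownerDocument
        .implementation.createHTMLDocument('');
} else {
    // No template support (e.g. IE10/11): fall back to the classic call.
    freshdom = document.implementation.createHTMLDocument('');
}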

from dompurify.

cure53 avatar cure53 commented on July 24, 2024

I added a branch for that. The tests pass, so it seems to be okay. I would, however, love to set up a test case that actually registers new DOM elements and checks whether the DOM we create is untouched nevertheless.

5171448?diff=unified#diff-9d220e0bdde67c89de82589b198242ccR222

from dompurify.

fhemberger avatar fhemberger commented on July 24, 2024

Could be simplified as:

var doc = (typeof HTMLTemplateElement === 'function') ?
    document.createElement('template').content.ownerDocument :
    document;
var freshdom = doc.implementation.createHTMLDocument('');
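
A variant of the same feature test, as a minimal sketch: it probes a newly created template element for a content property instead of checking typeof HTMLTemplateElement, since some older engines may report host constructors as 'object' rather than 'function'. The name createFreshDocument and the doc parameter are assumptions made for illustration; this is not DOMPurify's actual code.

```javascript
// Hypothetical helper, not taken from the DOMPurify source.
// `doc` is the document to start from (normally the global `document`).
function createFreshDocument(doc) {
  var base = doc;
  var tpl = doc.createElement('template');
  // Engines without <template> support return an element with no `content`.
  if (tpl.content && tpl.content.ownerDocument) {
    base = tpl.content.ownerDocument;
  }
  return base.implementation.createHTMLDocument('');
}

// Browser usage (skipped where no DOM exists, e.g. in Node):
if (typeof document !== 'undefined') {
  var freshdom = createFreshDocument(document);
  console.log(freshdom !== document);
}
```

Passing the document in as a parameter keeps the helper testable with stub objects and avoids a hard dependency on the global.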

from dompurify.

cure53 avatar cure53 commented on July 24, 2024

@asutherland So, this test case, which works in Chrome, pretty much proves that you are 100% right. We can indeed observe that the allegedly fresh DOM is not fresh anymore:

<html>
<body>
<script>
var XFooProto = Object.create(HTMLElement.prototype);
var XFoo = document.registerElement('x-foo', {prototype: XFooProto});
var xfoo = document.createElement('x-foo');

var doc1 = document.implementation.createHTMLDocument('');
doc1.body.innerHTML='<x-foo></x-foo>';
alert(doc1.body.firstElementChild.constructor);
</script>
</body>
</html>

When we create the DOM from a template element, however, we can observe that it is indeed fresh:

<html>
<body>
<script>
var XFooProto = Object.create(HTMLElement.prototype);
var XFoo = document.registerElement('x-foo', {prototype: XFooProto});
var xfoo = document.createElement('x-foo');

var doc2  = document.createElement('template').content.ownerDocument.implementation.createHTMLDocument();
doc2.body.innerHTML='<x-foo></x-foo>';
alert(doc2.body.firstElementChild.constructor)
</script>
</body>
</html>
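
The two pages above rely on alert(), which is hard to automate. A rough sketch of the same comparison as a boolean check follows. The helper name isUpgraded and its baseCtor parameter are assumptions for illustration, and the sketch assumes that an unregistered x-foo element reports the page's HTMLElement as its constructor and that the browser supports the template element.

```javascript
// Hypothetical helper: parses `markup` inside `doc` and reports whether the
// first resulting element has a constructor other than `baseCtor`.
function isUpgraded(doc, markup, baseCtor) {
  doc.body.innerHTML = markup;
  var el = doc.body.firstElementChild;
  return !!el && el.constructor !== baseCtor;
}

// Browser usage (skipped where no DOM exists, e.g. in Node).
// Run after registering x-foo on the main document, as in the pages above.
if (typeof document !== 'undefined') {
  var viaOwner = document.implementation.createHTMLDocument('');
  var viaTemplate = document.createElement('template').content.ownerDocument
      .implementation.createHTMLDocument('');
  console.log('owner-derived doc upgraded:', isUpgraded(viaOwner, '<x-foo></x-foo>', HTMLElement));
  console.log('template-derived doc upgraded:', isUpgraded(viaTemplate, '<x-foo></x-foo>', HTMLElement));
}
```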

from dompurify.

cure53 avatar cure53 commented on July 24, 2024

@fhemberger Changed that, thx! I am ready to merge - any objections?

from dompurify.

fhemberger avatar fhemberger commented on July 24, 2024

@cure53 Fire it up! 🔥

from dompurify.

asutherland avatar asutherland commented on July 24, 2024

Not sure which you meant by future proof, but to clarify my Gecko code-diving expedition, I'm a little concerned that something like this could happen:

document.createElement('template').content.ownerDocument.registerElement(...)

and it will result in the Gecko document doing weird things. However, this assumes 1) an almost malicious intent on the part of other code running in your same execution context and 2) that Gecko doesn't already protect against this, or won't be fixed in the future to prevent it.

In any event, the proposed changes seem great and kudos on aggressively investigating and fixing this! This is one of those "yay open source!" moments I am very happy to be involved in.

from dompurify.

asutherland avatar asutherland commented on July 24, 2024

Clarifying a little further, from the code, I expect that given:

function gimmeTemplateDoc(doc) {
  return doc.createElement('template').content.ownerDocument;
}

then if (doc1 === doc2), it follows that (gimmeTemplateDoc(doc1) === gimmeTemplateDoc(doc2)). (But only for Gecko.)
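
The identity described here can be checked directly. The sketch below is only an illustration; templateDocIsShared and freshDocsAreDistinct are hypothetical names. The HTML Standard's template-contents algorithm appears to describe a similar per-document cache (an "associated inert template document"), so the first check may also hold outside Gecko, but that is an assumption to verify per browser.

```javascript
function gimmeTemplateDoc(doc) {
  return doc.createElement('template').content.ownerDocument;
}

// Hypothetical check: true when two templates created from `doc` share one
// template-contents owner document.
function templateDocIsShared(doc) {
  return gimmeTemplateDoc(doc) === gimmeTemplateDoc(doc);
}

// Hypothetical check: true when each createHTMLDocument('') call made on the
// template document returns a distinct document object.
function freshDocsAreDistinct(doc) {
  var impl = gimmeTemplateDoc(doc).implementation;
  return impl.createHTMLDocument('') !== impl.createHTMLDocument('');
}

// Browser usage (skipped where no DOM exists, e.g. in Node):
if (typeof document !== 'undefined') {
  console.log('template doc shared:', templateDocIsShared(document));
  console.log('fresh docs distinct:', freshDocsAreDistinct(document));
}
```

A shared template document is why anything registered on it could leak between callers, while the documents created from it are separate objects on every call.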

from dompurify.

cure53 avatar cure53 commented on July 24, 2024

@asutherland But wouldn't that mean that this would have to evaluate to true to be a risk?

document.createElement('template').content.ownerDocument.
    implementation.createHTMLDocument('') 
=== 
document.createElement('template').content.ownerDocument.
    implementation.createHTMLDocument('')

It doesn't right now, and I am quite happy about that ;) By "future-safe" I meant being safe once web components are mature enough to enjoy large-scale browser support and app adoption. That is not the case yet, but it will be.

Or, a different test case (if I didn't get it wrong):

<html>
<body>
<script>
// create a doc from template, then register an element in that
var XFooProto = Object.create(HTMLElement.prototype);
var doc1  = document.createElement('template').content.ownerDocument.implementation.createHTMLDocument();
var XFoo = doc1.registerElement('x-foo', {prototype: XFooProto});
var xfoo = doc1.createElement('x-foo');

// now again create a doc from template, check if it inherited...
var doc2  = document.createElement('template').content.ownerDocument.implementation.createHTMLDocument();
doc2.body.innerHTML='<x-foo></x-foo>';
alert(doc2.body.firstElementChild.constructor)
</script>
</body>
</html>

And thanks for the kudos :)

from dompurify.

cure53 avatar cure53 commented on July 24, 2024

If no one objects, I'll close this as fixed.

from dompurify.

asutherland avatar asutherland commented on July 24, 2024

I think I somehow missed your pre-merge comment. Apologies. Thankfully I'm horrible at closing tabs!

The specific badness I was thinking of is if you take your different test case and alter it so doc1 is just:

var doc1 = document.createElement('template').content.ownerDocument;

losing the implementation.createHTMLDocument(); part of it. But at least on current Firefox nightly, calling registerElement throws NotSupportedError on both documents (both on the plain content.ownerDocument of the template and on another dynamic doc created with ownerDocument.implementation.createHTMLDocument()). So it seems safe right now. And the malicious shenanigans required for this would need to be happening inside your trust boundary, which I don't think is part of your threat model (or most reasonable threat models).
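
The behavior described above can be probed without crashing the page. This is a hedged sketch: probeRegisterElement and its return strings are made up for illustration, and document.registerElement is the legacy API used throughout this thread, which current browsers generally no longer expose.

```javascript
// Hypothetical probe: reports how `doc` reacts to a registerElement call.
// Returns 'unsupported' when the method is absent, 'threw:<error name>' when
// the call throws, and 'registered' when it succeeds.
// Caution: a successful call really registers `tagName` on `doc`.
function probeRegisterElement(doc, tagName, proto) {
  if (typeof doc.registerElement !== 'function') {
    return 'unsupported';
  }
  try {
    doc.registerElement(tagName, { prototype: proto });
    return 'registered';
  } catch (e) {
    return 'threw:' + e.name;
  }
}

// Browser usage (skipped where no DOM exists, e.g. in Node):
if (typeof document !== 'undefined') {
  var tplDoc = document.createElement('template').content.ownerDocument;
  var proto = Object.create(HTMLElement.prototype);
  console.log(probeRegisterElement(tplDoc, 'x-probe', proto));
}
```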

from dompurify.

cure53 avatar cure53 commented on July 24, 2024

Great, thanks :)

from dompurify.

neilj avatar neilj commented on July 24, 2024

Sorry if I've misunderstood something, but looking at the code I don't think this fix is correct. You are only getting the fresh document (via the template tag) when something has clobbered the body attribute, and it is only used to get a fresh copy of getElementsByTagName. But if I've understood the issue correctly, we should be using the fresh document as the creator of our new document to parse the dirty markup, so that the new document doesn't inherit the element registry.

Is that right? If so, I'll make a pull request to fix this up.
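
What this proposal amounts to can be sketched roughly as follows. The function name parseInFreshDocument and its parameters are assumptions for illustration only; this is not the code that was merged into DOMPurify, and it only covers parsing, not the sanitization pass that would follow.

```javascript
// Hypothetical sketch: parse untrusted markup in a document created from the
// template's owner document, so the page's element registry is not inherited.
// Falls back to `doc` itself when <template> is not supported.
function parseInFreshDocument(dirty, doc) {
  var tpl = doc.createElement('template');
  var base = (tpl.content && tpl.content.ownerDocument) || doc;
  var fresh = base.implementation.createHTMLDocument('');
  fresh.body.innerHTML = dirty;
  return fresh.body;
}

// Browser usage (skipped where no DOM exists, e.g. in Node):
if (typeof document !== 'undefined') {
  var body = parseInFreshDocument('<x-foo></x-foo><p>hi</p>', document);
  console.log(body.childNodes.length);
}
```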

from dompurify.
