
Comments (8)

UltCombo commented on July 24, 2024

Overriding or extending _sanitizeElements doesn't seem like a good approach, IMO.

Firstly, because replacing it with our own code would not be future-proof: it would break when upgrading to a newer DOMPurify version in which that function handles more options or edge cases.

Secondly, extending it like:

var _sanitizeElements = DOMPurify._sanitizeElements;
DOMPurify._sanitizeElements = function(currentNode) {
    //do stuff before _sanitizeElements
    var ret = _sanitizeElements.apply(this, arguments);
    //do stuff after _sanitizeElements
    return ret;
};

is not useful for the use case of keeping the text nodes of removed elements, because the node removal happens inside that method.

Maybe if you abstract this node removal into a _removeNode API that we can override, we would be able to move the child nodes out of the element being removed. I'm not sure how well that fits with the rest of the library, though; it would be more future-proof to have a tested option for that behavior.
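To make the idea of moving the children out concrete, here is a minimal sketch using only standard DOM methods. The name unwrapNode is hypothetical and is not part of DOMPurify; it only illustrates what an overridable removal hook might do, and it performs no sanitizing of the children it keeps.

```javascript
// Hypothetical helper, NOT a DOMPurify API: remove `node` from the tree
// but leave its children where the node used to be.
function unwrapNode(node) {
    var parent = node.parentNode;
    if (!parent) {
        return false; // detached node: there is nowhere to move the children
    }
    // insertBefore moves a child that is already attached, so
    // node.firstChild advances on every iteration until the node is empty.
    while (node.firstChild) {
        parent.insertBefore(node.firstChild, node);
    }
    parent.removeChild(node);
    return true;
}

// Browser-only usage; skipped where no `document` exists.
if (typeof document !== 'undefined') {
    var div = document.createElement('div');
    div.innerHTML = 'a <span style="color:red">b <b>c</b></span> d';
    unwrapNode(div.querySelector('span'));
    console.log(div.innerHTML); // a b <b>c</b> d
}
```

The children keep their original order because each one is inserted immediately before the node that is about to be removed.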

from dompurify.

cure53 commented on July 24, 2024

Hm, it's not easy. Let's get back to the original requirement: a user wished to remove links but keep the text. In this particular situation, I would recommend simply removing the href attributes, done. This covers anchors wrapping normal text as well as anchors wrapping complex rich text.

Before we dive into creating APIs and tweaking the core: what other use cases could there be? When else would one want to remove the tag but keep the text?
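As a rough illustration of that recommendation, the sketch below strips href from every anchor under a root element using plain DOM calls. The helper name stripHrefs is made up for this example, and it assumes the markup has already been parsed into a DOM tree; it is not how DOMPurify itself handles attributes.

```javascript
// Hypothetical helper: drop href from every anchor under `root`,
// leaving the <a> elements and whatever they wrap in place.
function stripHrefs(root) {
    var anchors = root.querySelectorAll('a[href]');
    for (var i = 0; i < anchors.length; i++) {
        anchors[i].removeAttribute('href');
    }
    return anchors.length; // how many anchors were neutralised
}

// Browser-only usage; skipped where no `document` exists.
if (typeof document !== 'undefined') {
    var box = document.createElement('div');
    box.innerHTML = '<a href="http://example.com">plain</a> <a href="#x"><b>rich</b></a>';
    stripHrefs(box);
    console.log(box.innerHTML); // <a>plain</a> <a><b>rich</b></a>
}
```

The anchors stay in the output as inert elements, which is why rich content nested inside them survives untouched.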


UltCombo commented on July 24, 2024

Well, let's take this GitHub Markdown editor, for instance.

If you input <span style="color:red">text</span>, it simply outputs text.

Among Markdown sanitizers, it is common practice to remove disallowed tags while keeping their text content. Of course, this doesn't apply to the <script> tag.


cure53 commented on July 24, 2024

I created a branch KEEP_CONTENT and started playing with insertAdjacentHTML, which might give us exactly what's wanted when the position is set to AfterEnd and the node removal happens right after the insertion.

Currently, I have to wrap the new code in a try/catch, which I don't really like: on Blink, not all nodes allow insertion AfterEnd, because they require a valid parent node, and if there is none the insertion fails. So it's not optimal yet, but maybe a step in the right direction. Feedback is of course appreciated.
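For readers who want to picture the approach, here is a minimal sketch of insert-then-remove with the try/catch described above. It is not the code from the branch; the function name removeKeepingContent and its return value are assumptions made for illustration. The position string is matched case-insensitively by insertAdjacentHTML, so 'afterend' and 'AfterEnd' behave the same.

```javascript
// Hypothetical sketch, not the branch code: re-insert a node's markup
// right after it, then remove the node itself.
function removeKeepingContent(node) {
    var kept = true;
    try {
        // Parses node.innerHTML and inserts the result as following siblings.
        node.insertAdjacentHTML('afterend', node.innerHTML);
    } catch (e) {
        // Thrown when the node has no suitable parent element;
        // in that case the content is lost along with the node.
        kept = false;
    }
    if (node.parentNode) {
        node.parentNode.removeChild(node);
    }
    return kept; // true if the content was re-inserted
}

// Browser-only usage; skipped where no `document` exists.
if (typeof document !== 'undefined') {
    var wrap = document.createElement('div');
    wrap.innerHTML = 'see <a href="#x">this <i>link</i></a> here';
    removeKeepingContent(wrap.querySelector('a'));
    console.log(wrap.innerHTML); // see this <i>link</i> here
}
```

Because the re-inserted nodes land after the removed one, a sanitizer that walks the tree forward would presumably still reach them and check them in turn; the sketch itself does no checking.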

In this branch, for quick testing purposes, I removed <a> from the list of permitted elements and set KEEP_CONTENT to true.
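A caller-side sketch of the same test setup follows. It assumes the option is passed through the sanitize config object, with FORBID_TAGS standing in for editing the list of permitted elements by hand; both names are documented config keys in released DOMPurify versions, but whether this matches the branch at the time is an assumption. The wrapper stripLinksKeepText is hypothetical and takes the DOMPurify instance as a parameter so the snippet does not depend on a global.

```javascript
// Hypothetical wrapper: forbid <a> but ask the sanitizer to keep its content.
// `purify` is expected to be a DOMPurify instance exposing sanitize(dirty, config).
function stripLinksKeepText(purify, dirty) {
    return purify.sanitize(dirty, {
        FORBID_TAGS: ['a'],   // assumption: equivalent to removing <a> from the allowed list
        KEEP_CONTENT: true    // keep the content of removed elements
    });
}

// Usage where DOMPurify has been loaded as a global.
if (typeof DOMPurify !== 'undefined') {
    console.log(stripLinksKeepText(DOMPurify, '<a href="http://example.com">text</a>'));
}
```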


UltCombo commented on July 24, 2024

Oh nice work!

I'm working on some killer deadlines and I'm not as experienced in HTML sanitizing/browser DOM quirks as you guys, so I can't really make meaningful contributions at the moment.


cure53 commented on July 24, 2024

The tests are green so far, so I am optimistic. I'll merge later on and close the ticket. Thx :)


mathiasbynens commented on July 24, 2024

Minor bikeshed: IMHO PRESERVE_CONTENT is a better name than KEEP_CONTENT.


cure53 commented on July 24, 2024

Denied, KEEP_CONTENT says the same and is shorter :)

