Comments (13)

mozfreddyb commented on July 4, 2024

If I were to write a customization on top of DOMPurify, I'd like to register callbacks (the JS name for hooks :P) for events before and after the filtering of each item (attribute, tag, entity).

This kinda matches what your proposal says, if I understand you correctly.

from dompurify.

arbixy commented on July 4, 2024

It would be good if the current sanitize configuration were provided to these event callbacks as a parameter, closure, or getter, so that the user could modify it during the process to achieve better results before the next hooked stage, or for any API helper tools whatsoever.

cure53 commented on July 4, 2024

@mozfreddyb True, that's what I meant.

@arbixy Yes, that would allow you to flatten, re-configure per node, enforce tag-attribute whitelists, etc.

arbixy commented on July 4, 2024

Perhaps some query tools would be a great addition, too.
Or an API for the ones used in the library itself.
I mean on a lower level than the current one.

cure53 commented on July 4, 2024

@arbixy What do you mean by query tools?

@ALL Now, let's do a quick dry run of what it could look like. Let's assume we want to install a hook/callback in _initDocument():

Without

            /* Exit directly if we have nothing to do */
            if (typeof dirty === 'string' && dirty.indexOf('<') === -1) { 
                return dirty; 
            }

            /* Create documents to map markup to */
            var dom = document.implementation.createHTMLDocument('');
                dom.body.parentNode.removeChild(dom.body.parentNode.firstElementChild);
                dom.body.outerHTML = dirty;

            /* Cover IE9's buggy outerHTML behavior */
            if(dom.body === null) {
                dom = document.implementation.createHTMLDocument('');
                dom.body.innerHTML = dirty;
                if(dom.body.firstChild && dom.body.firstChild.nodeName
                    && !WHOLE_DOCUMENT
                    && dom.body.firstChild.nodeName === 'STYLE'){
                    dom.body.removeChild(dom.body.firstChild);
                }
            }

With Callback

            /* Exit directly if we have nothing to do */
            if (typeof dirty === 'string' && dirty.indexOf('<') === -1) { 
                return dirty; 
            }

            /* Create documents to map markup to */
            var dom = document.implementation.createHTMLDocument('');
                dom.body.parentNode.removeChild(dom.body.parentNode.firstElementChild);
                dom.body.outerHTML = dirty;

            /* NEWNEWNEW Basic Callback NEWNEWNEW */
            if(cfg.CALLBACKS['beforeFillDocument']){
                [dirty, dom, cfg] = cfg.CALLBACKS['beforeFillDocument'](dirty, dom, cfg)
            }  

            /* Cover IE9's buggy outerHTML behavior */
            if(dom.body === null) {
                dom = document.implementation.createHTMLDocument('');
                dom.body.innerHTML = dirty;
                if(dom.body.firstChild && dom.body.firstChild.nodeName
                    && !WHOLE_DOCUMENT
                    && dom.body.firstChild.nodeName === 'STYLE'){
                    dom.body.removeChild(dom.body.firstChild);
                }
            }

Of course we should have a dedicated callback loader that verifies that it's a function, has the proper length (number of arguments) and whatnot. And for compatibility's sake, maybe no destructuring assignment etc. But the basic mechanism looks okay? Function objects stored in the config object and a direct call inside the DOMPurify methods?
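To make the loader idea a bit more concrete, here is a minimal sketch of how such a check could look. The names `VALID_CALLBACKS` and `loadCallback` are assumptions made up for illustration and are not part of DOMPurify; only `cfg.CALLBACKS` and `beforeFillDocument` come from the dry run above. It also avoids destructuring assignment by unpacking the returned array by index.

```javascript
// Hypothetical whitelist of callback names (assumption, not DOMPurify API)
var VALID_CALLBACKS = ['beforeFillDocument'];

// Hypothetical loader: returns the callback only if it is a known name,
// a function, and declares the expected number of arguments; else null.
function loadCallback(cfg, name, expectedLength) {
    var callbacks = (cfg && cfg.CALLBACKS) || {};
    var fn = callbacks[name];

    if (VALID_CALLBACKS.indexOf(name) === -1) { return null; }
    if (typeof fn !== 'function') { return null; }
    if (fn.length !== expectedLength) { return null; }
    return fn;
}

// Usage: a callback that trims the dirty string before it is processed
var cfg = {
    CALLBACKS: {
        beforeFillDocument: function (dirty, dom, cfg) {
            return [dirty.trim(), dom, cfg];
        }
    }
};

var dirty = ' <b>hi</b> ';
var dom = null; // stand-in for the document created in _initDocument()
var callback = loadCallback(cfg, 'beforeFillDocument', 3);

if (callback) {
    // Unpack by index instead of using destructuring assignment
    var result = callback(dirty, dom, cfg);
    dirty = result[0];
    dom = result[1];
    cfg = result[2];
}
```

Returning `null` instead of throwing keeps a misconfigured callback from breaking sanitization, though throwing early might be the better choice for catching mistakes during development.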

fhemberger commented on July 4, 2024

Just a rough idea of how to handle/execute those hooks. Open the console (on this page, for example) and execute this:

var DOMPurify = (function() {
    /* browser:true, devel:true */
    'use strict';

    var DOMPurify = {};
    var hooks = {};


    DOMPurify.sanitize = function(dirty, cfg) {

        var _sanitizeElements = function(currentNode) {
            currentNode = _executeHook('beforeSantitizeElements', currentNode);

            // do the magic
            console.log('_sanitizeElements', currentNode);

            currentNode = _executeHook('afterSantitizeElements', currentNode);
        };


        var _executeHook = function(entryPoint, currentNode) {
            var modifiedNode;

            console.log('_executeHook:%s', entryPoint);
            // Fall back to an empty list so entry points without hooks don't throw
            (hooks[entryPoint] || []).forEach(function(hook) {
                modifiedNode = hook.call(DOMPurify, currentNode, cfg);

                // Keep the previous node if a hook forgets to return one
                if (!modifiedNode) { return console.error('Hook for "' + entryPoint + '" didn\'t return a node. Skipping.'); }
                currentNode = modifiedNode;
            });

            return currentNode;
        };

        // Dummy execution
        _sanitizeElements( document.getElementsByTagName('body')[0] );
    };


    DOMPurify.addHook = function(entryPoint, hookFunction) {

        hooks[entryPoint] = hooks[entryPoint] || [];
        hooks[entryPoint].push(hookFunction);
    };

    return DOMPurify;
})();


// --- Test code ---
DOMPurify.addHook('beforeSantitizeElements', function(currentNode, config) {
    console.log('before:function 1');
    return currentNode;
});

DOMPurify.addHook('beforeSantitizeElements', function(currentNode, config) {
    console.log('before:function 2');
});

DOMPurify.addHook('afterSantitizeElements', function(currentNode, config) {
    console.log('after:function 1');
    return currentNode;
});

DOMPurify.sanitize();

Additionally, you could add a validation step in addHook to check if the argument passed is in an array of valid entry points and throw an error if not. We should also try to have a consistent signature for the hook function if possible (some hooks might execute on the raw string, some on the current node).
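A minimal sketch of that validation step could look like the following. `VALID_ENTRY_POINTS` is a hypothetical list made up for illustration; the two entry point names simply reuse the spelling from the prototype above, and the function is written standalone rather than as a DOMPurify method.

```javascript
// Hypothetical list of allowed entry points (assumption for illustration)
var VALID_ENTRY_POINTS = ['beforeSantitizeElements', 'afterSantitizeElements'];
var hooks = {};

function addHook(entryPoint, hookFunction) {
    // Reject entry points that are not in the list of valid ones
    if (VALID_ENTRY_POINTS.indexOf(entryPoint) === -1) {
        throw new Error('Unknown entry point "' + entryPoint + '"');
    }
    // Reject anything that cannot be called later
    if (typeof hookFunction !== 'function') {
        throw new TypeError('Hook for "' + entryPoint + '" must be a function');
    }

    hooks[entryPoint] = hooks[entryPoint] || [];
    hooks[entryPoint].push(hookFunction);
}

// Usage: a valid registration, followed by a misspelled entry point
addHook('beforeSantitizeElements', function (currentNode, config) {
    return currentNode;
});

var rejected = false;
try {
    addHook('beforeEverything', function (currentNode, config) {
        return currentNode;
    });
} catch (e) {
    rejected = true; // the unknown entry point was refused
}
```

Throwing at registration time surfaces a misspelled entry point immediately, instead of leaving a hook that silently never runs.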

cure53 commented on July 4, 2024

Makes perfect sense, yes! Shall I add a branch for that? I think the earlier we have something to hack on the better. And the general idea seems to be robust enough to start playing.

fhemberger commented on July 4, 2024

Sure, go ahead. ;)

cure53 commented on July 4, 2024

Created the HOOK_API branch.

fhemberger commented on July 4, 2024

I just added my basic hook handling to it.

cure53 commented on July 4, 2024

See fbe0fa46c6cd042e3807de2661410e6d517036b0

Nice, thanks! I made some changes and added a bunch of demo cases to see how it could work. I also decided to rename the demo folder, as the name was misleading.

Feedback very welcome. There's also a first hook - doing nothing but capitalizing text node content for demo purposes.
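For readers without the branch at hand, a hook of that kind might look roughly like this. This is only a sketch and not the code from the commit: the name `uppercaseTextHook` and the `(currentNode, config)` signature are assumptions borrowed from the prototype earlier in this thread, and plain objects stand in for DOM nodes so the snippet runs outside a browser.

```javascript
var TEXT_NODE = 3; // same numeric value as Node.TEXT_NODE in the DOM

// Hypothetical demo hook: upper-cases text nodes, leaves other nodes alone
function uppercaseTextHook(currentNode, config) {
    if (currentNode && currentNode.nodeType === TEXT_NODE) {
        currentNode.textContent = currentNode.textContent.toUpperCase();
    }
    // Always return the node so the next hook receives it
    return currentNode;
}

// Plain-object stand-ins for a text node and an element node
var fakeTextNode = { nodeType: 3, textContent: 'hello' };
var fakeElement = { nodeType: 1, textContent: 'untouched' };

uppercaseTextHook(fakeTextNode, {});
uppercaseTextHook(fakeElement, {});
```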

cure53 commented on July 4, 2024

I am mostly done with the rest of the hook implementation. I also added support for two new config flags and I think we are pretty much ready for a new release. What do you think?

1246610ec2431b8febc80f0096140acbef27eefd

fhemberger commented on July 4, 2024

Cool, LGTM! 👍
