
DOMPurify - a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. DOMPurify works with a secure default, but offers a lot of configurability and hooks.

Home Page: https://cure53.de/purify

License: Other


dompurify's Introduction

DOMPurify


DOMPurify is a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG.

It's also very simple to use and get started with. DOMPurify was started in February 2014 and, meanwhile, has reached version v3.1.5.

DOMPurify is written in JavaScript and works in all modern browsers (Safari (10+), Opera (15+), Edge, Firefox and Chrome - as well as almost anything else using Blink, Gecko or WebKit). It doesn't break on MSIE or other legacy browsers. It simply does nothing.

Note that DOMPurify v2.5.5 is the latest version supporting MSIE. For important security updates compatible with MSIE, please use the 2.x branch.

Our automated tests cover 19 different browsers right now, more to come. We also cover Node.js v16.x, v17.x, v18.x and v19.x, running DOMPurify on jsdom. Older Node versions are known to work as well, but hey... no guarantees.

DOMPurify is written by security people who have a vast background in web attacks and XSS. Fear not. For more details please also read about our Security Goals & Threat Model. Please, read it. Like, really.

What does it do?

DOMPurify sanitizes HTML and prevents XSS attacks. You can feed DOMPurify a string full of dirty HTML and it will return a string (unless configured otherwise) with clean HTML. DOMPurify will strip out everything that contains dangerous HTML and thereby prevent XSS attacks and other nastiness. It's also damn bloody fast. We use the technologies the browser provides and turn them into an XSS filter. The faster your browser, the faster DOMPurify will be.

How do I use it?

It's easy. Just include DOMPurify on your website.

Using the unminified development version

<script type="text/javascript" src="src/purify.js"></script>

Using the minified and tested production version (source-map available)

<script type="text/javascript" src="dist/purify.min.js"></script>

Afterwards you can sanitize strings by executing the following code:

const clean = DOMPurify.sanitize(dirty);

Or maybe this, if you love working with Angular or the like:

import DOMPurify from 'dompurify';

const clean = DOMPurify.sanitize('<b>hello there</b>');

The resulting HTML can be written into a DOM element using innerHTML or the DOM using document.write(). That is fully up to you. Note that by default, we permit HTML, SVG and MathML. If you only need HTML, which might be a very common use-case, you can easily set that up as well:

const clean = DOMPurify.sanitize(dirty, { USE_PROFILES: { html: true } });

Where are the TypeScript type definitions?

They can be found here: @types/dompurify

Is there any foot-gun potential?

Well, please note, if you first sanitize HTML and then modify it afterwards, you might easily void the effects of sanitization. If you feed the sanitized markup to another library after sanitization, please be certain that the library doesn't mess around with the HTML on its own.
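One way to reduce this risk is to make sanitization the very last step before the markup is used. The sketch below assumes a hypothetical helper named sanitizeLast (not part of DOMPurify) that applies any string transforms first and only then calls sanitize on the result. The purifier argument is assumed to be a DOMPurify instance or anything exposing a compatible sanitize method.

```javascript
// Hypothetical helper, not part of DOMPurify: run every transform on the
// raw markup first, then sanitize once at the very end.
function sanitizeLast(purifier, dirty, transforms = []) {
  // Apply each transform in order to the still-untrusted string.
  const transformed = transforms.reduce((html, fn) => fn(html), dirty);
  // Sanitization is the final step, so nothing modifies the clean output.
  return purifier.sanitize(transformed);
}

// Usage in a page that has loaded DOMPurify (addFooter is a made-up transform):
//   const clean = sanitizeLast(DOMPurify, dirty, [addFooter]);
```

The point of the design is ordering only: whatever the transforms do, their output still passes through the sanitizer before it reaches the DOM.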

Okay, makes sense, let's move on

After sanitizing your markup, you can also have a look at the property DOMPurify.removed and find out, what elements and attributes were thrown out. Please do not use this property for making any security critical decisions. This is just a little helper for curious minds.
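As a rough illustration of how this property might be inspected, the sketch below counts removed elements and attributes. The helper name summarizeRemoved is hypothetical, and the entry shape (an element key for removed elements, an attribute key for removed attributes) is an assumption about what DOMPurify.removed contains, so verify it against your version. As stated above, this is for debugging only, not for security decisions.

```javascript
// Hypothetical debugging helper, not part of DOMPurify.
// Assumption: each entry carries either an `element` key or an `attribute` key.
function summarizeRemoved(removed) {
  const summary = { elements: 0, attributes: 0 };
  for (const entry of removed) {
    if (entry.element) {
      summary.elements += 1;
    } else if (entry.attribute) {
      summary.attributes += 1;
    }
  }
  return summary;
}

// Usage in a page that has loaded DOMPurify:
//   DOMPurify.sanitize(dirty);
//   console.log(summarizeRemoved(DOMPurify.removed));
```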

Running DOMPurify on the server

DOMPurify technically also works server-side with Node.js. Our support strives to follow the Node.js release cycle.

Running DOMPurify on the server requires a DOM to be present, which is probably no surprise. Usually, jsdom is the tool of choice and we strongly recommend using the latest version of jsdom.

Why? Because older versions of jsdom are known to be buggy in ways that result in XSS even if DOMPurify does everything 100% correctly. There are known attack vectors in, e.g., jsdom v19.0.0 that are fixed in jsdom v20.0.0, and we really recommend keeping jsdom up to date because of that.

Please also be aware that tools like happy-dom exist but are not considered safe at this point. Combining DOMPurify with happy-dom is currently not recommended and will likely lead to XSS.

Other than that, you are fine to use DOMPurify on the server. Probably. This really depends on jsdom or whatever DOM you utilize server-side. If you can live with that, this is how you get it to work:

npm install dompurify
npm install jsdom

For jsdom (please use an up-to-date version), this should do the trick:

const createDOMPurify = require('dompurify');
const { JSDOM } = require('jsdom');

const window = new JSDOM('').window;
const DOMPurify = createDOMPurify(window);
const clean = DOMPurify.sanitize('<b>hello there</b>');

Or even this, if you prefer working with imports:

import { JSDOM } from 'jsdom';
import DOMPurify from 'dompurify';

const window = new JSDOM('').window;
const purify = DOMPurify(window);
const clean = purify.sanitize('<b>hello there</b>');

If you have problems making it work in your specific setup, consider looking at the amazing isomorphic-dompurify project which solves lots of problems people might run into.

npm install isomorphic-dompurify

import DOMPurify from 'isomorphic-dompurify';

const clean = DOMPurify.sanitize('<s>hello</s>');

Is there a demo?

Of course there is a demo! Play with DOMPurify

What if I find a security bug?

First of all, please immediately contact us via email so we can work on a fix. PGP key

Also, you probably qualify for a bug bounty! The fine folks over at Fastmail use DOMPurify for their services and added our library to their bug bounty scope. So, if you find a way to bypass or weaken DOMPurify, please also have a look at their website and the bug bounty info.

Some purification samples please?

What does purified markup look like? Well, the demo shows it for a big bunch of nasty elements. But let's also show some smaller examples!

DOMPurify.sanitize('<img src=x onerror=alert(1)//>'); // becomes <img src="x">
DOMPurify.sanitize('<svg><g/onload=alert(2)//<p>'); // becomes <svg><g></g></svg>
DOMPurify.sanitize('<p>abc<iframe//src=jAva&Tab;script:alert(3)>def</p>'); // becomes <p>abc</p>
DOMPurify.sanitize('<math><mi//xlink:href="data:x,<script>alert(4)</script>">'); // becomes <math><mi></mi></math>
DOMPurify.sanitize('<TABLE><tr><td>HELLO</tr></TABL>'); // becomes <table><tbody><tr><td>HELLO</td></tr></tbody></table>
DOMPurify.sanitize('<UL><li><A HREF=//google.com>click</UL>'); // becomes <ul><li><a href="//google.com">click</a></li></ul>

What is supported?

DOMPurify currently supports HTML5, SVG and MathML. By default, DOMPurify allows CSS and HTML custom data attributes. DOMPurify also supports the Shadow DOM and sanitizes DOM templates recursively. DOMPurify also allows you to sanitize HTML for use with the jQuery $() and elm.html() API without any known problems.

What about legacy browsers like Internet Explorer?

DOMPurify does nothing at all. It simply returns exactly the string that you fed it. DOMPurify exposes a property called isSupported, which tells you whether it will be able to do its job, so you can come up with your own backup plan.
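One possible backup plan is to escape the input entirely whenever isSupported is false. The sketch below is only an illustration of that idea. The helpers escapeHtml and sanitizeOrEscape are hypothetical names, not part of DOMPurify, and escaping everything is a deliberately conservative choice that displays the markup as text rather than rendering it.

```javascript
// Hypothetical fallback: turn every HTML-significant character into an entity.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Sanitize when the environment is supported, otherwise escape everything.
// `purifier` is assumed to expose `isSupported` and `sanitize` like DOMPurify.
function sanitizeOrEscape(purifier, dirty) {
  return purifier.isSupported ? purifier.sanitize(dirty) : escapeHtml(dirty);
}

// Usage in a page that has loaded DOMPurify:
//   element.innerHTML = sanitizeOrEscape(DOMPurify, dirty);
```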

What about DOMPurify and Trusted Types?

In version 1.0.9, support for Trusted Types API was added to DOMPurify. In version 2.0.0, a config flag was added to control DOMPurify's behavior regarding this.

When DOMPurify.sanitize is used in an environment where the Trusted Types API is available and RETURN_TRUSTED_TYPE is set to true, it tries to return a TrustedHTML value instead of a string (the behavior for RETURN_DOM and RETURN_DOM_FRAGMENT config options does not change).

Note that in order to create a policy in trustedTypes using DOMPurify, RETURN_TRUSTED_TYPE: false is required, as createHTML expects a normal string, not TrustedHTML. The example below shows this.

window.trustedTypes!.createPolicy('default', {
  createHTML: (to_escape) =>
    DOMPurify.sanitize(to_escape, { RETURN_TRUSTED_TYPE: false }),
});

Can I configure DOMPurify?

Yes. The included default configuration values are pretty good already - but you can of course override them. Check out the /demos folder to see a bunch of examples on how you can customize DOMPurify.

General settings

// strip {{ ... }}, ${ ... } and <% ... %> to make output safe for template systems
// be careful please, this mode is not recommended for production usage.
// allowing template parsing in user-controlled HTML is not advised at all.
// only use this mode if there is really no alternative.
const clean = DOMPurify.sanitize(dirty, {SAFE_FOR_TEMPLATES: true});


// change how e.g. comments containing risky HTML characters are treated.
const clean = DOMPurify.sanitize(dirty, {SAFE_FOR_XML: false});

Control our allow-lists and block-lists

// allow only <b> elements, very strict
const clean = DOMPurify.sanitize(dirty, {ALLOWED_TAGS: ['b']});

// allow only <b> and <q> with style attributes
const clean = DOMPurify.sanitize(dirty, {ALLOWED_TAGS: ['b', 'q'], ALLOWED_ATTR: ['style']});

// allow all safe HTML elements but neither SVG nor MathML
// note that the USE_PROFILES setting will override the ALLOWED_TAGS setting
// so don't use them together
const clean = DOMPurify.sanitize(dirty, {USE_PROFILES: {html: true}});

// allow all safe SVG elements and SVG Filters, no HTML or MathML
const clean = DOMPurify.sanitize(dirty, {USE_PROFILES: {svg: true, svgFilters: true}});

// allow all safe MathML elements and SVG, but no SVG Filters
const clean = DOMPurify.sanitize(dirty, {USE_PROFILES: {mathMl: true, svg: true}});

// change the default namespace from HTML to something different
const clean = DOMPurify.sanitize(dirty, {NAMESPACE: 'http://www.w3.org/2000/svg'});

// leave all safe HTML as it is and add <style> elements to block-list
const clean = DOMPurify.sanitize(dirty, {FORBID_TAGS: ['style']});

// leave all safe HTML as it is and add style attributes to block-list
const clean = DOMPurify.sanitize(dirty, {FORBID_ATTR: ['style']});

// extend the existing array of allowed tags and add <my-tag> to allow-list
const clean = DOMPurify.sanitize(dirty, {ADD_TAGS: ['my-tag']});

// extend the existing array of allowed attributes and add my-attr to allow-list
const clean = DOMPurify.sanitize(dirty, {ADD_ATTR: ['my-attr']});

// prohibit ARIA attributes, leave other safe HTML as is (default is true)
const clean = DOMPurify.sanitize(dirty, {ALLOW_ARIA_ATTR: false});

// prohibit HTML5 data attributes, leave other safe HTML as is (default is true)
const clean = DOMPurify.sanitize(dirty, {ALLOW_DATA_ATTR: false});

Control behavior relating to Custom Elements

// DOMPurify allows you to define rules for Custom Elements. When using the CUSTOM_ELEMENT_HANDLING
// literal, it is possible to define exactly what elements you wish to allow (by default, none are allowed).
//
// The same goes for their attributes. By default, the built-in or configured allow-list is used.
//
// You can use a RegExp literal to specify what is allowed or a predicate, examples for both can be seen below.
// The default values are very restrictive to prevent accidental XSS bypasses. Handle with great care!

const clean = DOMPurify.sanitize(
    '<foo-bar baz="foobar" forbidden="true"></foo-bar><div is="foo-baz"></div>',
    {
        CUSTOM_ELEMENT_HANDLING: {
            tagNameCheck: null, // no custom elements are allowed
            attributeNameCheck: null, // default / standard attribute allow-list is used
            allowCustomizedBuiltInElements: false, // no customized built-ins allowed
        },
    }
); // <div is=""></div>

const clean = DOMPurify.sanitize(
    '<foo-bar baz="foobar" forbidden="true"></foo-bar><div is="foo-baz"></div>',
    {
        CUSTOM_ELEMENT_HANDLING: {
            tagNameCheck: /^foo-/, // allow all tags starting with "foo-"
            attributeNameCheck: /baz/, // allow all attributes containing "baz"
            allowCustomizedBuiltInElements: true, // customized built-ins are allowed
        },
    }
); // <foo-bar baz="foobar"></foo-bar><div is="foo-baz"></div>

const clean = DOMPurify.sanitize(
    '<foo-bar baz="foobar" forbidden="true"></foo-bar><div is="foo-baz"></div>',
    {
        CUSTOM_ELEMENT_HANDLING: {
            tagNameCheck: (tagName) => tagName.match(/^foo-/), // allow all tags starting with "foo-"
            attributeNameCheck: (attr) => attr.match(/baz/), // allow all containing "baz"
            allowCustomizedBuiltInElements: true, // allow customized built-ins
        },
    }
); // <foo-bar baz="foobar"></foo-bar><div is="foo-baz"></div>
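The RegExp and predicate forms above express the same kind of check. To try a rule against sample names before handing it to DOMPurify, something like the following may help. The helper passesCheck is hypothetical and only mirrors the behavior described in the comments above (null allows nothing extra). It is not DOMPurify's internal implementation.

```javascript
// Hypothetical helper for trying out tagNameCheck / attributeNameCheck values.
function passesCheck(check, name) {
  if (check instanceof RegExp) {
    return check.test(name);
  }
  if (typeof check === 'function') {
    // Predicates such as (n) => n.match(/^foo-/) return an array or null,
    // so coerce the result to a boolean.
    return Boolean(check(name));
  }
  // null (or anything else): nothing beyond the defaults is allowed.
  return false;
}

console.log(passesCheck(/^foo-/, 'foo-bar'));
console.log(passesCheck((attr) => attr.match(/baz/), 'forbidden'));
```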

Control behavior relating to URI values

// extend the existing array of elements that can use Data URIs
const clean = DOMPurify.sanitize(dirty, {ADD_DATA_URI_TAGS: ['a', 'area']});

// extend the existing array of elements that are safe for URI-like values (be careful, XSS risk)
const clean = DOMPurify.sanitize(dirty, {ADD_URI_SAFE_ATTR: ['my-attr']});

Control permitted attribute values

// allow external protocol handlers in URL attributes (default is false, be careful, XSS risk)
// by default only http, https, ftp, ftps, tel, mailto, callto, sms, cid and xmpp are allowed.
const clean = DOMPurify.sanitize(dirty, {ALLOW_UNKNOWN_PROTOCOLS: true});

// allow specific protocols handlers in URL attributes via regex (default is false, be careful, XSS risk)
// by default only http, https, ftp, ftps, tel, mailto, callto, sms, cid and xmpp are allowed.
// Default RegExp: /^(?:(?:(?:f|ht)tps?|mailto|tel|callto|sms|cid|xmpp):|[^a-z]|[a-z+.\-]+(?:[^a-z+.\-:]|$))/i;
const clean = DOMPurify.sanitize(dirty, {ALLOWED_URI_REGEXP: /^(?:(?:(?:f|ht)tps?|mailto|tel|callto|sms|cid|xmpp|xxx):|[^a-z]|[a-z+.\-]+(?:[^a-z+.\-:]|$))/i});
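To see what such an expression accepts, it can be exercised on its own. The sketch below copies the default and the extended RegExp from the comments above and tests a few sample URIs. The helper isAllowedUri and the sample values are made up for illustration, and this only checks the RegExp in isolation. DOMPurify may apply further checks that are not modeled here.

```javascript
// Default expression as quoted above, and the variant that also allows "xxx:".
const DEFAULT_URI_REGEXP =
  /^(?:(?:(?:f|ht)tps?|mailto|tel|callto|sms|cid|xmpp):|[^a-z]|[a-z+.\-]+(?:[^a-z+.\-:]|$))/i;
const EXTENDED_URI_REGEXP =
  /^(?:(?:(?:f|ht)tps?|mailto|tel|callto|sms|cid|xmpp|xxx):|[^a-z]|[a-z+.\-]+(?:[^a-z+.\-:]|$))/i;

// Hypothetical helper: does the expression accept this URI value?
function isAllowedUri(regexp, uri) {
  return regexp.test(uri);
}

const samples = [
  'https://example.com/',
  'mailto:someone@example.com',
  '/relative/path',
  'javascript:alert(1)',
  'xxx:custom',
];

for (const uri of samples) {
  console.log(uri, isAllowedUri(DEFAULT_URI_REGEXP, uri), isAllowedUri(EXTENDED_URI_REGEXP, uri));
}
```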

Influence the return-type

// return a DOM HTMLBodyElement instead of an HTML string (default is false)
const clean = DOMPurify.sanitize(dirty, {RETURN_DOM: true});

// return a DOM DocumentFragment instead of an HTML string (default is false)
const clean = DOMPurify.sanitize(dirty, {RETURN_DOM_FRAGMENT: true});

// use the RETURN_TRUSTED_TYPE flag to turn on Trusted Types support if available
const clean = DOMPurify.sanitize(dirty, {RETURN_TRUSTED_TYPE: true}); // will return a TrustedHTML object instead of a string if possible

// use a provided Trusted Types policy
const clean = DOMPurify.sanitize(dirty, {
    // supplied policy must define createHTML and createScriptURL
    // (createPolicy requires a policy name; 'my-policy' is only a placeholder)
    TRUSTED_TYPES_POLICY: trustedTypes.createPolicy('my-policy', {
        createHTML(s) { return s; },
        createScriptURL(s) { return s; },
    }),
});

Influence how we sanitize

// return entire document including <html> tags (default is false)
const clean = DOMPurify.sanitize(dirty, {WHOLE_DOCUMENT: true});

// disable DOM Clobbering protection on output (default is true, handle with care, minor XSS risks here)
const clean = DOMPurify.sanitize(dirty, {SANITIZE_DOM: false});

// enforce strict DOM Clobbering protection via namespace isolation (default is false)
// when enabled, isolates the namespace of named properties (i.e., `id` and `name` attributes)
// from JS variables by prefixing them with the string `user-content-`
const clean = DOMPurify.sanitize(dirty, {SANITIZE_NAMED_PROPS: true});

// keep an element's content when the element is removed (default is true)
const clean = DOMPurify.sanitize(dirty, {KEEP_CONTENT: false});

// glue elements like style, script or others to document.body and prevent unintuitive browser behavior in several edge-cases (default is false)
const clean = DOMPurify.sanitize(dirty, {FORCE_BODY: true});

// remove all <a> elements under <p> elements that are removed
const clean = DOMPurify.sanitize(dirty, {FORBID_CONTENTS: ['a'], FORBID_TAGS: ['p']});

// change the parser type so sanitized data is treated as XML and not as HTML, which is the default
const clean = DOMPurify.sanitize(dirty, {PARSER_MEDIA_TYPE: 'application/xhtml+xml'});

Influence where we sanitize

// use the IN_PLACE mode to sanitize a node "in place", which is much faster depending on how you use DOMPurify
const dirty = document.createElement('a');
dirty.setAttribute('href', 'javascript:alert(1)');

const clean = DOMPurify.sanitize(dirty, {IN_PLACE: true}); // see https://github.com/cure53/DOMPurify/issues/288 for more info

There are even more examples here, showing how you can run, customize and configure DOMPurify to fit your needs.

Persistent Configuration

Instead of repeatedly passing the same configuration to DOMPurify.sanitize, you can use the DOMPurify.setConfig method. Your configuration will persist until your next call to DOMPurify.setConfig, or until you invoke DOMPurify.clearConfig to reset it. Remember that there is only one active configuration, which means once it is set, all extra configuration parameters passed to DOMPurify.sanitize are ignored.
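Because only one configuration is active at a time, it can help to scope a persistent configuration to a block of work and reset it afterwards. The sketch below assumes a hypothetical helper named withConfig (not part of DOMPurify) that relies only on the setConfig and clearConfig methods described above. It handles synchronous callbacks only.

```javascript
// Hypothetical helper: apply a persistent config for the duration of a
// synchronous callback, then always reset it, even if the callback throws.
function withConfig(purifier, config, callback) {
  purifier.setConfig(config);
  try {
    return callback(purifier);
  } finally {
    purifier.clearConfig();
  }
}

// Usage in a page that has loaded DOMPurify:
//   const [a, b] = withConfig(DOMPurify, { ALLOWED_TAGS: ['b'] }, (p) => [
//     p.sanitize(dirtyA),
//     p.sanitize(dirtyB),
//   ]);
```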

Hooks

DOMPurify allows you to augment its functionality by attaching one or more functions with the DOMPurify.addHook method to one of the following hooks:

  • beforeSanitizeElements
  • uponSanitizeElement (No 's' - called for every element)
  • afterSanitizeElements
  • beforeSanitizeAttributes
  • uponSanitizeAttribute
  • afterSanitizeAttributes
  • beforeSanitizeShadowDOM
  • uponSanitizeShadowNode
  • afterSanitizeShadowDOM

DOMPurify passes the callback the currently processed DOM node, when needed a literal with verified node and attribute data, and the DOMPurify configuration. Check out the MentalJS hook demo to see how the API can be used nicely.

Example:

DOMPurify.addHook(
  'uponSanitizeAttribute',
  function (currentNode, hookEvent, config) {
    // Do something with the current node
    // You can also mutate hookEvent for the current node (e.g. set hookEvent.forceKeepAttr = true)
    // For hook types other than 'uponSanitizeAttribute', hookEvent is null
  }
);
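As a more concrete illustration, a hook registered on afterSanitizeAttributes could harden links once their attributes have been cleaned. The sketch below is an assumption-laden example, not a recommended default. The function name addLinkHardeningHook is hypothetical, and forcing target and rel on every element that has a target property is a policy choice you may not want.

```javascript
// Hypothetical helper: register a hook that hardens link-like elements.
// `purifier` is assumed to expose addHook like DOMPurify does.
function addLinkHardeningHook(purifier) {
  purifier.addHook('afterSanitizeAttributes', function (node) {
    // Only touch nodes that have a target property (e.g. <a>, <area>, <form>).
    if ('target' in node) {
      node.setAttribute('target', '_blank');
      // Prevent the opened page from reaching back and from seeing the referrer.
      node.setAttribute('rel', 'noopener noreferrer');
    }
  });
}

// Usage in a page that has loaded DOMPurify:
//   addLinkHardeningHook(DOMPurify);
//   const clean = DOMPurify.sanitize('<a href="https://example.com">link</a>');
```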

Continuous Integration

We are currently using Github Actions in combination with BrowserStack. This gives us the possibility to confirm for each and every commit that all is going according to plan in all supported browsers. Check out the build logs here: https://github.com/cure53/DOMPurify/actions

You can further run local tests by executing npm test. The tests work fine with Node.js v0.6.2 and [email protected].

All relevant commits will be signed with the key 0x24BB6BF4 for additional security (since 8th of April 2016).

Development and contributing

Installation (npm i)

We support npm officially. The GitHub Actions workflow is configured to install dependencies using npm. When using a deprecated version of npm, we cannot fully ensure the versions of installed dependencies, which might lead to unanticipated problems.

Scripts

We rely on npm run-scripts for integrating with our tooling infrastructure. We use ESLint as a pre-commit hook to ensure code consistency. Moreover, to ease formatting we use prettier while building the /dist assets happens through rollup.

These are our npm scripts:

  • npm run dev to start building while watching sources for changes
  • npm run test to run our test suite via jsdom and karma
    • test:jsdom to only run tests through jsdom
    • test:karma to only run tests through karma
  • npm run lint to lint the sources using ESLint (via xo)
  • npm run format to format our sources using prettier, which makes it easier to pass ESLint
  • npm run build to build our distribution assets minified and unminified as a UMD module
    • npm run build:umd to only build an unminified UMD module
    • npm run build:umd:min to only build a minified UMD module

Note: all run scripts are triggered via npm run <script>.

There are more npm scripts but they are mainly to integrate with CI or are meant to be "private" for instance to amend build distribution files with every commit.

Security Mailing List

We maintain a mailing list that notifies subscribers whenever a security-critical release of DOMPurify is published. This means that if someone finds a bypass and we fix it with a release (which always happens when a bypass is found), a mail will go out to that list. This usually happens within minutes or a few hours of learning about a bypass. You can subscribe to the list here:

https://lists.ruhr-uni-bochum.de/mailman/listinfo/dompurify-security

Feature releases will not be announced to this list.

Who contributed?

Many people helped and help DOMPurify become what it is and need to be acknowledged here!

hash_kitten ❤️, kevin_mizu ❤️, icesfont ❤️, dcramer 💸, JGraph 💸, baekilda 💸, Healthchecks 💸, Sentry 💸, jarrodldavis 💸, CynegeticIO, ssi02014 ❤️, GrantGryczan, Lowdefy, granlem, oreoshake, tdeekens ❤️, peernohell ❤️, is2ei, SoheilKhodayari, franktopel, NateScarlet, neilj, fhemberger, Joris-van-der-Wel, ydaniv, terjanq, filedescriptor, ConradIrwin, gibson042, choumx, 0xSobky, styfle, koto, tlau88, strugee, oparoz, mathiasbynens, edg2s, dnkolegov, dhardtke, wirehead, thorn0, styu, mozfreddyb, mikesamuel, jorangreef, jimmyhchan, jameydeorio, jameskraus, hyderali, hansottowirtz, hackvertor, freddyb, flavorjones, djfarrelly, devd, camerondunford, buu700, buildog, alabiaga, Vector919, Robbert, GreLI, FuzzySockets, ArtemBernatskyy, @garethheyes, @shafigullin, @mmrupp, @irsdl, ShikariSenpai, ansjdnakjdnajkd, @asutherland, @mathias, @cgvwzq, @robbertatwork, @giutro, @CmdEngineer_, @avr4mit and especially @securitymb ❤️ & @masatokinugawa ❤️

Testing powered by


And last but not least, thanks to BrowserStack Open-Source Program for supporting this project with their services for free and delivering excellent, dedicated and very professional support on top of that.

dompurify's People

Contributors

0xedward, 0xsobky, conradirwin, cure53, dejang, dependabot[bot], fhemberger, filedescriptor, franktopel, gibson042, grantgryczan, is2ei, joris-van-der-wel, kevin-deyoungster, koto, malvoz, mscheele7, natescarlet, neilj, peernohell, pomierski, securitum-mb, ssi02014, styfle, tdeekens, tiny-ben-tran, tlau88, tosmolka, vlad-borodaev, ydaniv

dompurify's Issues

Provide an API to allow hooking into _sanitizeElements and _sanitizeAttributes

It appears that there's a lot of special cases for developers to cover, like removal of elements but having the text content survive (see #15). I don't want to cover those issues in DOMPurify as I want to keep the code-base small and functionality limited for security reasons.

I am wondering if it makes sense to provide an API for the two core methods _sanitizeElements and _sanitizeAttributes so people can hook their own code based on what these functions return to influence the final markup DOMPurify creates.

Thoughts very much welcome!

Registering hook leads to infinite loop in IE 10 / IE 11

Either I am doing something very obviously wrong, or registering a hook (I tested afterSanitizeElements and beforeSanitizeElements) and then calling the sanitize method leads to an infinite loop in IE 10 and IE 11, which hangs the browser. If I remove the hook, everything works fine. My JS code just does this:

DOMPurify.addHook('afterSanitizeElements', function(node) {
  if (node.nodeType && node.nodeType === document.TEXT_NODE) {
    node.textContent = 'foo';
  }

  return node;
});
var editor = document.querySelector('.editor');
document.querySelector('.button').addEventListener('click', function () {
  var dirty = '<div><p>This is a beatufiul text</p><p>This is too</p></div>';
  editor.innerHTML = DOMPurify.sanitize(dirty);
});

I would expect it to keep the DOM-structure intact and just replace the text-nodes with "foo".

I created a pen which is enough to cause the error in IE 10 and IE 11. Other browsers (I tested Edge, Chrome, Firefox and Safari) look fine to me: http://codepen.io/anon/pen/YywGXz

Should we protect against DOM clobbering by default?

Right now, DOMPurify protects itself against DOM Clobbering. But the resulting markup itself still has clobbering potential when being used by the sanitizing website.

Should we prevent that by default? My thought would be: If an element's name is at the same time a property in document, the element should be removed. What do you think? We would prevent XSS and DOM clobbering.

//cc @fhemberger @mathiasbynens

Minor clobbering issue of NamedNodeMap

Commit 11d778e and 6f3a06c changed the clobber check of elm.attributes with instanceof. However, this check enables attackers to clobber window.NamedNodeMap in FF22-34 and terminate DOMPurify. This can be done by changing an existing iframe in a page to a crafted page before loading DOMPurify to pollute window.

PoC
Open http://jsbin.com/sozogi/2/ in FF22-34. An error should be thrown.

I think we should revert the changes to the older one.

Nested tags sanitation

How about nested tags sanitation support?

Let's say that only first level <p> tags are allowed and for example <p>Hello <p>world</p></p> should be sanitized into <p>Hello world</p>

or only second level resulting in Hello <p>world</p>

Best Regards

Host DOMPurify on NPM

It would be great if DOMPurify was hosted on NPM.

This would make it easier to re-use DOMPurify in the Node.js context rather than in the browser and also make it easier to manage it as a dependency.

This is not an issue or bug, just a "nice to have" feature.

Option to return a DocumentFragment instead of html

The way I am currently using dompurify looks like this:

var html = this.databaseEntity.foo;
html = dompurify(this.document.defaultView).sanitize(html);
var fragment = parseHTMLSnippit(this.document.defaultView, html);
this.bar.appendChild(fragment);

Which means the html is being parsed twice. It would be useful if there was an option in dompurify to return a DocumentFragment. (adopted/imported into the document dompurify has been constructed with).

For example:

var html = this.databaseEntity.foo;
var fragment = dompurify(this.document.defaultView).sanitize(html, {output: 'fragment'});
this.bar.appendChild(fragment);

Change CONTENT_TAGS to blacklist

I wanted to check this before submitting a pull request: at the moment, if you use KEEP_CONTENT, the tag has to match a whitelist for its contents to be kept. This is not necessary for security, since the contents is still sanitised just like any other content in the page, and is annoying if you are sanitising random HTML from sources which may use arbitrary non-standard tags to wrap bits of content (resulting in chunks being stripped by the sanitiser).

I suggest changing the CONTENT_TAGS whitelist to a FORBID_CONTENTS blacklist; the blacklist is not needed for security, but we do want to remove the contents of script tags and perhaps a few others by default, since we know they are not designed to have user-visible content.

Anyone have a problem with this/see a security flaw I've missed? If not, I'll submit a pull request.

Warnings mode?

Often in big web applications, one might have a good cut point to always insert purify calls. But, you can't be sure it won't break stuff. It would really help with rollout if we could pass a flag to dompurify where it would go through the tree and just get a list of nodes it would have removed if it was in enforcement mode. This can help track down nasty corner cases (which might even be vulns) and then once the warnings go down to zero, people can turn on enforcement mode.

This sort of stuff really helps in large scale deployment. CSP has this going for it.

What do you think? The simpler idea of "call purify and compare return value to input" doesn't work because the html tree serializations might have minor differences.

License issue

We want to use the library at LinkedIn, but our legal department has concerns about the MPL 2.0 License, which is slightly copyleft. Please consider using one of the other open licenses, e.g.:

  • Apache 2.0 License
  • BSD "2-clause" or "Simplified" License
  • BSD "3-clause" License
  • MIT License

Thanks.

rel=noreferrer?

Happy to submit a PR, but not sure if we want this.

Basically, does it make sense to default to rel=noreferrer on anchors (I think img also supports it)? Obviously, it doesn't work on all platforms, but it is a reasonable default for untrusted content. It stops referrer leaks, hijacking, etc.

Plugin API / Hooks for custom filters and sanitation

Several users have recently requested specific filtering and sanitation features that would require core changes or rethinking of the configuration API. This includes custom nesting rules, different href/src filtering and many other things.

I believe it would be best to create a hooking / plugin architecture instead of adding additional complexity to the core API. Now, the question is: how to best build that? How to do it so it makes sense, works well with the existing config flags and still provides enough flexibility for developers to add their own functionality?

My initial thought was: Hooks. We can implement hooks where the developer receives all DOMPurify config flags, the string and its current representation object, can work on all of them and pass them back after being done. We can put hooks into several places:

  • Before any filtering has been done
  • Right after the document has been created
  • Right after the clobbering checks happened
  • Right after the tag filtering has completed
  • Right after the attribute filtering has completed
  • Right before the serialization happens

This should allow us to do things like:

  • Specific handling of charset directives
  • Filtering to prevent deep nesting
  • Handling of CSS
  • Handling of HTTP leaks such as src, href, formaction etc.
  • Handling of special unwanted characters

Does that general strategy make sense or am I living in the 90s with that idea? :) Very much open for any form of input, please let me know.

Spammy CC of all who requested specific features or are involved in core development. Sorry for that, but your input is needed :) @fhemberger @devd @arbixy @eoftedal @joelabair

Issue rendering CSS with <style> being the first element

DOMPurify is integrated within the ModSecurity WAF XSS Evasion Demo here -
http://www.modsecurity.org/demo/demo-deny-noescape.html

Here is an example link that injects DOMPurify in the response -
http://www.modsecurity.org/demo/demo-deny-noescape.html?test=%3Cscript&enable_dompurify_defense=on&disable_browser_xss_defense=on

Notice that the CSS data does not render properly. Mario mentioned the following -

"Known issue is <style> being the first element, will go to doc.head instead doc.body. Which is even correct by spec, a config flag will fix that soon. Style attributes however are supposed to work."

Pre-test / Performance optimization

Would it become a problem if DOMPurify checks each string to sanitize for the occurrence of the character "<" first? Do you see a risk if a string not containing any "<" is returned as such without checks and modifications?

I think we might save a bit on performance but want to be sure there's no foot-gun potential.
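
The pre-test being asked about could look roughly like the sketch below. `makePretestingSanitizer` and `fullSanitize` are hypothetical names; `fullSanitize` stands in for the expensive DOM-based path and the placeholder used here is not a real sanitizer.

```javascript
// Wrap an expensive sanitizer with a cheap check for "<".
function makePretestingSanitizer(fullSanitize) {
  return function sanitize(dirty) {
    const input = String(dirty);
    if (input.indexOf('<') === -1) {
      return input; // fast path: no "<", so no tag can be opened
    }
    return fullSanitize(input);
  };
}

// Count how often the slow path runs.
let fullRuns = 0;
const pretested = makePretestingSanitizer((html) => {
  fullRuns += 1;
  return html.replace(/<[^>]*>/g, ''); // placeholder only, NOT a real sanitizer
});

const plainResult = pretested('just text & an entity: &amp;');
const markupResult = pretested('<b>bold</b>');
```

One observable difference to keep in mind: the fast path returns the string untouched, whereas a parse and serialize round trip may normalize it (for example, a bare & in text is serialized as &amp;).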

Src of <img> whose value is base64 encoded, is removed

The src of images with base64-encoded values starts with data:image, which matches the regex in this line, so the attribute is not added to the DOM and an image without src is appended. The comment says "Safely handle custom data attributes", but I am not sure why this is done.

Sanitize MIME message

I'm trying to sanitize a MIME message:

MIME-Version: 1.0
Sender: [email protected]
Received: by 10.60.155.104 with HTTP; Mon, 5 May 2014 06:08:54 -0700 (PDT)
Date: Mon, 5 May 2014 15:08:54 +0200
Delivered-To: [email protected]
Message-ID: <CAMQ7_A76jtf9Q1bx=G_K9miTdy0C2LPo7xuECNmGpDzmQPcZ4A@mail.gmail.com>
Subject: Hello
From: =?UTF-8?Q?xxx?= <[email protected]>
To: =?UTF-8?Q?xxx?= <[email protected]>
Content-Type: multipart/alternative; boundary=089e0115e7e4537fcd04f8a6d552

--089e0115e7e4537fcd04f8a6d552
Content-Type: text/plain; charset=UTF-8

Test

--089e0115e7e4537fcd04f8a6d552
Content-Type: text/html; charset=UTF-8

<div dir="ltr">Test</div>

--089e0115e7e4537fcd04f8a6d552--

With the default configuration in 0.3, I get the following output:

MIME-Version: 1.0
Sender: [email protected]
Received: by 10.60.155.104 with HTTP; Mon, 5 May 2014 06:08:54 -0700 (PDT)
Date: Mon, 5 May 2014 15:08:54 +0200
Delivered-To: [email protected]
Message-ID: 

Is it intended that everything after a <[email protected]> is truncated? And if yes, is there a setting to prevent that?

Thanks.
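
A likely explanation is that the whole message is parsed as HTML, so the angle-bracketed Message-ID is read as the start of an unknown tag. One way around this is to hand only the text/html part to the sanitizer. The sketch below assumes a flat multipart message like the one above; `extractHtmlPart` is a hypothetical helper, not a full MIME parser (no nested multiparts, no transfer-encoding decoding), and the sample message is made up.

```javascript
// Return the body of the first text/html part, or null if there is none.
function extractHtmlPart(message) {
  const boundaryMatch = /boundary="?([^"\r\n;]+)"?/i.exec(message);
  if (!boundaryMatch) return null;
  const delimiter = '--' + boundaryMatch[1];
  // The first chunk holds the top-level headers, so skip it.
  const parts = message.split(delimiter).slice(1);
  for (const part of parts) {
    const normalized = part.replace(/\r\n/g, '\n');
    const split = normalized.indexOf('\n\n'); // blank line ends the part headers
    if (split === -1) continue;
    const headers = normalized.slice(0, split);
    if (/^content-type:\s*text\/html/im.test(headers)) {
      return normalized.slice(split + 2).trim();
    }
  }
  return null;
}

// Made-up sample with the same shape as the message in the report.
const sampleMessage = [
  'MIME-Version: 1.0',
  'Message-ID: <abc123@mail.example.com>',
  'Content-Type: multipart/alternative; boundary=XYZ',
  '',
  '--XYZ',
  'Content-Type: text/plain; charset=UTF-8',
  '',
  'Test',
  '',
  '--XYZ',
  'Content-Type: text/html; charset=UTF-8',
  '',
  '<div dir="ltr">Test</div>',
  '',
  '--XYZ--',
].join('\n');

const htmlPart = extractHtmlPart(sampleMessage);
// htmlPart is what would be passed on to DOMPurify.sanitize().
```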

Drops charset declarations

If a document in the <head>-tag contains:

<meta http-equiv="Content-type" content="text/html;charset=UTF-8">

the declaration will be dropped. This may lead to character set problems.

Test-suite acts non-deterministically on IE11

Currently, all tests pass on IE11 and below. However sometimes the clobbering check doesn't seem to work and two specific tests fail:

  • onsubmit, onfocus; DOM clobbering: children
  • onsubmit, onfocus; DOM clobbering: attributes

This is only happening on IE11, Windows 7 and Windows 8.1.
I cannot figure out what this is; DOMPurify acts normally in the demo and in real-life scenarios. Maybe a QUnit race condition? (@fhemberger)

Bug bounty

Hi Team,

I thought you'd be interested to know we have now rolled out DOMPurify into production for all FastMail users as an extra layer of XSS protection (on top of our server-side filter and strict content security policy). As such, we would like to extend our bug bounty to cover security bugs in the DOMPurify library, even when they would not be exploitable on FastMail due to the other two protection mechanisms.

To qualify for the bounty, we would expect the reporter to responsibly disclose the issue to any of the library maintainers (we would be grateful for a heads-up, but we monitor the security mailing list too) or to ourselves (in which case we would obviously notify you!). Other than that, the same rules and rewards as the standard FastMail bug bounty would be applicable (with rewards ranging from $100 to $5000 depending on severity and what browsers are affected), and of course any award would still be ultimately at our discretion.

If you're happy with this, please feel free to advertise it. I hope that this can encourage further scrutiny of the library, ultimately making it safer for everyone.

Neil.

jQuery Plugin

I am not even sure if this is a great idea, but these few lines can make DOMPurify also work as a jQuery extension, adding a safehtml function.

(function($){
    "use strict";
    // Replace the element's content with a sanitized copy of `str`.
    $.fn.safehtml = function(str) {
        var clean = DOMPurify.sanitize(str, { 'RETURN_DOM': true });
        this.empty();
        this.append(clean.childNodes);
        return this; // allow chaining
    };
})(jQuery);

Minor changes can make it work with require/amd definitions too.

One concern is that there are so many other vectors for XSS. While there is value to making safehtml available (and encourage people to use it instead of safeHTML), do we write "safe" versions of everything?

https://code.google.com/p/domxsswiki/wiki/jQuery

So, I guess this is mainly an issue to start off a discussion about this.

Cannot remove hook

Thanks for the useful lib. Unfortunately it is not possible to remove a hook because they are held as global state here:

https://github.com/cure53/DOMPurify/blob/master/purify.js#L16

E.g. if I want to sanitize one HTML email while removing 'src' attributes from image tags, and then allow images for a second HTML email:

// remove sources of image tags to prevent privacy leak during resource fetching
if (removeImages) {
    window.DOMPurify.addHook('afterSanitizeAttributes', function(node) {
        if ('src' in node) {
            node.removeAttribute('src');
        }
    });
}

// sanitize HTML content: https://github.com/cure53/DOMPurify
html = window.DOMPurify.sanitize(html);

If 'removeImages' is false the second time the hook is still called.
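
Later DOMPurify releases document removeHook and removeAllHooks for this situation. The sketch below does not call the library; it uses a minimal hypothetical hook store (`createHookStore`, `withHook`) and fake nodes to show the pattern of registering a hook for one call only and removing it afterwards.

```javascript
// Minimal hypothetical hook store, not DOMPurify's implementation.
function createHookStore() {
  const hooks = {};
  return {
    addHook(entryPoint, fn) {
      (hooks[entryPoint] = hooks[entryPoint] || []).push(fn);
    },
    // Remove the most recently added hook for this entry point.
    removeHook(entryPoint) {
      if (hooks[entryPoint]) hooks[entryPoint].pop();
    },
    run(entryPoint, node) {
      (hooks[entryPoint] || []).forEach((fn) => fn(node));
    },
  };
}

// Register a hook only for the duration of `work`, even if it throws.
function withHook(store, entryPoint, fn, work) {
  store.addHook(entryPoint, fn);
  try {
    return work();
  } finally {
    store.removeHook(entryPoint);
  }
}

// Fake image node with just enough surface for the hook from the report.
function fakeImg() {
  const attrs = { src: 'http://example.com/a.png' };
  return {
    attrs,
    removeAttribute(name) { delete attrs[name]; },
  };
}

const store = createHookStore();

// First email: images are stripped.
const first = fakeImg();
withHook(store, 'afterSanitizeAttributes', (node) => {
  if ('src' in node.attrs) node.removeAttribute('src');
}, () => store.run('afterSanitizeAttributes', first));

// Second email: the hook is gone, so src survives.
const second = fakeImg();
store.run('afterSanitizeAttributes', second);
```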

<template> cannot appear alone.

For example, <template> is removed but <div><template> is alright. Not sure if it is intended behavior?

Tested in latest Chrome and Firefox with DOMPurify 0.4.3.

sanitize() function seems to reverse the order of the attributes

It looks like the order of attributes is being reversed in the output from sanitize().
While it makes no difference from a functional point of view, it is something that will trip up people who are writing unit tests using this library.

That said, thank you very much for this product!
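
One way such a reversal can arise is when attributes are walked from last to first and each one is removed and then set again, since a re-added attribute is appended at the end. Whether DOMPurify's loop worked exactly like this at the time is an assumption; the sketch below only models the effect on a plain array and adds an order-insensitive comparison that unit tests could use. Both function names are hypothetical.

```javascript
// Model an element's attributes as an ordered array of [name, value] pairs.
function reAddAttributesBackwards(attributes) {
  const attrs = attributes.slice();
  for (let i = attrs.length - 1; i >= 0; i -= 1) {
    const [removed] = attrs.splice(i, 1); // models removeAttribute
    attrs.push(removed);                  // models setAttribute: appended last
  }
  return attrs;
}

// Compare two attribute lists while ignoring their order.
function sameAttributeSet(a, b) {
  const key = (list) => list.map(([n, v]) => n + '=' + v).sort().join('\n');
  return key(a) === key(b);
}

const before = [['id', 'a'], ['class', 'b'], ['title', 'c']];
const after = reAddAttributesBackwards(before);
const namesAfter = after.map(([name]) => name);
const equivalent = sameAttributeSet(before, after);
```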

Should use template instead of document.implementation.createHTMLDocument to avoid the web-component registry of the owner doc being reused

_initDocument uses document.implementation.createHTMLDocument() to create a new data document. Because of http://w3c.github.io/webcomponents/spec/custom/#creating-and-passing-registries this is potentially not as safe as it used to be. Specifically, the newly created document will use the same web-components registry as its execution context, which means that if a dirty HTML string is passed in, a nefarious string could potentially leverage any registered web-components to nefarious ends.

Probably the best course of action is to do something like:

var doc = document.createElement('template').content.ownerDocument;

instead of the current document.implementation.createHTMLDocument(). (Note that since it's its own document, I don't think the template node would need to be linked into the DOM, but if something seems wrong, that could be it. At least in Gecko, things like CSS parsing won't usefully happen until a DOM node is parented into the tree.)

Greater context: We got burnt by this in a non-security context when doing cloneNode, fix part of a bug at https://bugzilla.mozilla.org/show_bug.cgi?id=1116087#c12.

Update npm release

We're switching to browserify for our build process and I saw that your current master branch exports a common.js module, which is great.

However, the npm package at 0.4.2 does not seem to be up to date, so an npm publish would be nice. Thanks!

More powerful and simple Hook API

After having worked with the hooks for a while, I think we could make them more powerful.

For example, I don't want to repeat the whole process of determining whether an attribute is an attribute and the node is a node.

I was thus thinking about passing a meta object with every hook call, that passes along some relevant info about the element/attribute to inspect. Any thoughts?
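
A minimal sketch of what such a meta object might look like for attribute hooks. The field names (`attrName`, `attrValue`, `keepAttr`) and the function `runAttributeHooks` are assumptions made for illustration.

```javascript
// Call every hook with the node plus a meta object describing the attribute.
// Hooks can flip meta.keepAttr instead of re-deriving what they are looking at.
function runAttributeHooks(hooks, node, attrName, attrValue) {
  const meta = { attrName, attrValue, keepAttr: true };
  for (const hook of hooks) {
    hook(node, meta);
  }
  return meta;
}

// Example hook: drop inline event handlers based on the meta info alone.
const dropInlineHandlers = (node, meta) => {
  if (/^on/i.test(meta.attrName)) {
    meta.keepAttr = false;
  }
};

const onclickMeta = runAttributeHooks([dropInlineHandlers], {}, 'onclick', 'alert(1)');
const titleMeta = runAttributeHooks([dropInlineHandlers], {}, 'title', 'hi');
```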

Tests for config flags

What's the best way to be able to test the config flags DOMPurify offers?

Currently, we basically test the default config. Ideally, we have coverage over all existing config flags. Even more ideally with coverage for different flags that might interact and interfere.
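
One possible starting point is to generate every on/off combination of the boolean flags and run the existing test vectors against each resulting config. The sketch below only builds the configs; `flagCombinations` is a hypothetical helper, and `FLAG_B` and `FLAG_C` are placeholders rather than real DOMPurify flags.

```javascript
// Build one config object per on/off combination of the given flags.
function flagCombinations(flagNames) {
  const combos = [];
  const total = 1 << flagNames.length; // 2^n combinations
  for (let mask = 0; mask < total; mask += 1) {
    const config = {};
    flagNames.forEach((name, bit) => {
      config[name] = Boolean(mask & (1 << bit));
    });
    combos.push(config);
  }
  return combos;
}

const combos = flagCombinations(['RETURN_DOM', 'FLAG_B', 'FLAG_C']);
// Each entry can then be passed as the second argument to sanitize()
// inside a test loop.
```

The number of configs doubles with every flag, so for a longer flag list it may be more practical to test flags individually and in pairs only.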

Could you provide code examples of what is being sanitized?

I'm looking at the demo that is linked in your README. I don't see what is being sanitized. E.g. the alert(1) values are still present in the output.

Could you write a section in the README which gives a few code examples (before/after)?

Uncaught TypeError: Cannot read property 'nodeName' of undefined

In Google Chrome this test:

<img src=data:image/jpeg,ab798ewqxbaudbuoibeqbla>

generates this error:

Uncaught TypeError: Cannot read property 'nodeName' of undefined purify.js:340

So, every test with a src attribute and a data: value at beginning is not well "accepted" by Chrome.

Test DOMPurify with a Browser grid

DOMPurify is designed to run in browsers. In ALL browsers. I would like to make some minor changes, but I want to be sure not to break anything. So I would like to have tests across a LOT of browser versions and operating systems. Some vendors, such as BrowserStack, SauceLabs or BrowserSwarm, offer this for free for open source projects (I have not evaluated any of them for features or even fitness).
I think this test support is more important than my suggested changes.
Are there any plans to do things like that? If not, I would like to do some research in this area.

Bower version mismatch

Adding "dompurify": "~0.3" to bower.json leads to:

bower dompurify#~0.3 ENORESTARGET No tag found that was able to satisfy ~0.3

"dompurify": "v0.3" works

For semantic versioning it seems Bower requires git tags to be named x.x(.x)

Use `tests/expect.json` on the demo

Ideally, the examples on the demo page would be loaded from the tests/expect.json file, so that we only have to maintain the list in one place.

tests/expect.json can also be used for unit testing / regression testing (#5).

Package description is too long for npm

Look here: https://www.npmjs.com/package/dompurify

This is what it looks like:

dompurify

DOMPurify is a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. It's written in JavaScript and works in all modern browsers (Safari, Opera (15+), Internet Explorer (10+), Firefox and Chrome - as well as almost anything else usin

DOMPurify is written by security people who have vast background in web attacks and XSS. Fear not. For more details please also read about our Security Goals & Threat Model

Breaking up that second line using a newline in the README should probably fix it.

Block Network requests

Again, not sure if this is meaningful, but does it make sense for DOMPurify to support an option where all elements that could result in network requests are removed? Just removing src, href etc attributes might be enough?

Demo Output Confusing

Hey guys!

On the Demo page.. what exactly is the output supposed to show? When I use the write to DOM feature, I am flooded with a wall of large red text. Is this supposed to happen? There are no alerts, but there are obviously style tags being inserted.

Is this the expected behaviour? If so, I think a note about expected behaviour on that page might be appropriate, as I was expecting something in pure plaintext. Seeing a wall of giant red text made me think "that's probably not supposed to happen."

Firefox, Ubuntu.

Make DOMPurify work in Node.js

Regarding issues #26 and #27, I originally held back CommonJS-style exports and publishing on npm on purpose, as DOMPurify doesn't run in a pure Node.js environment (it does work client-side with Browserify).

I'm still looking for a way to get it to work on Node.js as well. jsdom lacks DOM Level 2 Traversal methods like createNodeIterator at the moment, which DOMPurify uses internally.

What's currently missing:

  • document.implementation.createHTMLDocument
    (polyfilled with return jsdom('<html><body></body></html>');)
  • window.NodeFilter (extracted the properties needed for DOMPurify from the spec)
  • document.createNodeIterator(root, whatToShow, filter, entityReferenceExpansion)
  • NodeIterator.nextNode() implementation
  • Setter method for document.body.outerHTML

REVIEW: Fix for bypass in Firefox thanks to a newly discovered mXSS

Hi @neilj and @fhemberger!

Today I ran into an mXSS issue on the latest Gecko that causes a bypass under certain conditions. The problem is that Firefox shows different behavior for innerHTML interaction than any other browser when doing that in an SVG context. I talked to @freddyb of Mozilla and we developed a fix.

The mXSS bug in Firefox causes a bypass when the sanitized HTML is later applied not with innerHTML but with document.write() or the like. From a security standpoint, I find this to be close to critical. The cause of this issue is a parser behavior change in Gecko when dealing with HTML elements inside inline SVG documents. I will publish the attack vector after the fix has been reviewed (contact me offline for a PoC).

This fix might however be breaking so I'd love to hear your opinion. The tests are green and things look okay - but better to have that one triple-checked:

Changeset
https://github.com/cure53/DOMPurify/compare/DOMParser?expand=1

Test Results
https://travis-ci.org/cure53/DOMPurify/builds/80807894

Opinions are welcome!

Cheers,
.mario

Enhancement suggestion: Interpolation API

TLDR: Secure interpolation API in DOMPurify.
Usage:

interpolate`<a href='${href}' title='${title}'>${text}</a>`

Returns a safe escaping of the variables href, title and text according to their context.

Implementation:
Using ES6 template strings (and possibly a transpiler to make this work in non-ES6 environments), the interpolation function gets the context in which the variables are used. The function can then construct a safe clone of the template, e.g. <a href='1' title='2'>3</a>. When doing this, a hash table of identifiers and variable names is built (1: href, 2: title, 3: text).
This template may be safely inspected without any DOM clobbering concerns (Note: the threat model is that the template is safe and the variables are not).

Parsing & walking this template should be done in the typical DOMPurify style and pause whenever a placeholder value is seen (this can be achieved by looking at the hash table). The parser is stateful and DOM-aware, so it knows the current context in which the placeholder occurs and escapes accordingly. Context-aware escaping means that combinations of attribute and tag need their own sanitizing function, e.g. URL escaping for href, disallowing javascript: etc.

When the template DOM has been completely parsed and each placeholder value has been replaced with an escaped input value, it is transformed back into a string and returned.
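
A heavily simplified sketch of the idea, using an ES6 tagged template. It is string-based rather than DOM-aware as proposed: context is guessed from the literal text that precedes each placeholder, and only two contexts are handled (quoted href/src attributes, and everything else as HTML text or quoted attribute values). The helper names `escapeHtml` and `safeUrl` and the URL allowlist are assumptions.

```javascript
// Escape the characters that matter in HTML text and quoted attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Allow http(s), mailto and scheme-less URLs; anything else becomes '#'.
function safeUrl(value) {
  // Browsers ignore tabs, newlines and leading control characters in URLs,
  // so strip them before looking for a scheme.
  const url = String(value).replace(/[\u0000-\u0020\u007f]+/g, '');
  const scheme = /^([a-z][a-z0-9+.-]*):/i.exec(url);
  if (scheme && !/^(https?|mailto)$/i.test(scheme[1])) {
    return '#';
  }
  return url;
}

// Tag function: `strings` are the trusted template parts, `values` are not.
function interpolate(strings, ...values) {
  let out = strings[0];
  values.forEach((value, i) => {
    const inUrlAttr = /\b(?:href|src)\s*=\s*['"]$/i.test(strings[i]);
    const prepared = inUrlAttr ? safeUrl(value) : String(value);
    out += escapeHtml(prepared) + strings[i + 1];
  });
  return out;
}

const href = 'javascript:alert(1)';
const title = `"quoted" & 'single'`;
const text = '<img src=x onerror=alert(1)>';
const rendered = interpolate`<a href='${href}' title='${title}'>${text}</a>`;
```

Unquoted attributes, event handler attributes, and script or style contexts are not covered by this sketch; handling those is where the DOM-aware parser described above would be needed.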

Graceful degradation in case of legacy browser?

We might want to think about what happens when the browser running a purified website does not support what DOMPurify needs.

Should we just return the original string? Or are we in a position to modify something?
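
The two options can be compared with a small feature-detection sketch. `isSupported` and `makeGuardedSanitizer` are hypothetical names, the check for createHTMLDocument is only one plausible requirement, and the document object is passed in so the logic can be exercised outside a browser.

```javascript
// Feature check: does this document offer what the sanitizer needs?
function isSupported(doc) {
  return Boolean(
    doc &&
    doc.implementation &&
    typeof doc.implementation.createHTMLDocument === 'function'
  );
}

// Wrap a sanitizer so that unsupported environments use a fallback policy.
function makeGuardedSanitizer(doc, sanitizeImpl, onUnsupported) {
  return function sanitize(dirty) {
    if (!isSupported(doc)) {
      return onUnsupported(dirty);
    }
    return sanitizeImpl(dirty);
  };
}

const fakeSanitize = () => 'clean'; // stand-in for the real sanitizer

// Option 1: return the original string unchanged.
const passThrough = makeGuardedSanitizer(undefined, fakeSanitize, (dirty) => dirty);
// Option 2: fail closed and return nothing.
const failClosed = makeGuardedSanitizer(undefined, fakeSanitize, () => '');
// Supported environment, simulated with a minimal fake document.
const supported = makeGuardedSanitizer(
  { implementation: { createHTMLDocument() {} } },
  fakeSanitize,
  () => ''
);
```

Returning the original string keeps the page working but leaves it unprotected on legacy browsers; failing closed is safer but drops content. Which trade-off fits depends on whether another filtering layer exists.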
