Platform/HTML5 sanitizer: Difference between revisions

(19 intermediate revisions by 2 users not shown)
* Allow a setting for enabling styles.
* Allow a setting for enabling comments. See {{bug|572642}}
* Have a white list of elements.
** Or always enable comments? (What about "--" in comments?)
* Have a white list of attributes. The attributes don't depend on the element they are on.
* <s>Have three element white lists: HTML, SVG and MathML.</s>
* Have a list of attributes that take URLs. Drop the attributes when they have prohibited URLs (after trimming whitespace from the value).
** This turns out to lead to a lot of complexity without clear benefit.
** Resolve relative URLs into absolute ones using a per-fragment base URL. (Is this correct for Gecko requirements?)
* Have three attribute white lists: HTML, SVG and MathML. The attributes don't depend on the element they are on beyond the element namespace.
** Why is whitespace trimmed before the security check?
** XXX: Figure out what the requirements are for attributes starting with data- or _.
** However, allow any URL in the src attribute on the img element, because imgs are safe.
* Have three lists of attributes that take URLs. Drop the attributes when they have prohibited URLs (after trimming whitespace from the value).
*** Why risk this?
** Resolve relative URLs into absolute ones using a per-fragment base URL. (Is this correct for Gecko requirements? The current code uses the node's base URI. Is that right?)
** However, allow any URL in the src attribute on the img element, because imgs are safe. {{bug|572637}}
* Have a list of SVG attributes that take different-document references.
* Have a list of SVG attributes that are allowed to have same-document references only.
* If styles are allowed, sanitize style attribute values. If styles aren't allowed, drop the style attribute.
* Always drop script elements and their contents.
* Always drop script and title elements and their contents.
* If styles are disabled, drop style elements and their contents.
* If styles are enabled, sanitize the content of style elements.
* Add the controls attribute to the video and audio elements (if it isn't there already).
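The URL-attribute rule in the list above (trim whitespace, resolve against a base URL, drop the attribute if the URL is prohibited, exempt <code>img</code> <code>src</code>) could be sketched roughly as follows. This is only an illustration in Python, not the sanitizer's actual code: the names <code>ALLOWED_SCHEMES</code>, <code>URL_ATTRIBUTES</code> and <code>sanitize_url_attribute</code> are hypothetical, and the scheme and attribute sets are placeholder assumptions, since this page does not enumerate them.

```python
from urllib.parse import urljoin, urlsplit

# Placeholder sets for illustration; the page does not list the real ones.
ALLOWED_SCHEMES = {"http", "https", "mailto"}
URL_ATTRIBUTES = {"href", "src", "cite", "action"}
HTML_WHITESPACE = " \t\n\f\r"


def sanitize_url_attribute(element, name, value, base_url):
    """Return the attribute value to keep, or None to drop the attribute."""
    if name not in URL_ATTRIBUTES:
        # Not a URL-taking attribute: this rule does not apply.
        return value
    # Trim before checking, so padding cannot hide the scheme.
    trimmed = value.strip(HTML_WHITESPACE)
    if element == "img" and name == "src":
        # The img/src exception from the list: any URL is allowed here.
        return trimmed
    # Resolve relative URLs against the per-fragment base URL.
    absolute = urljoin(base_url, trimmed)
    scheme = urlsplit(absolute).scheme.lower()
    return absolute if scheme in ALLOWED_SCHEMES else None
```

For example, under these assumptions <code>sanitize_url_attribute("a", "href", " javascript:alert(1) ", "http://example.org/")</code> returns <code>None</code>, so the caller would drop the attribute. Whether the <code>img</code> exception is worth keeping is one of the questions raised in the list itself.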


==Open Questions==

* Can stylistic SVG attributes have values that need to be sanitized?
* Should element whitelisting take place after the tree builder algorithm so that the namespace of the element is known?
* Should Semantic MathML be on the white list for clipboard round-tripping? (Mainly a footprint issue.)
** Likely yes.
* Is it dangerous for SVG fragment id references to be able to refer to an id in the document the untrusted fragment gets inserted into?
* What to do about microdata?


==Non-Gecko Requirements==
These are features for the HTML5 parser when it is used outside Gecko.
* Allow form-related elements to be toggled on and off in the white list.
* Allow using the sanitizer in non-fragment mode (in which case, the title element should be allowed).
** Are there compelling use cases for non-fragment mode sanitization?
* Have a configurable white list of permitted URL schemes in attributes that take URLs.
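One way to picture the three toggles above is a small configuration object. The sketch below is a Python illustration only, not the HTML5 parser's real API: <code>SanitizerConfig</code>, its field names, and the element sets are all hypothetical and chosen just to make the example self-contained.

```python
from dataclasses import dataclass

# Hypothetical element sets for illustration; not the real white lists.
BASE_ELEMENTS = frozenset({"p", "a", "em", "strong", "ul", "ol", "li", "img"})
FORM_ELEMENTS = frozenset(
    {"form", "input", "button", "select", "option", "textarea", "label"}
)


@dataclass(frozen=True)
class SanitizerConfig:
    allow_forms: bool = False          # toggle form-related elements
    fragment_mode: bool = True         # False means whole-document mode
    allowed_schemes: frozenset = frozenset({"http", "https", "mailto"})

    def element_white_list(self):
        """Build the effective element white list from the toggles."""
        allowed = set(BASE_ELEMENTS)
        if self.allow_forms:
            allowed |= FORM_ELEMENTS
        if not self.fragment_mode:
            # title is only permitted outside fragment mode.
            allowed.add("title")
        return frozenset(allowed)

    def scheme_allowed(self, scheme):
        """Check a URL scheme against the configurable white list."""
        return scheme.lower() in self.allowed_schemes
```

With the defaults assumed here, <code>SanitizerConfig().element_white_list()</code> contains neither <code>form</code> nor <code>title</code>; passing <code>allow_forms=True, fragment_mode=False</code> adds both.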