
OWASP HTML Sanitizer allow colon in HTML

How can I allow a colon (:) in sanitized HTML? I am using the sanitizer on HTML that goes into an email generated with JavaMail. The HTML contains an inline image referenced by content ID, such as <img src="cid:image" height="70" width="70" />. After sanitizing, the src attribute is missing from the output.

    PolicyFactory IMAGES = new HtmlPolicyBuilder().allowUrlProtocols("http", "https")
            .allowElements("img")
            .allowAttributes("src").matching(Pattern.compile("^cid[:][\\w]+$"))
            .onElements("img")
            .allowAttributes("border", "height", "width").onElements("img")
            .toFactory();

    String html = "<img src=\"cid:image\"  height=\"70\" width=\"70\" />";
    final String sanitized = IMAGES.sanitize(html);

    System.out.println(sanitized);

The output of the above code is:

<img height="70" width="70" />

Why it isn't working

Or rather, why it's working "too well"

By default, HtmlPolicyBuilder disallows URL protocols in src attributes. This prevents injections such as

<img src="javascript:alert('xss')"/>

which could lead to the execution of the script that follows javascript: (in this case, alert('xss')).

Other protocols, on other elements, can cause similar problems. For example, a data: URL does not use the javascript protocol, yet it can still carry a base64-encoded XSS payload:

<object data="data:text/html;base64,PHNjcmlwdD5hbGVydCgneHNzJyk8L3NjcmlwdD4="/>

or

 <a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgneHNzJyk8L3NjcmlwdD4=">Click me</a> 

Because of this, HtmlPolicyBuilder treats a value in a URL attribute (such as src or href) as dangerous when it starts with a protocol that has not been explicitly allowed, and drops the attribute.
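
The allow-list idea behind this behavior can be sketched in plain Java. This is a simplified illustration, not the sanitizer's actual source: the class name UrlProtocolCheckSketch and the method isAllowed are hypothetical, and the real library may handle more edge cases.

```java
import java.util.Locale;
import java.util.Set;

// Simplified illustration only; this is NOT the sanitizer's actual implementation.
final class UrlProtocolCheckSketch {

    // Returns true if the URL has no protocol (relative URL)
    // or if its protocol is in the allow-list.
    static boolean isAllowed(String url, Set<String> allowedProtocols) {
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                // A path, query, or fragment begins before any colon: no protocol.
                return true;
            }
            if (c == ':') {
                // Everything before the first colon is the protocol.
                String protocol = url.substring(0, i).toLowerCase(Locale.ROOT);
                return allowedProtocols.contains(protocol);
            }
        }
        return true; // no colon at all, so the URL is relative
    }

    public static void main(String[] args) {
        Set<String> asInQuestion = Set.of("http", "https");
        Set<String> withCid = Set.of("http", "https", "cid");

        System.out.println(isAllowed("cid:image", asInQuestion));               // false
        System.out.println(isAllowed("javascript:alert('xss')", asInQuestion)); // false
        System.out.println(isAllowed("images/logo.png", asInQuestion));         // true
        System.out.println(isAllowed("cid:image", withCid));                    // true
    }
}
```

With only http and https on the list, cid:image is rejected before the pattern given to matching(...) gets a chance to accept it, which matches the output seen in the question.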


How to fix it:

You have to explicitly tell the HtmlPolicyBuilder to allow the cid "protocol", using the allowUrlProtocols method:

    PolicyFactory IMAGES = new HtmlPolicyBuilder().allowUrlProtocols("http", "https")
            .allowElements("img")
            .allowUrlProtocols("cid") // Specifically allow "cid"
            .allowAttributes("src").matching(Pattern.compile("^cid[:][\\w]+$"))
            .onElements("img")
            .allowAttributes("border", "height", "width").onElements("img")
            .toFactory();

    String html = "<img src=\"cid:image\"  height=\"70\" width=\"70\" />";
    final String sanitized = IMAGES.sanitize(html);

    System.out.println(sanitized);

Output:

<img src="cid:image" height="70" width="70" />
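
Once the protocol is allowed, the pattern passed to matching(...) still decides which values survive, and it accepts only word characters after cid:. The sketch below uses only java.util.regex to show what that pattern accepts. It assumes the sanitizer applies the pattern to the whole attribute value; the class name CidPatternCheck and the sample content IDs are hypothetical.

```java
import java.util.regex.Pattern;

// Standalone check of the regex used in the policy above.
final class CidPatternCheck {

    // Same pattern as in the policy: "cid:" followed by one or more word characters.
    static final Pattern CID = Pattern.compile("^cid[:][\\w]+$");

    // Assumption: the whole attribute value has to match the pattern.
    static boolean accepts(String value) {
        return CID.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("cid:image"));                // true
        System.out.println(accepts("cid:image_01"));             // true
        System.out.println(accepts("cid:logo.png@example.com")); // false: '.' and '@' are not word characters
        System.out.println(accepts("CID:image"));                // false: the pattern is case-sensitive
    }
}
```

If your content IDs contain characters such as . or @, the character class would need to be widened, for example to "^cid:[\\w.@-]+$"; treat that as a suggestion to adapt rather than a verified policy.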
