OWASP HTML Sanitizer allow colon in HTML

How can I allow the : character in sanitized HTML? I am using OWASP HTML Sanitizer to sanitize HTML when generating Java mail. The HTML contains an inline image referenced by content ID, such as <img src="cid:image" height="70" width="70" />. After sanitizing, the src attribute is missing from the sanitized HTML.

    PolicyFactory IMAGES = new HtmlPolicyBuilder().allowUrlProtocols("http", "https")
            .allowElements("img")
            .allowAttributes("src").matching(Pattern.compile("^cid[:][\\w]+$"))
            .onElements("img")
            .allowAttributes("border", "height", "width").onElements("img")
            .toFactory();

    String html = "<img src=\"cid:image\"  height=\"70\" width=\"70\" />";
    final String sanitized = IMAGES.sanitize(html);

    System.out.println(sanitized);

The output of the above code is:

<img height="70" width="70" />

Why it isn't working

Or rather, why it's working "too well"

By default, HtmlPolicyBuilder disallows URL protocols in src attributes. This prevents injections such as

<img src="javascript:alert('xss')"/>

which could lead to the execution of the script after javascript: (in this case, alert('xss')).

There are other protocols (on other elements) that can lead to similar issues. Even without the javascript protocol, it's still possible to inject a base64-encoded XSS payload:

<object data="data:text/html;base64,PHNjcmlwdD5hbGVydCgneHNzJyk8L3NjcmlwdD4="/>

or

 <a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgneHNzJyk8L3NjcmlwdD4=">Click me</a> 
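To see why these data: URLs are dangerous, it helps to decode the base64 part. The following is a small standalone sketch using only the JDK; the class name DataUriPayload is made up for illustration and is not part of the sanitizer library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, for illustration only.
final class DataUriPayload {

    // Decodes the base64 payload that follows the first comma of a data: URI.
    static String decode(String dataUri) {
        int comma = dataUri.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("no payload found in: " + dataUri);
        }
        byte[] raw = Base64.getDecoder().decode(dataUri.substring(comma + 1));
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String uri = "data:text/html;base64,PHNjcmlwdD5hbGVydCgneHNzJyk8L3NjcmlwdD4=";
        // prints <script>alert('xss')</script>
        System.out.println(decode(uri));
    }
}
```

The decoded payload is an ordinary script tag, so a browser that loads the data: URL as HTML would run it even though the string "javascript" never appears in the attribute.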

Because of this, HtmlPolicyBuilder treats any value containing a colon in a URL-bearing attribute (such as src or href) as dangerous unless the protocol before the colon has been explicitly allowed.
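The idea can be sketched as a protocol allowlist check. This is a simplified illustration of the concept, not the library's actual implementation; the class SchemeAllowlist and its permits method are hypothetical names, and the sketch ignores details a real sanitizer must handle (encoded colons, embedded whitespace, and so on).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical, simplified model of a URL protocol allowlist.
final class SchemeAllowlist {
    private final Set<String> allowed;

    SchemeAllowlist(String... schemes) {
        this.allowed = new HashSet<>(Arrays.asList(schemes));
    }

    // True if the URL has no protocol (relative URL) or its protocol is allowed.
    boolean permits(String url) {
        int colon = url.indexOf(':');
        if (colon < 0) {
            return true; // no colon at all, so no protocol
        }
        // A colon that appears after '/', '?' or '#' is part of the path,
        // query or fragment, not a protocol separator.
        for (int i = 0; i < colon; i++) {
            char c = url.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                return true;
            }
        }
        String scheme = url.substring(0, colon).toLowerCase(Locale.ROOT);
        return allowed.contains(scheme);
    }

    public static void main(String[] args) {
        SchemeAllowlist webOnly = new SchemeAllowlist("http", "https");
        System.out.println(webOnly.permits("cid:image"));              // false
        System.out.println(webOnly.permits("javascript:alert('xss')")); // false
        System.out.println(webOnly.permits("/images/logo.png"));       // true
    }
}
```

Under this model, cid:image is rejected for the same reason javascript:alert('xss') is: "cid" is simply not in the allowed set, regardless of what the attribute's own matching pattern says.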


How to fix it:

You have to explicitly tell the HtmlPolicyBuilder to allow the cid "protocol", using the allowUrlProtocols method:

    PolicyFactory IMAGES = new HtmlPolicyBuilder().allowUrlProtocols("http", "https")
            .allowElements("img")
            .allowUrlProtocols("cid") // Specifically allow "cid"
            .allowAttributes("src").matching(Pattern.compile("^cid[:][\\w]+$"))
            .onElements("img")
            .allowAttributes("border", "height", "width").onElements("img")
            .toFactory();

    String html = "<img src=\"cid:image\"  height=\"70\" width=\"70\" />";
    final String sanitized = IMAGES.sanitize(html);

    System.out.println(sanitized);

Output:

<img src="cid:image" height="70" width="70" />
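One caveat about the matching pattern from the question: [\w]+ only accepts letters, digits and underscores, so content IDs containing characters such as ".", "@" or "-" would still be dropped. The sketch below checks this with the JDK regex classes alone; the RELAXED pattern is an assumption about what your content IDs look like, not something prescribed by the library, so adjust it to the IDs your mail code actually generates.

```java
import java.util.regex.Pattern;

// Hypothetical helper comparing the question's pattern with a looser one.
final class CidPatternCheck {
    // The pattern used in the question and in the fix above.
    static final Pattern ORIGINAL = Pattern.compile("^cid[:][\\w]+$");

    // Assumed alternative that also accepts '.', '@' and '-' in the content ID.
    static final Pattern RELAXED = Pattern.compile("^cid:[\\w.@-]+$");

    static boolean accepts(Pattern pattern, String src) {
        return pattern.matcher(src).matches();
    }

    public static void main(String[] args) {
        String simple = "cid:image";
        String dotted = "cid:image001.png@01D0ABCD";
        System.out.println(accepts(ORIGINAL, simple)); // true
        System.out.println(accepts(ORIGINAL, dotted)); // false
        System.out.println(accepts(RELAXED, dotted));  // true
    }
}
```

Either pattern still rejects values such as javascript:alert(1), because the value must start with cid: to match.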
