
How to output HTML but prevent XSS attacks

I wrote a PHP script that fetches email content.

The content is in HTML format.

I'd like to display it, as below:

<?php 
$email_content = '
    <html>
        <script>alert("XSS");</script>
        <body>
            <div>Line1</div>
            <div>Line2</div>
        </body>
    </html>
';
echo $email_content;
?>

As you can see, this allows XSS attacks. But if I use the htmlspecialchars function, the HTML is no longer rendered as HTML. What should I do in this case? Thanks.

HTMLPurifier can do that:

require_once '/path/to/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify($dirty_html);

It takes dirty HTML (i.e. HTML possibly containing JavaScript) and removes any script.

PHP doesn't have anything native or built in that can remove JavaScript the way HTMLPurifier does. You could use DOMDocument, but this would be a lengthy task because JavaScript can execute in some attributes (onerror, onclick) and is not limited to <script></script>.
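To illustrate why the DOMDocument route gets lengthy, here is a minimal sketch that handles only the two cases named above: <script> elements and attributes whose names start with "on". The function name remove_script_sketch is hypothetical, and the sketch is deliberately incomplete (it ignores javascript: URLs, inline styles, and other vectors), so treat it as an illustration rather than a sanitizer.

```php
<?php
// Hypothetical helper for illustration only; NOT a complete sanitizer.
function remove_script_sketch(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml warnings about malformed markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Remove every <script> element together with its contents.
    // The node list is copied to an array first so removal does not disturb iteration.
    foreach (iterator_to_array($xpath->query('//script')) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Remove attributes whose name starts with "on" (onclick, onerror, ...).
    foreach (iterator_to_array($xpath->query('//@*[starts-with(name(), "on")]')) as $attr) {
        $attr->ownerElement->removeAttributeNode($attr);
    }

    // Returns a full document: the parser adds a doctype and <html>/<body> wrappers.
    return $doc->saveHTML();
}

$out = remove_script_sketch('<div onclick="alert(1)">Line1</div><script>alert("XSS");</script>');
echo $out;
```

Even this short version covers only part of the attack surface, which is the reason a maintained library is usually the safer choice.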

You should use the strip_tags() function and allow only the tags that you want users to add.

echo strip_tags($text, '<p><a>');

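This line allows <p> and <a> tags; every other tag will be removed. Note that strip_tags() leaves the attributes of allowed tags in place, so an allowed tag can still carry an event handler such as onclick.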
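To see what strip_tags() keeps and what it drops, here is a small sketch. The input string is made up for this example.

```php
<?php
// Illustrative input; the markup is invented for this example.
$text = '<p onclick="alert(1)">Hello</p><script>alert("XSS");</script><b>bold</b>';

// Keep only <p> and <a>. Other tags are removed, but the text inside them stays.
$stripped = strip_tags($text, '<p><a>');

echo $stripped, PHP_EOL;
// prints: <p onclick="alert(1)">Hello</p>alert("XSS");bold
```

The <script> tags are gone, but their text content remains, and the onclick attribute on the allowed <p> tag survives unchanged.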

htmlspecialchars() works completely differently: it escapes special characters instead of removing tags.

From the manual:

The translations performed are:

 '&' (ampersand) becomes '&amp;'
 '"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
 "'" (single quote) becomes '&#039;' (or &apos;) only when ENT_QUOTES is set.
 '<' (less than) becomes '&lt;'
 '>' (greater than) becomes '&gt;'
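The translations listed above can be checked with a short sketch. ENT_QUOTES and the UTF-8 charset are passed explicitly here so the result does not depend on the PHP version's defaults; the input string is made up for this example.

```php
<?php
// Illustrative input containing all five special characters.
$html = '<div class="a">Tom & Jerry\'s</div>';

// ENT_QUOTES converts both double and single quotes.
$escaped = htmlspecialchars($html, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// prints: &lt;div class=&quot;a&quot;&gt;Tom &amp; Jerry&#039;s&lt;/div&gt;
```

A browser shows this output as literal text rather than as markup, which is why it is safe but does not preserve the email's HTML formatting.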

There is a very nice article about XSS prevention and CSRF prevention; it is worth reading.
