
Shortcut to escaping to prevent XSS

I've just discovered that my website (HTML/PHP) is vulnerable to XSS attacks.
Is there any way to sanitize my data besides manually adding htmlspecialchars to each individual variable that I send to the webpage (and probably missing a few, thereby leaving it still open to attack)?

No, there is no shortcut. Data always needs to be escaped on a case-by-case basis, and not only for HTML but for any other textual format as well (SQL, JSON, CSV, and so on). The "trick" is to use tools that take care of this for you, so that there are fewer chances to miss something. If you're just echoing strings into other strings, you're working at the bare-metal level and you need a lot of conscious effort to escape everything. The generally accepted alternative is to use a templating language which implicitly escapes everything.
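To make "case by case" concrete, here is a minimal sketch of the same untrusted value being prepared for two different output formats. The helper name e() and the sample value are hypothetical, chosen only for illustration; htmlspecialchars and json_encode are standard PHP functions.

```php
<?php
// Hypothetical helper: escape a value for an HTML text or attribute context.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Sample untrusted input (an assumption for this sketch).
$name = '"><script>alert(1)</script>';

// HTML context: markup characters are turned into entities.
$html = '<p title="' . e($name) . '">' . e($name) . '</p>';

// JSON context: json_encode does the quoting; the JSON_HEX_* flags also
// keep <, >, &, ' and " out of the output, which matters if the JSON is
// embedded in a <script> block.
$json = json_encode(
    ['name' => $name],
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
);

echo $html, "\n", $json, "\n";
```

Each output site needs the call that matches its format, and forgetting one is enough to reopen the hole. That is exactly the effort a templating language is meant to take off your hands.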

For example, Twig:

The PHP language is verbose and becomes ridiculously verbose when it comes to output escaping:

    <?php echo $var ?>
    <?php echo htmlspecialchars($var, ENT_QUOTES, 'UTF-8') ?>

In comparison, Twig has a very concise syntax, which makes templates more readable:

    {{ var }}
    {{ var|escape }}
    {{ var|e }}         {# shortcut to escape a variable #}

To be on the safe side, you can enable automatic output escaping globally or for a block of code:

    {% autoescape true %}
        {{ var }}
        {{ var|raw }}     {# var won't be escaped #}
        {{ var|escape }}  {# var won't be double-escaped #}
    {% endautoescape %}

This still lets you shoot yourself in the foot, but is a lot better.
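The idea behind escape-by-default can be sketched in a few lines of plain PHP. This is only an illustration of the principle, not how Twig is implemented: the Raw class and the render() function are hypothetical names invented for this example.

```php
<?php
// Hypothetical marker for values the author has explicitly declared safe,
// loosely analogous to opting out with Twig's |raw filter.
final class Raw
{
    public $html;

    public function __construct(string $html)
    {
        $this->html = $html;
    }
}

// Hypothetical renderer: replaces {{ name }} placeholders and escapes every
// value unless it was deliberately wrapped in Raw.
function render(string $template, array $vars): string
{
    return preg_replace_callback(
        '/\{\{\s*(\w+)\s*\}\}/',
        function (array $m) use ($vars): string {
            $value = $vars[$m[1]] ?? '';
            if ($value instanceof Raw) {
                return $value->html; // opted out on purpose
            }
            return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
        },
        $template
    );
}

$out = render('<p>{{ comment }}</p>{{ footer }}', [
    'comment' => '<script>alert(1)</script>', // untrusted, escaped by default
    'footer'  => new Raw('<hr>'),             // trusted, passed through as-is
]);

echo $out, "\n";
// prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p><hr>
```

The safe path is the default and the unsafe path has to be requested by name, so a forgotten escape becomes a visible opt-out rather than a silent omission. Wrapping untrusted data in Raw is the remaining way to shoot yourself in the foot.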

A step further still is PHPTAL:

    <div class="item" tal:repeat="value values">
        <div class="title">
            <span tal:condition="value/hasDate" tal:replace="value/getDate"/>
            <a tal:attributes="href value/getUrl" tal:content="value/getTitle"/>
        </div>
        <div id="content" tal:content="value/getContent"/>
    </div>

It requires you to write valid HTML just for the template to compile, and the template engine is fully aware of HTML syntax: it processes all user data at the level of a DOM instead of a string soup. This relegates HTML to a pure serialisation format (which it should be anyway), produced by a serialiser whose only job is to turn an object-oriented data structure into text. There's no way to mess up that syntax through bad escaping.
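The same "DOM first, text last" principle can be tried out with PHP's bundled DOM extension. This is a rough sketch of the idea rather than of PHPTAL itself; the sample values are made up, and it assumes ext-dom is available.

```php
<?php
// Untrusted values (assumptions for this sketch).
$title = 'Tom & Jerry <script>alert(1)</script>';
$url   = 'https://example.com/?a=1&b=2';

$doc = new DOMDocument('1.0', 'UTF-8');

$div = $doc->createElement('div');
$div->setAttribute('class', 'title');

$link = $doc->createElement('a');
$link->setAttribute('href', $url);                // stored as an attribute value
$link->appendChild($doc->createTextNode($title)); // stored as text, never parsed as markup

$div->appendChild($link);
$doc->appendChild($div);

// Escaping happens in exactly one place: the serialiser.
$html = $doc->saveHTML($div);

echo $html, "\n";
```

User data only ever enters the tree as text nodes or attribute values, so it cannot change the structure of the document. This guards the syntax only; whether a given URL is acceptable as a link target is still a separate validation question.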
