
New Way To Prevent XSS Attacks

I have an entertainment website, and I have thought of a new method to prevent XSS attacks. I have created the following word list:

alert(, javascript, <script>,<script,vbscript,<layer>,
<layer,scriptalert,HTTP-EQUIV,mocha:,<object>,<object,
AllowScriptAccess,text/javascript,<link>, <link,<?php, <?import,

Because my site is about entertainment, I do not expect a normal (non-malicious) user to use such words in a comment. So I have decided to remove all of the comma-separated words above from user-submitted strings. I need your advice: after doing this, do I still need to use a tool like HTMLPurifier?

Note: I am not using htmlspecialchars() because it would also convert the tags generated by my rich text editor (CKEditor), so the user's formatting would be lost.

Using a blacklist is a bad idea because it is simple to circumvent. For example, you are checking for, and presumably removing, <script>. To circumvent this, a malicious user can enter:

<scri<script>pt> 

Your code will strip out the middle <script>, and the remaining outer fragments join into an intact <script> that is saved to the page.
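A minimal sketch of that bypass, assuming the filter is a single-pass str_replace() that removes only the literal string <script> (naive_strip is a hypothetical name, not the asker's actual code):

```php
<?php
// Hypothetical single-pass filter: removes each literal "<script>" once.
function naive_strip(string $input): string
{
    // str_replace() scans left to right and does not rescan its own output.
    return str_replace('<script>', '', $input);
}

$payload = '<scri<script>pt>alert(1)</script>';

// Removing the inner "<script>" joins "<scri" and "pt>" into a new tag.
echo naive_strip($payload), PHP_EOL; // prints: <script>alert(1)</script>
```

Running the filter repeatedly until the string stops changing closes this particular hole, but it does nothing about payloads that never contain a blacklisted word.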

If you need to enter HTML and your users do not, then prevent them from entering HTML. You need a separate method, accessible only to you, for entering articles that contain HTML.

This approach misunderstands what the HTML-injection problem is, and is utterly ineffective.

There are many, many more ways to put scripting in HTML than the above list, and many ways to evade the filter by using escaped forms. You will never catch all potentially harmful constructs with this kind of naive sequence blacklisting, and if you try, you will inconvenience users who post genuine comments (e.g. by banning words beginning with "on").
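Both problems can be illustrated with the word list from the question. The sketch below assumes the filter is a case-insensitive str_ireplace() over that list; blacklist_filter is a hypothetical name, and the payload is just one of many that contain none of the listed words:

```php
<?php
// The word list from the question.
$blacklist = [
    'alert(', 'javascript', '<script>', '<script', 'vbscript', '<layer>',
    '<layer', 'scriptalert', 'HTTP-EQUIV', 'mocha:', '<object>', '<object',
    'AllowScriptAccess', 'text/javascript', '<link>', '<link', '<?php', '<?import',
];

// Hypothetical filter: remove every listed word, ignoring case.
function blacklist_filter(string $input, array $blacklist): string
{
    return str_ireplace($blacklist, '', $input);
}

// An event-handler payload contains no listed word and passes untouched.
$payload = '<img src="x" onerror="prompt(document.cookie)">';
var_dump(blacklist_filter($payload, $blacklist) === $payload); // bool(true)

// A harmless comment, on the other hand, gets mangled.
var_dump(blacklist_filter('I am learning JavaScript', $blacklist)); // string(14) "I am learning "
```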

The correct way to prevent HTML-injection XSS is:

  • use htmlspecialchars() when outputting content that is supposed to be normal text (which is the vast majority of content);

  • if you need to allow user-supplied HTML markup, whitelist the harmless tags and attributes you wish to allow, and enforce that using HTMLPurifier or another similar library.

This is a standard and well-understood part of writing a web application, and is not difficult to implement.
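For the plain-text case, a minimal sketch of escaping at output time might look like this. The helper name e() and the chosen flags are assumptions for illustration, not something the answer prescribes:

```php
<?php
// Hypothetical helper: escape a string for use inside HTML text or attributes.
function e(string $text): string
{
    // ENT_QUOTES also encodes single quotes; ENT_SUBSTITUTE replaces
    // invalid byte sequences instead of returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$comment = '<script>alert("xss")</script>';

// The browser shows the markup as literal text instead of running it.
echo '<p>' . e($comment) . '</p>', PHP_EOL;
// prints: <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
```

The raw comment is stored unchanged and escaped only when it is written into the page, so the same data can still be used safely in other contexts.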

Why not create a function that reverts the changes made by htmlspecialchars() for the specific tags you want, such as <b>, <i>, <a>, etc.?

Hacks to circumvent your list aside, it's always better to use a whitelist than a blacklist.

In this case, you would already have a clear list of tags that you want to support, so just whitelist tags like <em>, <b>, etc., using an HTML purifier.
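A whitelist with HTMLPurifier could be sketched roughly as follows. This assumes the library was installed through Composer (package ezyang/htmlpurifier) and that vendor/autoload.php sits next to the script. The allowed-tag list is only an example and should match what your editor actually produces:

```php
<?php
// Assumption: HTMLPurifier installed via Composer (ezyang/htmlpurifier).
require_once __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();
// Whitelist: only these tags, plus the href attribute on <a>, are kept.
// The list itself is an illustrative choice, not a recommendation.
$config->set('HTML.Allowed', 'p,br,b,i,em,strong,a[href]');

$purifier = new HTMLPurifier($config);

// Example input mixing allowed formatting with script injection attempts.
$dirty = '<b>bold</b><script>alert(1)</script>'
       . '<a href="javascript:alert(1)" onclick="x()">link</a>';

// Anything outside the whitelist is dropped from the result.
echo $purifier->purify($dirty), PHP_EOL;
```

Building the purifier once and reusing it for every comment avoids repeating the configuration work on each call.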

You can try:

htmlentities()

echo htmlentities("<b>test word</b>");

Output: &lt;b&gt;test word&lt;/b&gt;

strip_tags()

echo strip_tags("<b>test word</b>");

Output: test word

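mysql_real_escape_string() (note that this escapes strings for SQL queries, not for HTML output, so it does not protect against XSS; the mysql_* functions were also removed in PHP 7)

Or try a simple function: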

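function clean_string($str) {
  // get_magic_quotes_gpc() was removed in PHP 8, and addslashes() does not
  // help against XSS, so that step is dropped here.
  // Strip tags first: after htmlspecialchars() there are no tags left to strip.
  $str = strip_tags($str);
  // Encode what remains for safe output as plain text.
  return htmlspecialchars($str, ENT_QUOTES, 'UTF-8');
}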


 