
Ways to prevent SQL Injection Attack & XSS in Java Web Application

Posted 2009-01-27 20:12:16 | Tags: java, regex, xss, sql-injection

I'm writing a Java class that will be invoked by a servlet filter to check for injection attack attempts and XSS in a Struts-based Java web application. The InjectionAttackChecker class uses regular expressions and the java.util.regex.Pattern class to validate input against the specified patterns.

With that said, I have the following questions:

Which special characters and character patterns (for example <>, ., --, <=, ==, >=) should be blocked so that injection attacks are prevented?
Is there an existing regex pattern that I could use as is?
I have to allow some special character patterns in specific cases. Some example values that must be allowed (using the pipe character | as a separator between values) are: *Atlanta | #654,BLDG 8 #501 | Herpes simplex: chronic ulcer(s) (>1 mo. duration) or bronchitis, pneumonitis, or esophagitis | FUNC & COMP(date_cmp), "NDI & MALKP & HARS_IN(icd10, yes)". What strategy should I adopt so that injection attacks and XSS are prevented while these character patterns are still allowed?

I hope I have stated the question clearly. If I haven't, I apologize, as it's just my 2nd question. Please let me know if any clarification is needed.

6 answers

Based on your questions, I am assuming you are attempting to filter bad values. I personally feel that this method can get very complex very quickly, and I would recommend encoding values as an alternative. Here is an IBM article on the subject that lays out the pros and cons of both methods: http://www.ibm.com/developerworks/tivoli/library/s-csscript/

To avoid SQL injection attacks, just use prepared statements instead of building SQL strings.
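
As a minimal sketch of what that looks like with plain JDBC: the class name PatientLookup, the patients table, and its id and city columns are all made up for illustration and are not part of the question. The point is only that the SQL text stays constant and the user-supplied value is bound as a parameter.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical lookup class; table and column names are assumptions.
final class PatientLookup {

    // The SQL text is a constant: user input never becomes part of it.
    static final String FIND_BY_CITY = "SELECT id FROM patients WHERE city = ?";

    private PatientLookup() {}

    static List<Long> findIdsByCity(Connection conn, String city) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(FIND_BY_CITY)) {
            // The value is bound as a parameter, so quotes or "--" inside it
            // are handled as data rather than as SQL syntax.
            ps.setString(1, city);
            try (ResultSet rs = ps.executeQuery()) {
                List<Long> ids = new ArrayList<>();
                while (rs.next()) {
                    ids.add(rs.getLong("id"));
                }
                return ids;
            }
        }
    }
}
```

With this shape, a value such as *Atlanta or #654,BLDG 8 #501 from the question needs no special handling on the SQL side, because it is never concatenated into the statement.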

If you attempt to sanitize all the data on input, you're going to have a very difficult time of it. There are tons of tricks involving character encoding and the like that allow people to circumvent your filters. This impressive list is only some of the myriad things that can be done as SQL injections. You've also got to prevent HTML injection, JS injection, and potentially others. The only sure way of doing this is to encode the data where it is used in your application. Encode all the output you write to your web site, and encode all of your SQL parameters. Be especially careful with the latter, as normal encoding will not work for non-string SQL parameters, as explained in that link. Use parameterized queries to be completely safe. Also note that you could theoretically encode your data at the time the user enters it and store it encoded in the database, but that only works if you're always going to use the data in ways that match that type of encoding (i.e. HTML encoding works if the data will only ever be used in HTML; if it's used in SQL, you're not going to be protected). This is partially why the rule of thumb is to never store encoded data in the database and to always encode on use.
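
To make "encode on use" concrete for the HTML case, here is a small sketch of an output encoder. The class and method names (HtmlEncoder, escapeHtml) are hypothetical, and it only covers text written into an element body or a quoted attribute value; other contexts such as JavaScript or URLs need their own encoding. In a real application the template engine or a maintained encoding library would normally do this job.

```java
// Hypothetical helper: escapes text at the moment it is written into HTML.
final class HtmlEncoder {

    private HtmlEncoder() {}

    /** Escapes text for an HTML element body or a quoted attribute value. */
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

The value stays unmodified in the database and is only escaped when rendered: escapeHtml("(>1 mo. duration)") yields "(&gt;1 mo. duration)", which the browser displays exactly as the user typed it.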

Validating and binding all data is a must. Perform both client-side and server-side validation, because 10% of people turn off JavaScript in their browsers.
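
A rough sketch of the server-side half, assuming a per-field allow-list: the class FieldValidator and the rule for an address field (letters, digits, spaces, and the characters # , . - with a length of 1 to 100) are invented for illustration and would have to be adapted to each real field. Validation of this kind checks that a value is well-formed for its field; it sits alongside parameter binding and output encoding rather than replacing them.

```java
import java.util.regex.Pattern;

// Hypothetical per-field validator; the address rule is an assumption.
final class FieldValidator {

    // Allow-list: letters, digits, space, and # , . - only; 1 to 100 characters.
    private static final Pattern ADDRESS =
            Pattern.compile("[\\p{L}\\p{N} #,.\\-]{1,100}");

    private FieldValidator() {}

    /** Returns true only if the whole value matches the address allow-list. */
    static boolean isValidAddress(String value) {
        return value != null && ADDRESS.matcher(value).matches();
    }
}
```

Because matches() requires the entire input to fit the pattern, the address example from the question, #654,BLDG 8 #501, is accepted, while a value containing angle brackets is rejected for this field.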

Jeff Atwood has a nice blog post on the topic that gives you a flavor of its complexity.

Here's a pretty extensive article on that very subject.

I don't think you'll find a holy grail here, though. I would also suggest trying to encode/decode the received text in some standard way (uuencode, base64).

Don't filter or block values.

You should ensure that when combining bits of text you do the proper type conversions :) i.e. if you have a string of type HTML and a string of type TEXT, you should convert the TEXT to HTML instead of blindly concatenating them. In Haskell you can conveniently enforce this with the type system.
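
The same idea can be approximated in Java with small wrapper types. This is only a sketch: the names Markup, Text, Html, toHtml, trusted, and append are all hypothetical. Html has no public way to be built from user input, so the only route from Text to Html goes through escaping.

```java
// Hypothetical wrapper types that keep plain text and HTML apart.
final class Markup {

    private Markup() {}

    /** Plain text: nothing in it has been escaped yet. */
    static final class Text {
        final String value;

        Text(String value) {
            this.value = value;
        }

        /** The only conversion from Text to Html goes through escaping. */
        Html toHtml() {
            return new Html(value
                    .replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;"));
        }
    }

    /** Markup that is already safe to write into a page body. */
    static final class Html {
        final String value;

        private Html(String value) {
            this.value = value;
        }

        /** For markup literals written by the developer, never for user input. */
        static Html trusted(String literal) {
            return new Html(literal);
        }

        /** Html can only be joined with Html, so raw Text cannot slip in. */
        Html append(Html other) {
            return new Html(value + other.value);
        }
    }
}
```

With these types, Markup.Html.trusted("<b>").append(userText) does not compile when userText is a Text; the caller has to write userText.toHtml(), which is exactly the conversion the answer asks for.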

Good HTML templating languages will escape by default. If you are generating XML/HTML, then sometimes it is better to use DOM tools than a templating language. If you use a DOM tool, it removes a lot of these issues. Unfortunately, DOM tools are usually crap compared to templating :)

If you take strings of type HTML from users, you should sanitize them with a library to remove all not-good tags/attributes. There are lots of good whitelist HTML filters out there.
You should always use parameterized queries. ALWAYS! If you have to build up queries dynamically, then build them up dynamically with parameters. Don't ever combine non-SQL typed strings with SQL typed strings.
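
For the dynamic case, one possible sketch is to assemble the SQL text from fixed fragments and collect the user-supplied values in a separate list. Everything named here (PatientSearch, the patients table, the city and diagnosis_text columns, the filter names) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical dynamic query: SQL text and parameter values are kept apart.
final class PatientSearch {

    // Filter names accepted from the request, mapped to fixed column names.
    private static final Map<String, String> COLUMNS = new LinkedHashMap<>();
    static {
        COLUMNS.put("city", "city");
        COLUMNS.put("diagnosis", "diagnosis_text");
    }

    final String sql;          // contains only fixed fragments and ? placeholders
    final List<String> params; // user-supplied values, in placeholder order

    private PatientSearch(String sql, List<String> params) {
        this.sql = sql;
        this.params = params;
    }

    static PatientSearch build(Map<String, String> filters) {
        StringBuilder sql = new StringBuilder("SELECT id FROM patients WHERE 1 = 1");
        List<String> params = new ArrayList<>();
        // Iterate over the known columns, not over the request, so an
        // unknown filter name can never reach the SQL text.
        for (Map.Entry<String, String> column : COLUMNS.entrySet()) {
            String value = filters.get(column.getKey());
            if (value != null) {
                sql.append(" AND ").append(column.getValue()).append(" = ?");
                params.add(value);
            }
        }
        return new PatientSearch(sql.toString(), Collections.unmodifiableList(params));
    }
}
```

The resulting sql would be passed to Connection.prepareStatement, and each entry of params bound with PreparedStatement.setString at index i + 1. The query shape varies with the filters, but user input only ever travels through the parameter list.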

Take a look at the AntiSamy project [www.owasp.org]. I think it is exactly what you want; you can set up a filter to block certain tags. They also supply policy templates; the slashdot policy would be a good start, then add on the tags you require.