
Sanitize filter_var PHP string but keep " '

I am sanitizing a contact form string:

$note = filter_var($_POST["note"], FILTER_SANITIZE_STRING);

Which works great except when people write in inches (") and feet ('). So I'm interested in 5" 8" 10" & 1' comes up with its quotes replaced by numeric HTML entities (&#34; for " and &#39; for '), which is a bit of a garbled mess.

Can I sanitize yet keep my I'm 5'9"?

Computer data itself is neither harmful nor innocuous. It's just a piece of information that can later be used for a given purpose.

Sometimes, data is used as computer source code and such code eventually leads to physical actions (a disk spins, an LED blinks, a picture is uploaded to a remote computer, a thermostat turns off the boiler...). It's then (and only then) that data can become harmful; we even lose expensive spaceships now and then because of software bugs.

Code you write yourself can be as harmful or innocuous as your abilities or good faith dictate. The big problem comes when your application has a vulnerability that allows execution of untrusted third-party code. This is particularly serious in web applications, which are connected to the open internet and are expected to receive data from anywhere in the world. But how is that physically possible? There are several ways, but the most typical case is dynamically generated code, and this happens all the time on the modern web. You use PHP to generate SQL, HTML, JavaScript... If you pick untrusted arbitrary data (e.g. a URL parameter or a form field) and use it to compose code that will later be executed (either by your server or by the visitor's browser), someone can be hacked (either you or your users).

You'll see that every day here at Stack Overflow:

$username = $_POST["username"];
$row = mysql_fetch_array(mysql_query("select * from users where username='$username'"));
<td><?php echo $row["title"]; ?></td>
var id = "<?php echo $_GET["id"]; ?>";

Faced with this problem, some claim: let's sanitize! It's obvious that some characters are evil, so we'll remove them all and we're done, right? And then we see stuff like this:

$username = $_POST["username"];
$username = strip_tags($username);
$username = htmlentities($username);
$username = stripslashes($username);
$row = mysql_fetch_array(mysql_query("select * from users where username='$username'"));

This is a surprisingly widespread misconception adopted even by some professionals. You see the symptoms everywhere: your comment is mutilated at the first < symbol, you get "your password cannot contain spaces" on sign-up and you read "Why can't I use certain words like 'drop' as part of my Security Question answers?" in the FAQ. It's even inside computer languages: whenever you read "sanitize", "escape"... in a function name (without further context), you have a good hint that it might be a misguided effort.
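To see why blanket sanitizing damages legitimate data, here is a minimal sketch that runs the chain above on a harmless measurement. The `$note` value is made up for illustration, and nothing here touches a database.

```php
<?php
// Hypothetical, perfectly harmless user input (a measurement in inches and feet).
$note = '5" & 1\'';

// The "sanitize everything" chain from the snippet above.
$sanitized = stripslashes(htmlentities(strip_tags($note)));

// The double quote and the ampersand have been turned into HTML entities,
// even though the value was headed for an SQL query, where HTML escaping
// means nothing. The stored text is no longer what the user typed.
echo $sanitized, PHP_EOL;
```

The exact output depends on the PHP version (the default flags of `htmlentities()` changed in PHP 8.1), but in every case the result differs from the original input.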

It's all about establishing a clear separation between data and code: the user provides data but only you provide code. And there isn't a universal one-size-fits-all solution because each computer language has its own syntax and rules. DROP TABLE users; can be terribly dangerous in SQL:

mysql> DROP TABLE users;
Query OK, 56020 rows affected (0.52 sec)

(oops!)... but it's not as bad in e.g. JavaScript. Look, it doesn't even run:

C:\>node
> DROP TABLE users;
SyntaxError: Unexpected identifier
    at Object.exports.createScript (vm.js:24:10)
    at REPLServer.defaultEval (repl.js:235:25)
    at bound (domain.js:287:14)
    at REPLServer.runBound [as eval] (domain.js:300:12)
    at REPLServer.<anonymous> (repl.js:427:12)
    at emitOne (events.js:95:20)
    at REPLServer.emit (events.js:182:7)
    at REPLServer.Interface._onLine (readline.js:211:10)
    at REPLServer.Interface._line (readline.js:550:8)
    at REPLServer.Interface._ttyWrite (readline.js:827:14)
>

This last example also illustrates that it's not only a security concern. Even if you're not being hacked, generating code from random input can simply make your app crash:

 SELECT * FROM customers WHERE last_name='O'Brian'; 

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Brian''
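The crash can be reproduced with a few lines. This is only a sketch under stated assumptions: it needs the `pdo_sqlite` extension and uses an in-memory SQLite database as a stand-in for MySQL, so the error text will differ from the one quoted above.

```php
<?php
// Assumption: pdo_sqlite is available; SQLite stands in for the MySQL server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE customers (last_name TEXT)');

$lastName = "O'Brian"; // a legitimate surname, not an attack

// Building code from data: the apostrophe ends the SQL string literal early.
$sql = "SELECT * FROM customers WHERE last_name='$lastName'";

$failed = false;
try {
    $pdo->query($sql);
} catch (PDOException $e) {
    // The database rejects the statement as invalid SQL.
    $failed = true;
    echo 'Query failed: ', $e->getMessage(), PHP_EOL;
}
```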

So, what shall be done then if there isn't a universal solution?

  1. Understand the problem:

    If you inject raw literal data improperly it can become code (and sometimes invalid code).

  2. Use the specific mechanism for each technology:

    If the target language requires escaping:

    <p><3 to code</p>
    <p>&lt;3 to code</p>

    ... find a specific tool to escape in the source language:

     echo '<p>' . htmlspecialchars($motto) . '</p>'; 

    If the language/framework/technology allows sending data in a separate channel, do it:

      $sql = 'SELECT password_hash FROM user WHERE username=:username';
      $params = array(
          'username' => $username,
      );
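Applied to the contact form from the question, both mechanisms might look like the sketch below. It is not a definitive implementation: it assumes the `pdo_sqlite` extension, the `message` table and its `note` column are hypothetical, and `$note` stands in for `$_POST["note"]`.

```php
<?php
// Assumptions: pdo_sqlite is available; table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE message (id INTEGER PRIMARY KEY, note TEXT)');

// Stand-in for $_POST["note"]: stored exactly as typed, no sanitizing.
$note = 'I\'m interested in 5" 8" 10" & 1\'';

// SQL context: the data travels in a separate channel (a bound parameter),
// so quotes in the note can never be parsed as SQL.
$insert = $pdo->prepare('INSERT INTO message (note) VALUES (:note)');
$insert->execute(array('note' => $note));

$stored = $pdo->query('SELECT note FROM message')->fetchColumn();

// HTML context: escape at output time, with the tool made for HTML.
$html = '<p>' . htmlspecialchars($stored, ENT_QUOTES, 'UTF-8') . '</p>';
echo $html, PHP_EOL;
```

The database keeps the note byte for byte, and the entities produced by `htmlspecialchars()` are rendered by the browser as the original quotes and ampersand, so the reader sees 5" and 1' as typed.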
