Images not uploading when htmlentities has 'UTF-8' set

I have a form that, among other things, accepts an image for upload and sticks it in the database. Previously I had a function filtering the POSTed data that was basically:

function processInput($stuff) {
    $formdata = $stuff;
    $formdata = htmlentities($formdata, ENT_QUOTES);
    return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}

Then, in an effort to fix some weird entities that weren't getting converted properly, I changed the function to this (the only change is the added 'UTF-8' argument in htmlentities):

function processInput($stuff) {
    $formdata = $stuff;
    $formdata = htmlentities($formdata, ENT_QUOTES, 'UTF-8'); //added UTF-8
    return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}

And now images will not upload.

What would be causing this? Simply removing the 'UTF-8' bit allows images to upload properly, but then some of the MS Word entities that users put into the system show up as gibberish. What is going on?

EDIT: Since I cannot do much to change the code on this beast, I slapped a bandaid on it by using htmlspecialchars() rather than htmlentities(). That seems to at least leave the image data untouched while still converting things like quotes and angle brackets. bobince's advice is excellent, but in this case I cannot spend the time needed to fix the messy legacy code in this project. Most stuff I deal with is object oriented and framework based, but now I see first hand what people mean when they talk about "spaghetti code" in PHP.

function processInput($stuff) {
    $formdata = $stuff;
    $formdata = htmlentities($formdata, ENT_QUOTES);
    return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}

This function represents a basic misunderstanding of string processing, one common to PHP programmers.

SQL-escaping, HTML-escaping and input validation are three separate functions, to be used at different stages of your script. It makes no sense to try to do them all in one go; it will only result in characters that are 'special' to any one of the processes getting mangled when used in the other parts of the script. You can tinker with this function to fix mangling in one part of the app, but you'll break something else.

Why are images being mangled? Well, it's not immediately clear by what path the image data gets from a $_FILES temporary upload file to the database. If this function is involved at any point, though, it's going to completely ruin the binary content of an image file. Backslashes removed and HTML-escaped... no image could survive that.
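As a rough illustration of that point, the sketch below runs a few bytes through the same calls. It assumes the image bytes really do pass through processInput, which the question does not confirm; the sample bytes and variable names are made up for the example. The PHP manual documents that htmlentities returns an empty string when the input is not valid in the given encoding (unless ENT_IGNORE or ENT_SUBSTITUTE is set), and arbitrary binary data is almost never valid UTF-8.

```php
<?php
// Hypothetical sample: the first bytes of a typical JPEG file.
// The byte 0xFF can never appear in valid UTF-8.
$binary = "\xFF\xD8\xFF\xE0\x00\x10JFIF";

// With 'UTF-8', invalid input makes htmlentities return an empty string.
$asUtf8 = htmlentities($binary, ENT_QUOTES, 'UTF-8');

// With a single-byte charset every byte is "valid", but high bytes are
// still rewritten as named entities, so the data is changed anyway.
$asLatin1 = htmlentities($binary, ENT_QUOTES, 'ISO-8859-1');

// stripslashes silently drops any backslash byte (0x5C) in the data.
$withSlash = "\xFF\\\xD8";              // three bytes: FF 5C D8
$stripped  = stripslashes($withSlash);  // two bytes:   FF D8

var_dump($asUtf8 === '');
var_dump($asLatin1 === $binary);
var_dump(strlen($withSlash), strlen($stripped));
```

Either way the bytes that reach the database are no longer the bytes that were uploaded, so binary data should bypass text escaping entirely.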

  1. mysql_real_escape_string is for escaping some text for inclusion in a MySQL string literal. It should be used always-and-only when making an SQL string literal with inserted text, and not globally applied to input, because some things that come in the input aren't going immediately or solely to the database. For example, if you echo one of the input values to the HTML page, you'll find you get a bunch of unwanted backslashes in it when it contains characters like ' . This is how you end up with pages full of runaway backslashes.

    (Even then, parameterised queries are generally preferable to manual string hacking and mysql_real_escape_string. They hide the details of string escaping from you so you don't get confused by them.)

  2. htmlentities is for escaping text for inclusion in an HTML page. It should be used always-and-only in the output templating bit of your PHP. It is inappropriate to run it globally over all your input, because not everything is going to end up in an HTML page, or solely in an HTML page. Most probably it's going to go to the database first, where you absolutely don't want a load of &lt; and &amp; rubbish making your text fail to search or substring reliably.

    (Even then, htmlspecialchars is generally preferable to htmlentities, as it only encodes the characters that really need it. htmlentities will add needless escaping, and unless you tell it the right encoding it'll also totally mess up all your non-ASCII characters. htmlentities should almost never be used.)

  3. As for stripslashes ... well, you sometimes need to apply that to input, but only when the idiotic magic_quotes_gpc option is turned on. You certainly shouldn't apply it all the time, only when you detect that magic_quotes_gpc is on. That option is long deprecated and thankfully dying out, so it's probably just as good to bomb out with an error message if you detect it being turned on. Then you could chuck the whole processInput thing away.
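A minimal sketch of the conditional stripslashes idea from point 3. The helper name rawInput is made up for illustration. The function_exists guard is there because get_magic_quotes_gpc was removed in PHP 8 (and deprecated in 7.4, hence the @), so on a current PHP this helper simply returns its input unchanged.

```php
<?php
// Hypothetical helper: undo magic_quotes_gpc only when it is actually on.
function rawInput(string $value): string
{
    // get_magic_quotes_gpc() no longer exists in PHP 8, so check first.
    $magicQuotesOn = function_exists('get_magic_quotes_gpc')
        && @get_magic_quotes_gpc();

    return $magicQuotesOn ? stripslashes($value) : $value;
}

// Binary-looking or backslash-containing input is left alone when the
// option is off (which is always the case on PHP 5.4 and later).
$value = rawInput("C:\\temp\\photo.jpg");
echo $value, "\n";
```

The alternative suggested above, refusing to run at all when the option is on, would replace the ternary with an error and exit.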

To summarise:

  • At start time, do no global input processing. You can do application-specific validation here if you want, like checking that a phone number is just digits, or removing control characters from text, but there should be no escaping happening here.

  • When making an SQL query with a string literal in it, use SQL-escaping on the value as it goes into the string: $query= "SELECT * FROM t WHERE name='".mysql_real_escape_string($name)."'"; You can define a function with a shorter name to do the escaping to save some typing. Or, more readably, use parameterisation.

  • When making HTML output with strings from the input or the database or elsewhere, use HTML-escaping, e.g.: <p>Hello, <?php echo htmlspecialchars($name); ?>!</p> Again, you can define a function with a short name to do echo htmlspecialchars to save on typing.
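The three stages in the summary can be sketched end to end as follows. This is an illustration under assumptions, not the code from the question: it uses PDO with an in-memory SQLite database (so it needs the pdo_sqlite extension) in place of the mysql_* functions, which no longer exist in current PHP, and the helper name h and the sample value are made up.

```php
<?php
// Assumption: pdo_sqlite is available; SQLite stands in for MySQL here.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE t (name TEXT)');

// Input stage: keep the raw value, no escaping of any kind.
$name = "O'Reilly <b>&</b> Sons";

// SQL stage: a parameterised query, so no manual SQL-escaping is needed.
$insert = $db->prepare('INSERT INTO t (name) VALUES (?)');
$insert->execute([$name]);

$select = $db->prepare('SELECT name FROM t WHERE name = ?');
$select->execute([$name]);
$stored = $select->fetchColumn();   // comes back exactly as it went in

// Output stage: HTML-escape only at the moment of writing HTML.
function h(string $s): string
{
    return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}

$html = '<p>Hello, ' . h($stored) . '!</p>';
echo $html, "\n";
```

Because the database holds the raw text, searching and substring operations work on the real characters, and the same stored value can later be sent to a non-HTML destination without un-escaping anything.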
