
Undocumented sanitization in PHP file upload

When handling file uploads, the official PHP documentation says the file name should be sanitised against directory traversal and possibly other kinds of attacks:

// basename() may prevent filesystem traversal attacks;
// further validation/sanitation of the filename may be appropriate
$name = basename($_FILES["pictures"]["name"][$key]);

Despite this, I've found that by default the file name is already sanitised by the time it reaches the PHP script.

I have evidence that Apache receives the malicious file name, filename="../file.png", while the PHP script reads a sanitised name from the $_FILES variable instead.

Low-level dump of Apache input:

mod_dumpio: dumpio_in (data-HEAP):
--------------------------eb8b65b665870e02
Content-Disposition: form-data;
name="attachment";
filename="../file.png" ← [Malicious file name]
Content-Type: image/png

PHP script:

echo $_FILES['attachment']['name']; ← [File name already sanitised: 'file.png']

I've observed this behaviour with both the Apache module and php-fpm, on PHP versions 5.5 to 7.2, so I have to conclude that the PHP interpreter performs this sanitisation before passing the variable to the script.

So, thanks PHP for doing the sanitisation for me without my knowledge or consent. However (and this is my question), since this feature is undocumented as far as I know, I'd like to know the sanitisation criteria / regexp / algorithm, to ensure it meets my needs.

You want to look at rfc1867.c; this seems to be the part you are referring to:

SAPI_API SAPI_POST_HANDLER_FUNC(rfc1867_post_handler)

From the comment, it appears that basename() is used to get rid of spurious backslashes, even though a backslash could actually be a legitimate part of the name (I imagine perhaps "Hello\\ World.txt"?). But this is based on IE's behaviour, and the comment states that it might be removed in the future.

So you can't rely on this "sanitisation" continuing to be there.

...

    /* The \ check should technically be needed for win32 systems only where
     * it is a valid path separator. However, IE in all it's wisdom always sends
     * the full path of the file on the user's filesystem, which means that unless
     * the user does basename() they get a bogus file name. Until IE's user base drops
     * to nill or problem is fixed this code must remain enabled for all systems. */

    s = _basename(internal_encoding, filename TSRMLS_CC);
    if (!s) {
        s = filename;
    }
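
To make the effect of that step concrete, here is a minimal sketch in C of a separator-stripping basename. It assumes the only thing that happens is cutting the name after the last `/` or `\`, which matches the quoted comment and the behaviour observed in the question. The function name `upload_basename` is hypothetical and this is not the code from rfc1867.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, NOT PHP's actual implementation: returns a pointer to
 * the part of `path` that follows the last '/' or '\\', or `path` itself when
 * it contains neither separator. */
const char *upload_basename(const char *path)
{
    assert(path != NULL);

    const char *slash = strrchr(path, '/');
    const char *backslash = strrchr(path, '\\');

    /* Pick whichever separator occurs later in the string. */
    const char *last = slash;
    if (backslash != NULL && (last == NULL || backslash > last)) {
        last = backslash;
    }

    /* Skip past the separator; with no separator the name is unchanged. */
    return last != NULL ? last + 1 : path;
}
```

Under this sketch, `../file.png` reduces to `file.png` and `C:\Users\me\photo.jpg` to `photo.jpg`. A name ending in a separator reduces to an empty string and a bare `..` passes through unchanged, so a step like this is no substitute for validating the name in the script itself.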
