
How do I tell if someone's faking a filetype? (PHP)

I'm programming something that allows users to store documents and pictures on a webserver and retrieve them later. When users upload files to my server, PHP tells me what filetype each one is based on its extension. However, I'm afraid that users could rename a zip file to somezipfile.png and upload it, thus keeping a zip file on my server. Is there any reasonable way to open an uploaded file and "check" whether it truly is of the claimed filetype?

Magic number. If you can read the first few bytes of a binary file, you can tell what kind of file it is.

Check out PHP's FileInfo PECL extension, which can do the MIME magic lookups for you.
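As a minimal sketch of the Fileinfo approach, the helper below asks the extension for a MIME type based on the file's contents rather than its name. The function name detectMimeType() is hypothetical, and the sketch assumes the Fileinfo extension is loaded.

```php
<?php
// detectMimeType() is a hypothetical helper name, not a PHP built-in.
// It assumes the Fileinfo extension is available.
function detectMimeType(string $path): ?string
{
    if (!is_file($path)) {
        return null; // nothing to inspect
    }
    // FILEINFO_MIME_TYPE asks for just the type, e.g. "image/png".
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->file($path);

    return $mime === false ? null : $mime;
}
```

In an upload handler you would probably pass the temporary path from $_FILES and compare the result against a list of types you accept, ignoring the extension the client supplied.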

Sort of. Most file types reserve some bytes for marking themselves, so you don't have to rely on the extension. The site http://wotsit.org is a great resource for finding the marker for a particular type.

If you are on a unix system, I believe the file command doesn't rely on the extension, so you could shell out to it if you don't want to write the byte-checking code yourself.
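Shelling out to file might look roughly like the sketch below. The helper name fileCommandMimeType() is hypothetical, and the code assumes a Unix-like host where file is on the PATH and exec() has not been disabled.

```php
<?php
// fileCommandMimeType() is a hypothetical helper name.
// Assumes a Unix-like system with the `file` utility installed.
function fileCommandMimeType(string $path): ?string
{
    if (!is_file($path)) {
        return null;
    }
    $output = [];
    $status = 1;
    // -b drops the filename from the output; --mime-type prints e.g. "image/png".
    // escapeshellarg() keeps odd filenames from being interpreted by the shell.
    exec('file -b --mime-type ' . escapeshellarg($path), $output, $status);

    if ($status !== 0 || $output === []) {
        return null;
    }
    return trim($output[0]);
}
```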

For PNG (http://www.w3.org/TR/PNG-Rationale.html):

The first eight bytes of a PNG file always contain the following values:

(decimal) 137 80 78 71 13 10 26 10

(hexadecimal) 89 50 4e 47 0d 0a 1a 0a

(ASCII C notation) \211 P N G \r \n \032 \n

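Many file types have "magic numbers" at the beginning of the file that identify them. You can read a few bytes from the front of the file and compare them against a list of known magic numbers.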
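A hand-rolled check along these lines could be sketched as follows. The MAGIC_NUMBERS table and the sniffType() name are hypothetical, and the table lists only a handful of well-known signatures, so treat it as a starting point rather than a complete list.

```php
<?php
// Hypothetical lookup table: MIME type => leading byte signatures.
const MAGIC_NUMBERS = [
    'image/png'       => ["\x89PNG\r\n\x1a\n"],
    'image/jpeg'      => ["\xFF\xD8\xFF"],
    'image/gif'       => ['GIF87a', 'GIF89a'],
    'application/zip' => ["PK\x03\x04"],
    'application/pdf' => ['%PDF-'],
];

// sniffType() is a hypothetical helper name.
function sniffType(string $path): ?string
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }
    // Eight bytes is enough for the longest signature in the table.
    $head = fread($handle, 8);
    fclose($handle);
    if ($head === false) {
        return null;
    }
    foreach (MAGIC_NUMBERS as $type => $signatures) {
        foreach ($signatures as $signature) {
            // Binary-safe comparison of the leading bytes.
            if (strncmp($head, $signature, strlen($signature)) === 0) {
                return $type;
            }
        }
    }
    return null; // unknown or unlisted type
}
```

With this, the somezipfile.png from the question would be reported as application/zip, because only the contents are consulted.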

If you are only dealing with images, then getimagesize() should distinguish a valid image from a fake one.

$ php -r 'var_dump(getimagesize("b&n.jpg"));'
array(7) {
  [0]=>
  int(200)
  [1]=>
  int(200)
  [2]=>
  int(2)
  [3]=>
  string(24) "width="200" height="200""
  ["bits"]=>
  int(8)
  ["channels"]=>
  int(3)
  ["mime"]=>
  string(10) "image/jpeg"
}

$ php -r 'var_dump(getimagesize("/etc/passwd"));'
bool(false)

If getimagesize() returns false, the file is not an image.
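Wrapped into a reusable check, this might look like the sketch below. The isAllowedImage() name and its default whitelist are assumptions for illustration; index 2 of the returned array is the IMAGETYPE_* constant, as in the dump above where 2 corresponds to JPEG.

```php
<?php
// isAllowedImage() is a hypothetical helper name; the default whitelist is an assumption.
function isAllowedImage(
    string $path,
    array $allowed = [IMAGETYPE_PNG, IMAGETYPE_JPEG, IMAGETYPE_GIF]
): bool {
    // @ suppresses notices for unreadable or truncated files; false means "not an image".
    $info = @getimagesize($path);

    // $info[2] holds one of the IMAGETYPE_* constants.
    return $info !== false && in_array($info[2], $allowed, true);
}
```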

On a unix system, capturing the output of the 'file' command should give you enough information.

For an exact answer on how to do this quickly in PHP, see this question: How do I find the mime-type of a file with php?

As a side note, I ran into a similar problem where I had to do my own type checking. The front-end interface to my application was done in Flash, and the files were passed through Flash to a PHP script. When I attempted a MIME type check in PHP, the type returned was always application/octet-stream because the upload came from Flash.

I had to implement a magic-numbers approach. I created an XML file that held each file type along with some defining patterns found at the beginning of such files. Once an upload reached the server, I matched it against the patterns in the XML file and then accepted or rejected it. I didn't notice any real performance decrease either, which I had been expecting.

This is just a side note for anyone who may be using Flash as their front end and trying to type-check a file once it is uploaded.

As well as identifying the filetype, you might want to watch out for files with other files embedded in or appended to them. Unfortunately, this requires a more in-depth analysis of the file contents than just checking "magic numbers".

For example, http://quantumrook.wordpress.com/2007/06/06/hide-a-rar-file-in-a-jpg-file/ (this particular type of data hiding can easily be worked around by loading the actual image data and resaving it into a new file; others will be more difficult.)
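The load-and-resave workaround could be sketched as below. This assumes the GD extension is installed; reencodeJpeg() is a hypothetical helper name, and re-encoding as JPEG is lossy, so the stored picture will differ slightly from the upload.

```php
<?php
// reencodeJpeg() is a hypothetical helper name. Assumes the GD extension is loaded.
function reencodeJpeg(string $source, string $destination, int $quality = 90): bool
{
    $data = @file_get_contents($source);
    if ($data === false) {
        return false;
    }
    // GD decodes only the pixel data; bytes appended after the image are not carried over.
    $image = @imagecreatefromstring($data);
    if ($image === false) {
        return false; // not a decodable image
    }
    // Write a fresh JPEG built purely from the decoded pixels.
    return imagejpeg($image, $destination, $quality);
}
```

Storing only the re-encoded copy means whatever was appended to the original upload never reaches permanent storage.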

