Detecting a MIME type fails in PHP

I have the following PHP code that shows the mime type of an uploaded file.

<?php

if ($_POST) {

    var_dump($_FILES);

    $finfo = new finfo(FILEINFO_MIME_TYPE);

    var_dump($finfo->file($_FILES['file']['tmp_name']));

} else{
    ?>
    <form method="POST" enctype="multipart/form-data"><input name="file" type="file"><input name="submit" value="send" type="submit"/></form>
    <?php
}

The result of uploading somefile.csv with this script is as follows.

array (size=1)
    'file' =>
    array (size=5)
        'name' => string 'somefile.csv' (length=12)
        'type' => string 'text/csv' (length=8)
        'tmp_name' => string '/tmp/phpKiwqtu' (length=14)
        'error' => int 0
        'size' => int 3561
string 'text/x-fortran' (length=14)

The MIME type should of course be text/csv, but the framework I use (Symfony 1.4) relies on the fileinfo method, which reports text/x-fortran.

I tested a little further. On Ubuntu, the command file --mime-type somefile.csv returns somefile.csv: text/x-fortran, while the command mimetype somefile.csv returns somefile.csv: text/csv. somefile.csv was created with MS Office (I don't know if this matters). Apparently mimetype uses the shared-mime-info database ( http://freedesktop.org/wiki/Software/shared-mime-info ), while file does not.

  1. Does PHP use file or mimetype or neither?
  2. Further, I am not sure what to do here; is my uploaded file wrongly formatted? Do I have to use a different mime database? Is PHP bugged? What is going on here?

edit:

It is detected as a Fortran program because somefile.csv contains only the following:

somecolumn;
C F;

I believe the above is valid CSV content, right? A field that contains a space does not have to be put inside quotes, right?

I don't have a Unix box here to inspect a real "magic" file (the signatures database used to guess mime types) but a quick Google search revealed this:

# $File: fortran,v 1.6 2009/09/19 16:28:09 christos Exp $
# FORTRAN source
0       regex/100       \^[Cc][\ \t]    FORTRAN program
!:mime  text/x-fortran

Apparently, it scans the start of the file looking for lines that begin with a single letter C (upper or lower case) followed by a space or tab, which seems to be a Fortran-style comment. Thus the false positive:

somecolumn;
C F;
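
To see why the second line trips the rule, here is a minimal sketch that approximates the magic pattern with PCRE. The function name looksLikeFortran is hypothetical, and the translation of the magic entry into /^[Cc][ \t]/m is my own assumption (the /100 limit from the magic file is ignored here), so treat it as an illustration rather than what libmagic actually executes:

```php
<?php
// Hypothetical helper: approximates the FORTRAN magic rule quoted above.
// Assumption: the rule behaves like a multi-line PCRE anchored at line start.
function looksLikeFortran(string $contents): bool
{
    // "C" or "c" at the start of any line, followed by a space or a tab.
    return preg_match('/^[Cc][ \t]/m', $contents) === 1;
}

// The CSV contents from the question: the second line starts with "C ".
var_dump(looksLikeFortran("somecolumn;\nC F;\n"));   // bool(true)

// A CSV with no line starting with "C " does not match this pattern.
var_dump(looksLikeFortran("somecolumn;\nA B;\n"));   // bool(false)
```

This only demonstrates the pattern itself; the real detection also depends on the other entries in the magic database and their ordering.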

From the PHP Mimetype introduction:

This extension has been deprecated as the PECL extension Fileinfo provides the same functionality (and more) in a much cleaner way.

The functions in this module try to guess the content type and encoding of a file by looking for certain magic byte sequences at specific positions within the file. While this is not a bullet proof approach the heuristics used do a very good job.

This extension is derived from Apache mod_mime_magic, which is itself based on the file command maintained by Ian F. Darwin. See the source code for further historic and copyright information.

From the PHP Fileinfo introduction:

The functions in this module try to guess the content type and encoding of a file by looking for certain magic byte sequences at specific positions within the file. While this is not a bullet proof approach the heuristics used do a very good job.
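
The guess can be reproduced without an upload by passing the file contents from the question straight to Fileinfo. This is only a sketch: it assumes the fileinfo extension is enabled, and the reported type depends on the magic database your PHP build uses, so it may or may not be text/x-fortran on your machine:

```php
<?php
// Sample contents taken from the question.
$contents = "somecolumn;\nC F;\n";

// Same flag as in the question; buffer() inspects a string instead of a file.
$finfo = new finfo(FILEINFO_MIME_TYPE);
$detected = $finfo->buffer($contents);

// Prints whatever the local magic database guesses for this content.
var_dump($detected);
```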

Here's a question with some answers on the same subject: Detecting MIME type in PHP.
