PHP header downloading unreadable zip file in Android

Question

My PHP script converts a PDF file into a zip file containing one image for each page of the PDF.

After adding the images to the zip, I send the zip to the browser with the headers below.

ob_start();

header('Content-Transfer-Encoding: binary');
header('Content-disposition: attachment; filename="converted.ZIP"');
header('Content-type: application/octet-stream');

ob_end_clean();

readfile($tmp_file);
unlink($tmp_file);

exit();

The download works fine on Windows, Linux and Mac. But when I request the same file from an Android device (stock browser or Chrome), an unreadable zip is downloaded. Opening it through the file explorer gives "File is either corrupt or unsupported format". I see this from Android 6 onwards (not tested on earlier versions).

I also tried moving the ob_start() and ob_end_clean() calls later in the script, but it still didn't work.

I checked many answers on Stack Overflow, but none of them worked.

What is the modification that is needed for android browsers?

<?php include 'headerHandlersCopy.php';
session_start();  
ob_start();
//echo session_id()."<br>";
?>

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1">
        <link rel="stylesheet" href="../css/handleConvertPDFtoJPG.css">
        <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css">


        <title>Compressing Image</title>
    </head>

    <body>
        <!-- Progress bar -->

        <div id="wrapper">
            <h1 id="head1">Compressing Image</h1>
            <h1 id="head2">Converting Image</h1>
            <div id="myProgress">
                <div id="myBar">10%</div>
            </div>
            <br>
        </div>
        
        <!-- end -->

        <?php 
      
            //code to display errors.
            ini_set('display_errors', 1);
            ini_set('display_startup_errors', 1);
            error_reporting(E_ALL); 
             
            if ($_SERVER['REQUEST_METHOD'] == 'POST'){

                $session_id = session_id();
                $uploadPath = "../upload/pdfUploads/"; 
                $pdfFileNameWithOutExt = basename($_FILES["pdfDoc"]["name"],"pdf");
                $dotRemovedFileNameTemp = str_replace(".", "", $pdfFileNameWithOutExt);
                $dotRemovedFileName = $session_id.$dotRemovedFileNameTemp;
                

                $imgExt = ".jpg";
                $fileNameLocationFormat = $uploadPath.$dotRemovedFileName.$imgExt;
                $fileNameLocation = $uploadPath.$dotRemovedFileName;
                $status = null;

                $imagick = new Imagick();

                # to get number of pages in the pdf to run loop below.
                # the below function generates unreadable images for each page.
                $imagick->pingImage($_FILES['pdfDoc']['tmp_name']);
                $noOfPagesInPDF = $imagick->getNumberImages();
                
                $imagick->readImage($_FILES['pdfDoc']['tmp_name']);
                $statusMsg = "test";

                # writing pdf into images.
                try {
                    $imagick->writeImages($fileNameLocationFormat, true);
                    $status = 1; 
                }
                catch(Exception $e) {
                    echo 'Message: ' .$e->getMessage();
                    $status = 0;
                }

                $files = array();

                # storing converted images into array.
                # only including the readable images into the
                $arrayEndIndex = ($noOfPagesInPDF * 2)-1;
                for ($x = $arrayEndIndex; $x >= $noOfPagesInPDF; $x--) {
                    array_push($files,"{$fileNameLocation}-{$x}.jpg" );
                }

                # create new zip object
                $zip = new ZipArchive();

                # create a temp file & open it
                $tmp_file = tempnam('.', '');
                $zip->open($tmp_file, ZipArchive::CREATE);

                # loop through each file
                foreach ($files as $file) {
                    # download file
                    $download_file = file_get_contents($file);

                    #add it to the zip
                    $zip->addFromString(basename($file), $download_file);
                }

                # close zip
                $zip->close();


                # file cleaning code
                # only those pdf files will be deleted which the current user uploaded.
                # we match the session id of the user and delete the files which contain the same session id in the file name.
                # file naming format is: session_id + destination + fileName + extension
                
                $files = glob("../upload/pdfUploads/{$session_id}*"); // get all file names
                foreach($files as $file){ // iterate files
                  if(is_file($file)) {
                    unlink($file); // delete file
                  }
                }

                // send the file to the browser as a download
                ob_end_clean();


                header('Content-Description: File Transfer');
                header('Content-type: application/octet-stream');
                header('Content-disposition: attachment; filename="geek.zip"');
                //header("Content-Length: " . filesize($tmp_file));
                header('Content-Transfer-Encoding: binary');
                header('Expires: 0');
                header('Cache-Control: must-revalidate');
                header('Pragma: public');
                flush();
                readfile($tmp_file);  
                unlink($tmp_file);      
                
                //filesize($tmp_file) causing the "error opening the file" when opening the zip even in PC browsers.
            }
        ?>

Answer 1

The issue appears to be caused by the order of operations: the HTML of the page ends up in the same response as the zip file that is sent to the client.
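One way to confirm this is to look at the first bytes of the file that Android saved. A zip archive starts with the signature "PK\x03\x04" (or "PK\x05\x06" if it is empty), so a file that starts with markup was prefixed with page output. The sketch below is only a diagnostic; describeDownload() is a hypothetical helper name, not part of the original script, and it checks the start of the file only, not anything appended after the archive. Some desktop archive tools tolerate extra bytes around the archive, which may be why the download only failed on Android, but that is an assumption.

```php
<?php
// Hypothetical helper: classify a downloaded file by its first bytes.
function describeDownload(string $path): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return 'unreadable';
    }
    $head = (string) fread($handle, 64);
    fclose($handle);

    // "PK\x03\x04" opens a local file header; "PK\x05\x06" marks an empty archive.
    if (strncmp($head, "PK\x03\x04", 4) === 0 || strncmp($head, "PK\x05\x06", 4) === 0) {
        return 'zip';
    }

    // Leading markup means page output was sent before the archive bytes.
    $trimmed = ltrim($head);
    if (stripos($trimmed, '<!doctype') === 0 || stripos($trimmed, '<html') === 0) {
        return 'html-prefixed';
    }

    return 'unknown';
}

// Two throwaway files standing in for a good and a broken download.
$good = tempnam(sys_get_temp_dir(), 'dl');
file_put_contents($good, "PK\x03\x04rest-of-archive");
$broken = tempnam(sys_get_temp_dir(), 'dl');
file_put_contents($broken, "\n<!DOCTYPE html>\n<html>PK\x03\x04rest-of-archive");

echo describeDownload($good), "\n";
echo describeDownload($broken), "\n";

unlink($good);
unlink($broken);
```

Running it against the file copied off the device shows whether the response started with the archive or with page output.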

To avoid this, I recommend using a separate script file for the POST request handler instead of including it procedurally in the view script. Alternatively, wrap the POST request processing in an if condition at the top of the script and end it with exit, so that nothing else is added to the response.

This is also part of the problem with the filesize() call: the response body, which contains the zip file plus the additional HTML, is larger than the zip file alone, so the Content-Length header no longer matches what is actually sent.
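The size mismatch can be reproduced without a browser. The sketch below is an illustration only, not the original script: it uses a 22-byte empty zip as a stand-in for the real archive and a short HTML string as a stand-in for the view markup, and it compares the number of bytes written with the value that filesize() would put into Content-Length.

```php
<?php
// Stand-in for the real archive: a valid, empty zip is exactly 22 bytes.
$zipFile = tempnam(sys_get_temp_dir(), 'zip');
file_put_contents($zipFile, "PK\x05\x06" . str_repeat("\0", 18));
clearstatcache(true, $zipFile);
$declared = filesize($zipFile); // what Content-Length would announce

// Case 1: the script keeps running after readfile(), so markup follows the archive.
ob_start();
readfile($zipFile);
echo "\n<!DOCTYPE html>\n<html></html>\n"; // stands in for the view's HTML
$withHtml = (string) ob_get_clean();

// Case 2: nothing is output after the archive, which is what exit guarantees.
ob_start();
readfile($zipFile);
$zipOnly = (string) ob_get_clean();

printf(
    "declared=%d with-html=%d zip-only=%d\n",
    $declared,
    strlen($withHtml),
    strlen($zipOnly)
);

unlink($zipFile);
```

Only the second case sends exactly the number of bytes that Content-Length announces.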

The code below was tested in Google Chrome on Windows and Android 11.

<?php
# code to display errors
// USE ERROR REPORTING TO LOG FILES INSTEAD
# ini_set('display_errors', 1);
# ini_set('display_startup_errors', 1);
# error_reporting(E_ALL);
if (!session_id()) {
    // always ensure session is not already started before starting
    session_start();
}
if ('POST' === $_SERVER['REQUEST_METHOD'] &&
    !empty($_FILES['pdfDoc']) && // ensure files were uploaded
    UPLOAD_ERR_OK === $_FILES['pdfDoc']['error'] // ensure file uploaded without errors
) {
    $session_id = session_id();
    // removed redundant variable names
    // use absolute path with __DIR__ instead of relative
    $uploadSessionPath = $uploadPath = __DIR__ . '/../upload/pdfUploads/';
    $uploadSessionPath .= $session_id; // append session path
    // ensure upload directory exists
    if (!is_dir($uploadPath) && !mkdir($uploadPath, 0777, true) && !is_dir($uploadPath)) {
        throw new \RuntimeException(sprintf('Directory "%s" was not created', $uploadPath));
    }
    $fileNameLocation = $uploadSessionPath . str_replace('.', '', basename($_FILES['pdfDoc']['name'], 'pdf'));

    # convert pdf pages into images and save as JPG in the upload session path.
    try {
        $pdfDocFile = $_FILES['pdfDoc']['tmp_name'];
        $imagick = new Imagick();
        # get number of pages in the pdf to loop over images below.
        $imagick->pingImage($pdfDocFile);
        $noOfPagesInPDF = $imagick->getNumberImages();
        $imagick->setResolution(150, 150); // greatly improve image quality
        $imagick->readImage($pdfDocFile);
        $imagick->writeImages($fileNameLocation . '.jpg', true);
    } catch (Exception $e) {
        throw $e; //handle the exception properly - don't ignore it...
    }
    // ensure there are pages to zip
    if ($noOfPagesInPDF > 0) {
        // reduced to single iteration of files to reduce redundancy
        # create a temp file & open it
        $zipFile = tempnam(__DIR__, ''); // use absolute path instead of relative
        # create new zip object
        $zip = new ZipArchive();
        $zip->open($zipFile, ZipArchive::CREATE);
        # store converted images to zip file only including the readable images
        $arrayEndIndex = ($noOfPagesInPDF * 2) - 1;
        for ($x = $arrayEndIndex; $x >= $noOfPagesInPDF; $x--) {
            $file = sprintf('%s-%d.jpg', $fileNameLocation, $x);
            clearstatcache(false, $file); // ensure stat cache is clear
            // ensure file exists and is readable
            if (is_file($file) && is_readable($file)) {
                // use ZipArchive::addFile instead of ZipArchive::addFromString(file_get_contents) to reduce overhead
                $zip->addFile($file, basename($file));
            }
        }
        $zip->close();

        # file cleaning code
        # only those pdf files will be deleted which the current user uploaded.
        # we match the session id of the user and delete the files which contains the same session id in the file name.
        # file naming format is: session_id + destination + fileName + extension
        foreach (glob("$uploadSessionPath*") as $file) {
            clearstatcache(false, $file); // ensure stat cache is clear
            // ensure a file exists and can be deleted
            if (is_file($file) && is_writable($file)) {
                unlink($file);
            }
        }

        # send the file to the browser as a download
        if (is_file($zipFile) && is_readable($zipFile)) {
            header('Content-Description: File Transfer');
            header('Content-type: application/octet-stream');
            header('Content-disposition: attachment; filename="geek.zip"');
            header('Content-Length: ' . filesize($zipFile)); // Content-Length is a best-practice to ensure client receives the expected response, if it breaks the download - something went wrong
            header('Content-Transfer-Encoding: binary');
            header('Expires: 0');
            header('Cache-Control: must-revalidate');
            header('Pragma: public');
            readfile($zipFile);
            if (is_writable($zipFile)) {
                unlink($zipFile);
            }
            exit; // stop processing
        }
        // zip file was missing or unreadable - do something else
    }

    // file was not sent as a response (e.g. no pages were found in the PDF) - do something else
}

// use absolute path __DIR__ and always require dependencies to ensure they are included
// do not know what this contains...
require_once __DIR__ . '/headerHandlersCopy.php'; 
?>

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1">
        <link rel="stylesheet" href="../css/handleConvertPDFtoJPG.css">
        <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css">


        <title>Compressing Image</title>
    </head>

    <body>
        <!-- Progress bar -->

    <div id="wrapper">
        <h1 id="head1">Compressing Image</h1>
        <h1 id="head2">Converting Image</h1>
        <div id="myProgress">
            <div id="myBar">10%</div>
        </div>
        <br>
    </div>

Android File Manager Screenshot

Lastly, as a general tip, do not use the PHP closing tag ?> to end a PHP script unless you are switching to non-PHP output such as HTML or text. Otherwise, any line breaks, spaces or other non-visible characters after the ?> are included in the response output. This often causes unexpected results with include, and it can corrupt responses such as file downloads and redirects.

PHP only response

<?php 
// ...
echo 'PHP ends automatically without closing tag';

End response with PHP

<html>
</html>
<?php 

echo 'PHP ends automatically without closing tag';

Mixed response with PHP

<html>
<?php 

echo 'Mixed PHP continues as HTML';

?>
</html>
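The effect of the closing tag can be measured directly. The sketch below is an illustration, not part of the original code: it writes two throwaway include files that differ only in how they end and captures what each one outputs. outputOfInclude() and the file contents are hypothetical names chosen for this example. PHP swallows a single newline directly after ?>, so only the extra whitespace leaks.

```php
<?php
// Two throwaway include files: one ends with "?>" plus extra whitespace, one has no closing tag.
$dir = sys_get_temp_dir();
$withTag = tempnam($dir, 'inc');
$withoutTag = tempnam($dir, 'inc');
file_put_contents($withTag, "<?php\n\$loaded = true;\n?>\n\n  \n");
file_put_contents($withoutTag, "<?php\n\$loaded = true;\n");

// Hypothetical helper: capture whatever an include writes to the response.
function outputOfInclude(string $path): string
{
    ob_start();
    include $path;
    return (string) ob_get_clean();
}

$leaked = outputOfInclude($withTag);
$clean = outputOfInclude($withoutTag);

printf(
    "with closing tag: %d byte(s), without: %d byte(s)\n",
    strlen($leaked),
    strlen($clean)
);

unlink($withTag);
unlink($withoutTag);
```

Any bytes leaked this way by an included file end up in front of the zip data when the include runs before the download is sent.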

solution1
1 ACCEPTED 2022-07-12 16:08:32