In PHP, how can I decompress on the fly a file that was compressed twice?

I have a bigfile.gz.gz file that is… big. I would like to uncompress it on the fly. Ideally, this is what I have in mind:

$in = fopen('compress.zlib://compress.zlib://bigfile.gz.gz', 'rb');
while (!feof($in))
    print fread($in, 4096);
fclose($in);

However, compress.zlib:// cannot be chained that way:

PHP Warning:  fopen(): cannot represent a stream of type ZLIB as a File Descriptor in gztest.php on line 1

So I thought I'd combine gzopen() and compress.zlib:// together:

$in = gzopen('compress.zlib://bigfile.gz.gz', 'rb');
while (!gzeof($in))
    print gzread($in, 4096);
gzclose($in);

However, this only decompresses one level of gzip.

I tried about ten other approaches without success. gzopen() does not work with php://memory once it has been written to with fwrite(), and stream_filter_append(… zlib.inflate …) cannot read gzipped files.

This is the best I could come up with, but it spawns two system processes, which has undesirable overhead:

$in = popen('zcat bigfile.gz.gz | gunzip', 'rb');
while (!feof($in))
    print fread($in, 4096);
pclose($in);

Can someone suggest something better?

It's possible to uncompress .gz files using the zlib.inflate filter. The filter expects a raw DEFLATE stream, so you need to strip out the gzip header first. To do that on the fly, you have to write a custom stream filter:

<?php

class gzip_header_filter extends php_user_filter {

    private $filtered = 0;

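    public function filter($in, $out, &$consumed, $closing): int {
        while ($bucket = stream_bucket_make_writeable($in)) {
            if ($this->filtered == 0) {
                // Fixed part of the gzip header: magic bytes, method, flags, mtime, xfl, os.
                // This assumes the whole header arrives in the first bucket.
                $header_len = 10;
                $header = substr($bucket->data, 0, 10);
                $flags = ord($header[3]);
                if ($flags & 0x08) {
                    // FNAME is set: a zero-terminated file name follows the fixed header
                    $name_end = strpos($bucket->data, "\0", 10);
                    if ($name_end === false) {
                        // the terminator is missing, so the header cannot be stripped
                        return PSFS_ERR_FATAL;
                    }
                    $header_len = $name_end + 1;
                }
                $bucket->data = substr($bucket->data, $header_len);
                $this->filtered = $header_len;
            }
            $consumed += $bucket->datalen;
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }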
}

stream_filter_register('gzip_header_filter', 'gzip_header_filter');

$in = fopen('bigfile.gz.gz', 'rb');
stream_filter_append($in, 'gzip_header_filter', STREAM_FILTER_READ);
stream_filter_append($in, 'zlib.inflate', STREAM_FILTER_READ);
stream_filter_append($in, 'gzip_header_filter', STREAM_FILTER_READ);
stream_filter_append($in, 'zlib.inflate', STREAM_FILTER_READ);

while (!feof($in))
    print fread($in, 4096);
fclose($in);

?>

Note that the code above only skips the optional file name. It doesn't handle the comment, the extra field, or the header checksum that can also be stored in a gzip header.
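
If those optional fields matter for your files, the header length can be computed from the flag bits instead of being hard-coded. The following is only a sketch, assuming the layout described in RFC 1952 (extra field, then name, then comment, then header checksum); gzip_header_length() is a hypothetical helper name, not something from the code above, and it expects the whole header to be present in the string it is given.

```php
<?php

// Hypothetical helper: returns the length in bytes of the gzip header at the
// start of $data, or null if $data does not begin with a complete gzip header.
function gzip_header_length(string $data): ?int
{
    // Magic bytes 0x1f 0x8b followed by compression method 8 (DEFLATE).
    if (strlen($data) < 10 || substr($data, 0, 3) !== "\x1f\x8b\x08") {
        return null;
    }
    $flags = ord($data[3]);
    $pos = 10; // fixed part of the header

    if ($flags & 0x04) { // FEXTRA: 2-byte little-endian length, then that many bytes
        if (strlen($data) < $pos + 2) {
            return null;
        }
        $xlen = unpack('v', substr($data, $pos, 2))[1];
        $pos += 2 + $xlen;
    }
    foreach ([0x08, 0x10] as $bit) { // FNAME, then FCOMMENT: zero-terminated strings
        if ($flags & $bit) {
            if ($pos > strlen($data)) {
                return null;
            }
            $end = strpos($data, "\0", $pos);
            if ($end === false) {
                return null;
            }
            $pos = $end + 1;
        }
    }
    if ($flags & 0x02) { // FHCRC: 2-byte checksum of the header
        $pos += 2;
    }
    return $pos <= strlen($data) ? $pos : null;
}

// Example: strip the header and the 8-byte trailer (CRC32 + size), then inflate.
$gz = gzencode('hello');
$len = gzip_header_length($gz);
echo gzinflate(substr($gz, $len, -8)), "\n";
```

Inside the filter, the result could replace the hard-coded $header_len, as long as the first bucket contains the complete header.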
