
PHP: fseek() for large files (>2GB)

I have a very large file (about 20GB). How can I use fseek() to jump around in it and read its contents?

The code looks like this:

function read_bytes($f, $offset, $length) {
    fseek($f, $offset);
    return fread($f, $length);
}

The result is only correct if $offset < 2147483647.

Update: I am running on 64-bit Windows. phpinfo() reports Architecture: x64, PHP_INT_MAX: 2147483647.
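The PHP_INT_MAX value above is the key detail: the offset passed to fseek() is a PHP integer, so a build whose integers are 32-bit cannot express an offset past 2147483647, even on a 64-bit OS (before PHP 7, 64-bit Windows builds still used 32-bit integers). A minimal sketch for checking this up front; the helper name `can_seek_to` is hypothetical and not part of PHP:

```php
<?php
// Hypothetical helper (not a PHP built-in): reports whether $offset fits into
// this build's native integer, i.e. whether a plain fseek() can express it.
// $offset may be an int, a float, or a numeric string such as '23456781232'.
function can_seek_to($offset)
{
    if (!is_numeric($offset)) {
        return false;
    }
    // Compare as floats so the check itself cannot overflow on 32-bit builds.
    $value = (float) $offset;
    return $value >= 0 && $value <= (float) PHP_INT_MAX;
}

printf("PHP_INT_SIZE = %d bytes, PHP_INT_MAX = %d\n", PHP_INT_SIZE, PHP_INT_MAX);
var_dump(can_seek_to('23456781232')); // false when PHP_INT_MAX is 2147483647
```

If the check fails for the offsets you need, the usual fix is a PHP build with 64-bit integers rather than a workaround in user code.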

WARNING: as noted in the comments, fseek() uses an int internally, so it simply can't work with files this large on 32-bit PHP builds. The following solution won't work. It is left here just for reference.

A little searching led me to the comments on the PHP manual page for fseek():

http://php.net/manual/en/function.fseek.php

The problem is the maximum int size of the offset parameter, but it seems you can work around it by making multiple fseek() calls with the SEEK_CUR option and combining them with one of the big-number libraries.

Example:

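// Requires the GMP extension; $offset is passed as a decimal string.
function fseek64($fh, $offset)
{
    fseek($fh, 0, SEEK_SET);
    $t_offset = (string) PHP_INT_MAX;
    // gmp_cmp() returns a positive value while $offset is still larger than PHP_INT_MAX
    while (gmp_cmp($offset, $t_offset) > 0)
    {
        // move forward by PHP_INT_MAX bytes and subtract that from the remainder
        $offset = gmp_sub($offset, $t_offset);
        fseek($fh, gmp_intval($t_offset), SEEK_CUR);
    }
    // seek the remaining distance, which now fits in a native int
    return fseek($fh, gmp_intval($offset), SEEK_CUR);
}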

fseek64($f, '23456781232');

For my project, I needed to READ blocks of 10KB from a BIG offset in a BIG file (>3 GB). Writes were always appends, so no offsets were needed.

This approach works irrespective of which PHP version and OS you are using.

Prerequisite: your web server must support HTTP Range requests. Apache and IIS already support this, as do most other web servers (shared hosting or otherwise).

// offset, 3GB+
$start=floatval(3355902253);

// bytes to read, 100 KB
$len=floatval(100*1024);

// set up the http byte range headers
$opts = array('http'=>array('method'=>'GET','header'=>"Range: bytes=$start-".($start+$len-1)));
$context = stream_context_create($opts);
// bytes ranges header
print_r($opts);

// change the URL below to the URL of your file. DO NOT change it to a file path.
// you MUST use a http:// URL for your file for a http request to work
// this will output the results
echo $result = file_get_contents('http://127.0.0.1/dir/mydbfile.dat', false, $context);

// status of your request
// if this is empty, the HTTP request did not fire.
print_r($http_response_header);

// Check your file URL and verify it by opening the file URL directly in a web
// browser. If the HTTP response shows an error (status code >= 400), check that you
// are sending the correct Range header. For example, if the start of the range exceeds
// the current file size, the server responds with 416 (Range Not Satisfiable).

// NOTE  - The current file size is also returned back in the http response header
// Content-Range: bytes 355902253-355903252/355904253, the last number is the file size
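The header handling above can be wrapped in two small functions, sketched below. Both names (`build_range_header`, `parse_total_size`) are hypothetical helpers, not PHP built-ins. The sketch assumes offsets stay below 2^53, so that a float holds them exactly on builds where PHP_INT_MAX is only 2147483647.

```php
<?php
// Hypothetical helper: builds the Range header for $length bytes starting at $start.
// Values are formatted with %.0f so offsets above 2147483647 are printed as plain
// integers even when PHP stores them as floats.
function build_range_header($start, $length)
{
    $first = (float) $start;
    $last  = $first + (float) $length - 1; // byte ranges are inclusive
    return sprintf('Range: bytes=%.0f-%.0f', $first, $last);
}

// Hypothetical helper: returns the total file size from a Content-Range response
// header as a decimal string, or null if no such header is present.
function parse_total_size(array $headers)
{
    foreach ($headers as $line) {
        if (preg_match('~^Content-Range:\s*bytes\s+\d+-\d+/(\d+)\s*$~i', $line, $m)) {
            return $m[1];
        }
    }
    return null;
}

echo build_range_header(3355902253, 100 * 1024), "\n";
echo parse_total_size(array(
    'HTTP/1.1 206 Partial Content',
    'Content-Range: bytes 355902253-355903252/355904253',
)), "\n";
```

The size is returned as a string on purpose: on a build with 32-bit integers, casting it to int would run into the same limit as fseek(). In the request above, `$http_response_header` would be the array passed to `parse_total_size()`.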

...

SECURITY - you must add a .htaccess rule that denies all requests for this database file except those coming from the local IP 127.0.0.1.
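Such a rule might look like the following. This is a sketch that assumes Apache 2.4 with .htaccess overrides enabled and uses the file name mydbfile.dat from the example URL above; adjust both to your setup.

```apache
# Allow only requests from the local machine to reach the data file.
<Files "mydbfile.dat">
    Require ip 127.0.0.1
</Files>
```

On Apache 2.2 the same restriction is written with `Order deny,allow`, `Deny from all`, and `Allow from 127.0.0.1` inside the `<Files>` block.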
