
Running Out Of Memory With Fread

I'm using Backblaze B2 to store files and am using their documentation code to upload via their API. However, their code uses fread to read the file, which causes issues for files larger than 100MB because it tries to load the entire file into memory. Is there a better way to do this that doesn't load the entire file into RAM?

$file_name = "file.txt";
$my_file = "<path-to-file>" . $file_name;
$handle = fopen($my_file, 'r');
$read_file = fread($handle,filesize($my_file));

$upload_url = ""; // Provided by b2_get_upload_url
$upload_auth_token = ""; // Provided by b2_get_upload_url
$bucket_id = "";  // The ID of the bucket
$content_type = "text/plain";
$sha1_of_file_data = sha1_file($my_file);

$session = curl_init($upload_url);

// Add read file as post field
curl_setopt($session, CURLOPT_POSTFIELDS, $read_file); 

// Add headers
$headers = array();
$headers[] = "Authorization: " . $upload_auth_token;
$headers[] = "X-Bz-File-Name: " . $file_name;
$headers[] = "Content-Type: " . $content_type;
$headers[] = "X-Bz-Content-Sha1: " . $sha1_of_file_data;
curl_setopt($session, CURLOPT_HTTPHEADER, $headers); 

curl_setopt($session, CURLOPT_POST, true); // HTTP POST
curl_setopt($session, CURLOPT_RETURNTRANSFER, true);  // Receive server response
$server_output = curl_exec($session); // Let's do this!
curl_close ($session); // Clean up
echo ($server_output); // Tell me about the rabbits, George!

I have tried using:

curl_setopt($session, CURLOPT_POSTFIELDS, array('file' => '@'.realpath('file.txt')));

However, I get an error response: Error reading uploaded data: SocketTimeoutException(Read timed out)

Edit: Streaming the file by passing the filename within the cURL options also doesn't seem to work.

The issue you are having is related to this:

fread($handle,filesize($my_file));

With the full filesize in there, you might as well just use file_get_contents. It's much better memory-wise to read one line at a time with fgets:

$handle = fopen($my_file, 'r');

while (!feof($handle)) {
    $line = fgets($handle); // only the current line is held in memory
}
fclose($handle);

This way you only read one line into memory, but if you need the full file contents you will still hit a bottleneck.
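If the data is binary rather than line-oriented, the same idea works with fixed-size chunks. A minimal sketch, reusing $my_file from the question; the 8KB chunk size is just an arbitrary example:

$handle = fopen($my_file, 'rb');
while (!feof($handle)) {
    $chunk = fread($handle, 8192); // only 8KB in memory per iteration
    // ...process or forward $chunk here...
}
fclose($handle);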

The only real way is to stream the upload.

I did a quick search, and it seems cURL's default behavior is to stream the file if you give it the filename:

 $post_data['file'] = 'myfile.csv';

 curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);

You can see this previous answer for more details:

Is it possible to use cURL to stream upload a file using POST? 是否可以使用cURL通过POST流式上传文件?
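For what it's worth, the @filename trick for CURLOPT_POSTFIELDS was deprecated in PHP 5.5 and removed in PHP 7, and it builds a multipart form, whereas the question's original code sends the raw file bytes as the POST body. So if the array form doesn't stream for you, here is a hedged sketch (untested against B2) that hands cURL a file handle and lets it read the body in chunks; $upload_url, $headers, and $my_file are assumed to be set up as in the question:

$handle = fopen($my_file, 'rb');

$session = curl_init($upload_url);
curl_setopt($session, CURLOPT_UPLOAD, true);           // read the request body from CURLOPT_INFILE
curl_setopt($session, CURLOPT_CUSTOMREQUEST, 'POST');  // ...but send it as a POST rather than a PUT
curl_setopt($session, CURLOPT_INFILE, $handle);        // cURL pulls chunks from this handle
curl_setopt($session, CURLOPT_INFILESIZE, filesize($my_file));
curl_setopt($session, CURLOPT_HTTPHEADER, $headers);
curl_setopt($session, CURLOPT_RETURNTRANSFER, true);

$server_output = curl_exec($session);
curl_close($session);
fclose($handle);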

So as long as you can get past the sha1_file, it looks like you can just stream the file, which should avoid the memory issues. There may be issues with the time limit, though. Also, I can't really think of a way around getting the hash if that fails.
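That said, sha1_file should already hash the file in a streaming fashion rather than loading it whole, so it shouldn't be the memory problem; if you want the chunking explicit anyway, PHP's incremental hashing API can do it. A minimal sketch:

$ctx = hash_init('sha1');
$handle = fopen($my_file, 'rb');
while (!feof($handle)) {
    hash_update($ctx, fread($handle, 8192)); // hash 8KB at a time
}
fclose($handle);
$sha1_of_file_data = hash_final($ctx);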

Just FYI, I personally never tried this; typically I just use sFTP for large file transfers. So I don't know if it has to be specifically $post_data['file']; I just copied that from the other answer.

Good luck...

UPDATE

Seeing as streaming seems to have failed (see comments):

You may want to test the streaming to make sure it works. I don't know exactly what that would involve; maybe stream a file to your own server? Also, I am not sure why it wouldn't work "as advertised", and you may have tested it already. But it never hurts to test something; never assume something works until you know for sure. It's very easy to try something new as a solution, only to miss a setting or get a path wrong, and then fall back to thinking it's all caused by the original issue.
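For example, a throwaway receiving script on your own server could just count the bytes as they arrive, without buffering the whole body. A sketch (receive_test.php is a hypothetical endpoint; point the streaming upload at its URL):

// receive_test.php (hypothetical): reads the raw request body in chunks
$in = fopen('php://input', 'rb');
$bytes = 0;
while (!feof($in)) {
    $bytes += strlen(fread($in, 8192));
}
fclose($in);
echo "received $bytes bytes"; // should match the file size if streaming worked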

I've spent a lot of time tearing things apart only to realize I had a spelling error. I'm pretty adept at programming these days, so I typically overthink the errors too. My point is: be sure it's not a simple mistake before moving on.

Assuming everything is set up right, I would try file_get_contents. I don't know if it will be any better, but it's more meant for opening whole files. It also seems more readable in the code, because then it's clear that the whole file is needed. It just seems more semantically correct, if nothing else.

You can also increase the RAM PHP has access to by using:

ini_set('memory_limit', '512M');

You can even go higher than that, depending on your server. The highest I went before was 3G, but the server I use has 54GB of RAM, and that was a one-time thing (we migrated 130 million rows from MySQL to MongoDB, and the InnoDB index was eating up 30+GB). Typically I run with 512M and have some scripts that routinely need 1G. But I wouldn't just up the memory willy-nilly; that is usually a last resort for me after optimizing and testing. We do a lot of heavy processing, which is why we have such a big server; we also have 2 slave servers (among other things) that run with 16GB each.

As far as what size to use: typically I increase it by 128M until it works, then add an extra 128M just to be sure, but you might want to go in smaller steps. Typically people always use multiples of 8, but I don't know if that makes much difference these days.
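One way to take the guesswork out of sizing it, assuming you can do a test run: log the script's peak usage and set memory_limit from real numbers. A sketch:

// Report peak memory once the script finishes
register_shutdown_function(function () {
    error_log('Peak memory: ' . round(memory_get_peak_usage(true) / 1048576) . ' MB');
});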

Again, good luck.
