PHP & do/while loop memory leaking

I have a do/while loop that goes over database rows. Because it runs for many days at a time, processing hundreds of thousands of rows, memory consumption has to be kept in check or the script will crash. Right now every iteration adds about 4 KB to the script's memory usage. I'm using memory_get_usage() to monitor the usage.

I unset every variable used in the loop as the first thing in each iteration, so I really don't know what else I could do. My guess is that do/while gathers some data with each iteration and that this is what consumes the 4 KB of memory. I know 4 KB doesn't sound like much, but it soon adds up over hundreds of thousands of iterations.

Can somebody suggest another way of going through a large number of database rows, or a way to eliminate this "memory leak"?

Edit: Here's the UPDATED loop code. Above it are just a few require_once() calls.

$URLs = new URLs_url(db());
$c = new Curl;
$c->headers = 1;
$c->timeout = 60;
$c->getinfo = true;
$c->follow = 0;
$c->save_cookies = false;

do {
    // Get url that hasn't been checked for a week
    $urls = null;

    // Check week old
    $urls = $URLs->all($where)->limit(10);

    foreach($urls as $url) {
        #echo date("d/m/Y h:i").' | Checking '.$url->url.' | db http_code: '.$url->http_code;

        // Get http code    
        $c->url = $url->url;
        $data = $c->get();

        #echo ' - new http_code: '.$data['http_code'];

        // Save info
        $url->http_code = $data['http_code'];
        $url->lastchecked = time();
        $URLs->save($url);
        $url = null;
        #unset($c);
        $data = null;
        #echo "\n".memory_get_usage().' | ';
        echo "\nInner loop memory usage: ".memory_get_usage();
    }
    echo "\nOuter loop memory usage: ".memory_get_usage();

} while($urls);

Some logs showing how memory consumption behaves in both loops:

Inner loop memory usage: 611080
Inner loop memory usage: 612452
Inner loop memory usage: 613788
Inner loop memory usage: 615124
Inner loop memory usage: 616460
Inner loop memory usage: 617796
Inner loop memory usage: 619132
Inner loop memory usage: 620500
Inner loop memory usage: 621836
Inner loop memory usage: 623172
Outer loop memory usage: 545240
Inner loop memory usage: 630680
Inner loop memory usage: 632016
Inner loop memory usage: 633352
Inner loop memory usage: 634688
Inner loop memory usage: 636088
Inner loop memory usage: 637424
Inner loop memory usage: 638760
Inner loop memory usage: 640096
Inner loop memory usage: 641432
Inner loop memory usage: 642768
Outer loop memory usage: 556392
Inner loop memory usage: 640416
Inner loop memory usage: 641752
Inner loop memory usage: 643088
Inner loop memory usage: 644424
Inner loop memory usage: 645760
Inner loop memory usage: 647096
Inner loop memory usage: 648432
Inner loop memory usage: 649768
Inner loop memory usage: 651104
Inner loop memory usage: 652568
Outer loop memory usage: 567608
Inner loop memory usage: 645924
Inner loop memory usage: 647260
Inner loop memory usage: 648596
Inner loop memory usage: 649932
Inner loop memory usage: 651268
Inner loop memory usage: 652604
Inner loop memory usage: 653940
Inner loop memory usage: 655276
Inner loop memory usage: 656624
Inner loop memory usage: 657960
Outer loop memory usage: 578732
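
As a reading aid for the log above, here is a small sketch that computes how much memory is retained from one outer-loop reading to the next. The four values are copied from the log; dividing by 10 assumes that every batch really held the 10 rows requested by limit(10).

```php
<?php
// "Outer loop memory usage" readings copied from the log above.
$outer = [545240, 556392, 567608, 578732];

// Growth between consecutive outer-loop readings.
$deltas = [];
for ($i = 1; $i < count($outer); $i++) {
    $deltas[] = $outer[$i] - $outer[$i - 1];
}

foreach ($deltas as $delta) {
    // Assumption: 10 rows per batch, as requested by limit(10).
    printf("batch growth: %d bytes (~%d bytes per row)\n", $delta, intdiv($delta, 10));
}
```

For this particular log the retained growth comes to 11152, 11216 and 11124 bytes per batch, so roughly 1.1 KB per row survives each pass of the outer loop.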

This bit should probably happen only once, before the loop:

$c = new Curl;
$c->headers = 1;
$c->timeout = 60;
...
$c->getinfo = true;
$c->follow = 0;
$c->save_cookies = false;

Edit: Oh, the entire thing is wrapped in a do/while loop. /facepalm

Edit 2: There's also this important bit:

unset($class_object) does not release resources allocated by the object. If used in loops, which create and destroy objects, that might easily lead to a resource problem. Explicitly call the destructor to circumvent the problem.

http://www.php.net/manual/en/function.unset.php#98692
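
A minimal sketch of that idea, assuming a hypothetical ResourceHolder class (it is not the Curl class from the question, whose internals are not shown). Instead of relying on when the destructor runs, it exposes a close() method that the loop calls before dropping the reference. A php://memory stream stands in for whatever resource the real object would hold.

```php
<?php
// Hypothetical wrapper that owns a resource; used only for illustration.
class ResourceHolder
{
    private $handle;

    public function __construct()
    {
        // php://memory stands in for a socket, cURL handle, file, etc.
        $this->handle = fopen('php://memory', 'w+');
    }

    // Explicit release, so the caller does not depend on destructor timing.
    public function close(): void
    {
        if (is_resource($this->handle)) {
            fclose($this->handle);
        }
        $this->handle = null;
    }

    public function isOpen(): bool
    {
        return is_resource($this->handle);
    }

    public function __destruct()
    {
        // Safety net in case close() was never called.
        $this->close();
    }
}

for ($i = 0; $i < 3; $i++) {
    $holder = new ResourceHolder();
    // ... use $holder ...
    $holder->close();   // release the resource explicitly
    unset($holder);     // then drop the reference
}
```

Calling a dedicated close() method is usually easier to read than invoking __destruct() by hand, and the destructor still acts as a fallback.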

Edit 3:

What is this? Can't this be moved outside of the loop somehow?

$URLs = new URLs_url(db());

Edit 4:

Try removing these lines, for now.

    $url->http_code = $data['http_code'];
    $url->lastchecked = time();
    $URLs->save($url);

I think your core problem is that you're only clearing things in the outer loop.

$c = new Curl, for instance, is going to allocate memory on the heap for each iteration of the inner loop, but you're only unsetting the last instance. I'd unset anything you can ($c, $data) at the end of the inner loop.

The problem is probably

$c = new Curl

Is it possible to instantiate Curl once outside the loop and then keep reusing the same instance inside it? You could reset all fields to null in the loop if you wanted.
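
A rough sketch of the reuse pattern. FakeClient is a hypothetical stand-in for the question's Curl class (its real API is not shown in the thread), and its get() method returns a canned result instead of touching the network. The point is only that the object is constructed once and its per-request state is reset on every iteration.

```php
<?php
// Hypothetical stand-in for the Curl class; not the real implementation.
class FakeClient
{
    public static $instances = 0;   // counts how many objects were created
    public $url = null;
    public $timeout = 60;

    public function __construct()
    {
        self::$instances++;
    }

    // Clear per-request state so one object can serve many requests.
    public function reset(): void
    {
        $this->url = null;
    }

    public function get(): array
    {
        // No network access: return a canned result for illustration.
        return ['url' => $this->url, 'http_code' => 200];
    }
}

$client = new FakeClient();   // created once, before the loop
$urls = ['http://example.com/a', 'http://example.com/b', 'http://example.com/c'];
$codes = [];

foreach ($urls as $url) {
    $client->reset();
    $client->url = $url;
    $data = $client->get();
    $codes[] = $data['http_code'];
}

echo FakeClient::$instances, " instance(s) created\n";
```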

I had a similar problem. Unset didn't work; it turned out the garbage collection was rubbish. When I reused objects, it was fine (well, it broke for different reasons, so I ended up reimplementing in Java).
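
One case where unset() alone does not free memory is a reference cycle: two objects that point at each other keep a non-zero reference count even after the variables are gone. The sketch below assumes PHP 5.3 or later, where the cycle collector and gc_collect_cycles() are available. The Node class is hypothetical, and whether cycles are actually involved in the question's code is not known.

```php
<?php
// Hypothetical class used only to build a reference cycle.
class Node
{
    public $other;
}

gc_enable();

for ($i = 0; $i < 1000; $i++) {
    $a = new Node();
    $b = new Node();
    $a->other = $b;
    $b->other = $a;    // $a and $b now reference each other
    unset($a, $b);     // variables are gone, but the cycle keeps both objects alive
}

// Ask the cycle collector to reclaim the unreachable objects now.
$collected = gc_collect_cycles();
echo "collected: ", $collected, "\n";
```

In a long-running loop, calling gc_collect_cycles() every so often and comparing memory_get_usage() before and after is one way to check whether cycles are part of the growth.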

This may or may not help you, but way back in 2000 I had a client who had really slow internet and wanted to do all his website CMS updates locally and push them to live when done. Back then, on IIS on Win XP, I could not find a way of increasing the script timeout beyond 60 seconds, and an update would generally need a good 2 minutes, so it would obviously time out.

To solve this, I had the script update a set number of rows that was guaranteed to execute safely in under a minute, then call itself with a parameter saying where to continue from, and so on until all rows were updated. Maybe you could try something similar for your situation?

Maybe run it for a set amount of time before it calls itself or, in your case, check memory and redirect when usage gets too high?

I used something like this:

Top of script:

$started = microtime(true);

Then this in your loop:

if((microtime(true)-$started) > ($seconds_to_redirect)) {
    //call script with parameter
}
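
A slightly fuller sketch of the same check, combining the time limit with a memory limit. The function names budgetExceeded() and processChunk() and their parameters are made up for illustration. How the script then "calls itself" (a redirect, a cron entry, a shell wrapper) is left to the caller, which only needs the returned offset.

```php
<?php
// True once the run has used up its time or memory budget.
function budgetExceeded(float $started, float $maxSeconds, int $maxBytes): bool
{
    return (microtime(true) - $started) > $maxSeconds
        || memory_get_usage() > $maxBytes;
}

// Processes rows starting at $offset. Returns the offset to resume from,
// or null when every row has been handled.
function processChunk(array $rows, int $offset, float $maxSeconds, int $maxBytes): ?int
{
    $started = microtime(true);
    $total = count($rows);

    for ($i = $offset; $i < $total; $i++) {
        // ... handle $rows[$i] here ...

        if (budgetExceeded($started, $maxSeconds, $maxBytes)) {
            return $i + 1;   // next unprocessed row
        }
    }

    return null;
}

// Example: 30 seconds and 64 MB per run, starting from the first row.
$next = processChunk(range(1, 5), 0, 30.0, 64 * 1024 * 1024);
if ($next !== null) {
    // Call the script again with $next as the "continue from" parameter.
    echo "resume from offset ", $next, "\n";
}
```

Because the next run is a fresh PHP process, it starts from a clean memory state regardless of what leaked in the previous run.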

This is all I can think of.
