PHP cURL loop leaking memory

The following code runs in a loop, and each iteration requests a new URL. My problem is that each pass uses more and more memory.

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://site.ru/');
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true); // boolean option, not a URL
curl_setopt($ch, CURLOPT_HEADER, false);

$html = new \DOMDocument();
$html->loadHTML(curl_exec($ch));

curl_close($ch);
$ch = null;

$xpath = new \DOMXPath($html);
$html = null;

foreach ($xpath->query('//*[@id="tree"]/li[position() > 5]') as $category) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $xpath->query('./a', $category)->item(0)->nodeValue);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true); // boolean option, not a URL
    curl_setopt($ch, CURLOPT_HEADER, false);

    $html = new \DOMDocument();
    $html->loadHTML(curl_exec($ch));

    curl_close($ch);
    $ch = null;

    // etc.
}

Memory usage reaches about 2000 MB, and the script runs for roughly 2 hours. The PHP version is 5.4.4. How can I avoid the memory leak? Thanks!

Reports online indicate that curl_setopt($ch, CURLOPT_RETURNTRANSFER, true) is broken in some PHP/cURL versions.

You can find similar reports for DOM.

Create a minimal test case that isolates the cause of the leak, i.e. remove the unrelated extension (DOM or cURL) from the code.

Then try to reproduce it with the latest PHP version. If it still leaks, file a bug report; otherwise, use that PHP version.
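A minimal sketch of such a test case, assuming the DOM half is the suspect: it parses a fixed HTML string repeatedly, with no cURL involved, and compares memory_get_usage() after the first and last pass. The function name measureDomLoop and the sample markup are hypothetical, not part of the original code.

```php
<?php
// Hypothetical helper: run only the DOM half of the loop and sample memory use.
function measureDomLoop($html, $iterations)
{
    $first = null;
    $matches = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $doc = new DOMDocument();
        // @ hides warnings about imperfect markup in this sketch.
        @$doc->loadHTML($html);
        $xpath = new DOMXPath($doc);
        $matches = $xpath->query('//li')->length;
        // Drop the references so the objects can be freed before the next pass.
        unset($xpath, $doc);
        if ($i === 0) {
            $first = memory_get_usage();
        }
    }
    return array(
        'first'   => $first,
        'last'    => memory_get_usage(),
        'matches' => $matches,
    );
}

// Assumed sample markup standing in for a downloaded page.
$html = '<html><body><ul id="tree"><li><a>a</a></li><li><a>b</a></li></ul></body></html>';
$result = measureDomLoop($html, 200);
echo 'first: ', $result['first'], ' last: ', $result['last'], PHP_EOL;
```

If 'last' keeps climbing well above 'first' as the iteration count grows, the DOM half is the likely culprit. Replacing the loop body with a bare curl_exec() call on a reused handle would test the cURL half in the same way.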

Reuse the same curl handle instead of creating and destroying it each time in your loop.

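$ch = curl_init();
// Options that stay the same for every request are set once.
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// CURLOPT_AUTOREFERER is a boolean option, not a URL.
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
// $pages is assumed to hold the list of URLs to fetch.
foreach ($pages as $url) {
    // Only the URL changes between iterations.
    curl_setopt($ch, CURLOPT_URL, $url);
    $response = curl_exec($ch);
}
curl_close($ch);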
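The reused handle can be combined with the DOM parsing from the question. The following is only a sketch under assumptions: the function names extractLinkTexts and fetchAll are hypothetical, and the '//a' query stands in for whatever the real loop extracts. The idea is that each page is parsed inside a function that returns plain strings, so the DOMDocument and DOMXPath objects go out of scope after every page.

```php
<?php
// Hypothetical helper: parse one page and return plain strings only,
// so the DOMDocument and DOMXPath objects go out of scope on return.
function extractLinkTexts($html)
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // @ hides markup warnings in this sketch
    $xpath = new DOMXPath($doc);
    $texts = array();
    foreach ($xpath->query('//a') as $a) {
        $texts[] = trim($a->nodeValue);
    }
    return $texts;
}

// Hypothetical driver: one cURL handle is shared by every URL in $urls.
function fetchAll(array $urls)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, false);
    $results = array();
    foreach ($urls as $url) {
        curl_setopt($ch, CURLOPT_URL, $url);
        $body = curl_exec($ch);
        if ($body === false || $body === '') {
            continue; // skip failed or empty transfers instead of parsing them
        }
        $results[$url] = extractLinkTexts($body);
    }
    curl_close($ch);
    return $results;
}
```

A call such as fetchAll(array('http://site.ru/')) would return the link texts keyed by URL. This does not work around a leak inside the extensions themselves; it only ensures the script holds no DOM references between pages.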
