
Reduce the load PHP cURL puts on the server

I'm currently using PHP cURL, with cookies, to browse over 500 web pages a day.

I have to check each page to ensure that the account is still logged in and the pages are being viewed as a member, not a guest.

The script takes an hour or two to complete as it sleeps in between views.

I just want to know if there's anything I can do to reduce the load this script puts on the local server. I've made sure to clear variables at the end of each loop, but is there anything I'm missing that would help?

Are there any newer cURL settings that would help?

$i = 0;
$useragents = array();

foreach($urls as $url){

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_COOKIEJAR, str_replace('\\','/',dirname(__FILE__)).'/cookies.txt');
    curl_setopt($ch, CURLOPT_COOKIEFILE, str_replace('\\','/',dirname(__FILE__)).'/cookies.txt');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_USERAGENT, $useragents[array_rand($useragents)]);

    $html = curl_exec($ch);
    curl_close($ch);

    if(!$html)
        die("No HTML - Not logged in");

    if($i %10 != 0)
        sleep(rand(5,20));
    else
        sleep(rand(rand(60,180), rand(300,660)));

    $i++;

    $html = '';
}

You could reuse your cURL handle instead of creating a new one for each request. A reused handle also lets libcurl keep the connection open between requests to the same host.

Clearing $html at the end of each iteration won't reduce memory usage. It only adds an extra operation, because $html is overwritten by the next curl_exec() call anyway.

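$i = 0;
$useragents = array(); // fill with user-agent strings; array_rand() needs a non-empty array
// Build the cookie path once instead of repeating the expression.
$cookieFile = str_replace('\\', '/', dirname(__FILE__)) . '/cookies.txt';

// Create one handle and set the options shared by every request.
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);

foreach($urls as $url){

    // Only the URL and the user agent change per request.
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $useragents[array_rand($useragents)]);
    $html = curl_exec($ch);

    if(!$html)
        die("No HTML - Not logged in");

    // Short pause normally; a longer pause on every tenth page, starting with the first.
    if($i++ % 10 != 0)
        sleep(rand(5,20));
    else
        sleep(rand(rand(60,180), rand(300,660)));
}

// Close the handle once, after all pages have been fetched.
curl_close($ch);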
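
A note on the login check: `if(!$html)` only catches a failed or empty response. A guest page still returns HTML, so it would pass. Below is a minimal sketch of a stricter check, assuming member pages contain some text that guests never see. The helper name `looks_logged_in` and the marker value are hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper: decides whether a fetched page looks like a member view.
// $marker is an assumption, e.g. the text of a logout link that only members see.
function looks_logged_in($html, string $marker): bool
{
    // With CURLOPT_RETURNTRANSFER set, curl_exec() returns false on failure,
    // so anything that is not a non-empty string counts as "not logged in".
    if (!is_string($html) || $html === '') {
        return false;
    }

    // stripos() is case-insensitive and returns false when the marker is absent.
    return stripos($html, $marker) !== false;
}
```

Inside the loop, the existing test could then become something like `if(!looks_logged_in($html, 'Log out'))`, with the marker adjusted to whatever the target site actually shows to members.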
