simple_html_dom.php memory problems

I am trying to write a crawler using simple_html_dom.php version 1.5, but it seems to leak memory for reasons unknown. I tried 1.5 because they claim to have fixed the memory leaks. Any help will be appreciated. After 40 repetitions of the loop I get the following message:

   Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 8388608 bytes) in C:\work\simple_html_dom.php on line 1078
<?php
/**
 * ******************TESTING*************************
 */

include("simple_html_dom.php");

$beginning = 0;
$end = 35;
$FileName = "c:/results.txt";
$FileHandle = fopen($FileName, 'w') or die("can't open file");

for ($i = $beginning; $i < $end; $i++) {
    $url = sprintf('http://imgur.com/gallery/hot/day/page/%d?scrolled', $i);

    $html = file_get_html($url);

    echo "Day: -".$i."\n";

    foreach ($html->find('div[class=posts]') as $element) {
        foreach ($element->find('img') as $el) {
            $urls = $el->src;
            $urls1 = str_replace('b.jpg', '.jpg', $el->src);
            $urls2 = str_replace('.jpg', '', str_replace('.com/', '.com/gallery/', str_replace('http://i.', 'http://', str_replace('b.jpg', '.jpg', $el->src))));

            $title = str_replace('&quot;', '"', str_replace('&#039;', "'", stristr($el->title, '<p>', true)));
            $fil = $urls2.'             '.$urls.'             '.$urls1.'             '.$title."\n";
            fwrite($FileHandle, $fil);
        }
    }

    $html->clear;
    unset($html);
}

fclose($FileHandle);
?>
$html->clear;

If this is your actual code, you may want to change it to a method call: $html->clear();

If that's not the issue, try downgrading to version 1.11; clear() worked pretty well there.
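To illustrate why the missing parentheses matter, here is a minimal sketch using a hypothetical stand-in class (FakeDom is not the real simple_html_dom class, and its magic getter is an assumption made to model a silent property read). Without parentheses, PHP reads a property named clear and never runs the method:

```php
<?php
// Hypothetical stand-in for a parsed document; not the real simple_html_dom class.
class FakeDom
{
    public $nodes = array('a', 'b', 'c');

    // Models a magic getter that quietly returns null for unknown names (assumption).
    public function __get($name)
    {
        return null;
    }

    // Releases the internal references, as a clear() method is expected to do.
    public function clear()
    {
        $this->nodes = array();
    }
}

$dom = new FakeDom();

$dom->clear;                        // property read: clear() is NOT called
$afterPropertyRead = count($dom->nodes);

$dom->clear();                      // method call: the nodes are released
$afterMethodCall = count($dom->nodes);

echo $afterPropertyRead, " ", $afterMethodCall, "\n";
```

In this sketch the node count stays at 3 after the property read and drops to 0 only after the real method call, which is why the loop in the question would keep every parsed page in memory.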

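You could also increase the memory limit with

ini_set("memory_limit","LIMIT");

The value has to be higher than your current limit (268435456 bytes, i.e. 256M, according to the error message), for example

ini_set("memory_limit","512M");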
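Raising the limit only postpones the error if memory really is leaking, so it may help to log usage per iteration. The following is a rough sketch: processPage() is a hypothetical placeholder for the crawl step, and 512M is an arbitrary value chosen to sit above the 256M reported in the error message.

```php
<?php
// Raise the limit above the 256M reported in the error (512M is an arbitrary choice).
ini_set('memory_limit', '512M');
$limit = ini_get('memory_limit');

// Hypothetical placeholder for one crawl iteration; it only builds a throwaway array.
function processPage($page)
{
    $rows = range(1, 1000);
    return count($rows) + $page;
}

$samples = array();
for ($i = 0; $i < 5; $i++) {
    processPage($i);
    // Record memory in use after each iteration; steady growth suggests a leak.
    $samples[] = memory_get_usage();
    echo "page $i: " . end($samples) . " bytes\n";
}

echo "limit: $limit, peak: " . memory_get_peak_usage() . " bytes\n";
```

If the per-page figures keep climbing in the real crawler even after each page is cleared, the limit is not the underlying problem.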

By the way, check out: PHP Simple HTML Dom Memory Issue
