
simple_html_dom.php

I am using simple_html_dom.php to scrape data from Wikipedia. When I run the code on scraperwiki.com, it fails with exit status 139; when I run the same code on my XAMPP server, the server hangs.

  1. I have a set of links.
  2. I am trying to get the Literacy value from each page.
  3. If I run the code with a single link, there is no problem and it returns the expected result.
  4. If I try to get the data from all the pages in one go, I hit the problem described above.

The code is:

<?php
  $test = array(
   0 => "http://en.wikipedia.org/wiki/Andhra_Pradesh",
   1 => "http://en.wikipedia.org/wiki/Arunachal_Pradesh",
   2 => "http://en.wikipedia.org/wiki/Assam",
   3 => "http://en.wikipedia.org/wiki/Bihar",
   4 => "http://en.wikipedia.org/wiki/Chhattisgarh",
   5 => "http://en.wikipedia.org/wiki/Goa"
  );

  for ($ix = 0; $ix < count($test); $ix++) {
    $content = file_get_html($test[$ix]);
    $tables = $content->find('#mw-content-text table', 0);
    foreach ($tables->children() as $child1) {
      foreach ($child1->find('th a') as $ele) {
        if ($ele->innertext == "Literacy") {
          foreach ($child1->find('td') as $ele1) {
            echo $ele1->innertext;
          }
        }
      }
    }
  }
?>

Please guide me on where I am going wrong. Is this a memory problem? Is there some XAMPP configuration I need to change?

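Call clear() on each parsed document before loading the next one, so that the memory it holds is released inside the loop:

<?php
  require 'simple_html_dom.php';
  $test = array(
   0 => "http://en.wikipedia.org/wiki/Andhra_Pradesh",
   1 => "http://en.wikipedia.org/wiki/Arunachal_Pradesh",
   2 => "http://en.wikipedia.org/wiki/Assam",
   3 => "http://en.wikipedia.org/wiki/Bihar",
   4 => "http://en.wikipedia.org/wiki/Chhattisgarh",
   5 => "http://en.wikipedia.org/wiki/Goa");

  // Use < rather than <= so the loop stops at the last element.
  for ($ix = 0; $ix < count($test); $ix++) {
    $content = file_get_html($test[$ix]);
    if (!$content) {
      continue; // the page could not be loaded, so skip it
    }
    $tables = $content->find('#mw-content-text table', 0);
    if ($tables) {
      foreach ($tables->children() as $child1) {
        foreach ($child1->find('th a') as $ele) {
          if ($ele->innertext == "Literacy") {
            foreach ($child1->find('td') as $ele1) {
              echo $ele1->innertext;
            }
          }
        }
      }
    }
    // Release the memory held by this document before the next request.
    $content->clear();
    unset($content);
  }
?>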

However, that is still a lot of URLs to fetch in a single request. You may get a fatal "Maximum execution time exceeded" error, or the browser may report error 324 (an empty response from the server).
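One possible way around the time limit is to restart PHP's execution timer for each URL and to log memory use as the loop runs. The following is only a sketch, assuming PHP 7 or later: process_urls() and the $fetch callback are hypothetical names, not part of simple_html_dom. In the real script the callback would wrap the file_get_html(), find() and clear() steps from the code above. set_time_limit(), ini_get() and memory_get_usage() are standard PHP functions.

```php
<?php
// Hypothetical helper: runs $fetch on each URL and restarts PHP's
// execution timer before every request, so the limit applies per URL
// instead of to the whole batch.
function process_urls(array $urls, callable $fetch, int $secondsPerUrl = 30): array
{
    $results = array();
    foreach ($urls as $url) {
        // set_time_limit() restarts the timeout counter from zero.
        set_time_limit($secondsPerUrl);

        $results[$url] = $fetch($url);

        // A leak shows up in the log as a number that keeps growing.
        error_log(sprintf('%s: %d bytes in use', $url, memory_get_usage()));
    }
    return $results;
}

// Print the current limits; these come from php.ini (for example the one used by XAMPP).
echo 'max_execution_time = ', ini_get('max_execution_time'), "\n";
echo 'memory_limit = ', ini_get('memory_limit'), "\n";

// Stand-in callback so the sketch runs without network access.
// It only returns the length of the URL string.
$demo = process_urls(
    array('http://en.wikipedia.org/wiki/Goa'),
    function ($url) {
        return strlen($url);
    }
);
```

If the logged memory figure stays roughly flat from one URL to the next, the clear() call is doing its job. If it keeps climbing, the script is likely to reach memory_limit before it reaches the time limit.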


 