Crawling with PHP and XPATH

I'm crawling a page because I want to show its data on our website. I have a problem getting the link for each team, though: I get the team name, but I can't get the href attribute.

My code looks like this:

$elements = $xpath->query("//table/tr[contains(@class,'sr')]/td[contains(@class,'c')]");

$count = 0;
foreach ($elements as $elt) {
  if($count == 0)
  {
    $stringInsert = utf8_decode($elt->textContent);
  }
  else if($count == 1)
  {
             // tries to echo the href here, but doesn't get it.
             echo $elt->getAttribute('href');

     $stringInsert .= ", '".trim(utf8_decode($elt->textContent))."'";
  }
  else if($count == 3)
  {
     $stringInsert .= ", ".utf8_decode($elt->textContent);
  }
  else if($count == 4)
  {
     $stringInsert .= ", ".utf8_decode($elt->textContent);
  }
  else if($count == 5)
  {
     $stringInsert .= ", ".utf8_decode($elt->textContent);
  }
  else if($count == 6)
  {
     $stringInsert .= ", ".utf8_decode($elt->textContent);
  }
  else if($count == 7)
  {
     $stringInsert .= ", ".utf8_decode($elt->textContent);
  }
  else if($count == 9)
  {
     $stringInsert .= ", ".utf8_decode($elt->textContent);
  }
  else if($count == 10)
  {
     $stringInsert .= ", ".utf8_decode($elt->textContent);
  }

       $count++;

   if($count == 12)
   {
       echo $stringInsert;
       $count = 0;
   }

  }

As you can see in the code, I try to echo $elt->getAttribute('href') when $count == 1, but it does not show anything.

I have tried adding /a to the XPath expression, but then it only returns the team name and not the other columns such as score, points, etc.

You seem to be querying for the td elements, which don't have an href attribute. The href belongs to the a element nested inside the td.
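To illustrate the point, here is a minimal sketch that compares reading href from the td with reading it from the a inside that td. The sample markup, the /team/42 URL and the variable names $tdHref and $aHref are made up for the illustration; the real page may be structured differently.

```php
<?php
// Hypothetical sample markup; the real page's structure may differ.
$html = '<table><tr class="sr"><td class="c">1</td>'
      . '<td class="c"><a href="/team/42">Team A</a></td></tr></table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Same expression as in the question; item(1) is the second <td>,
// the one holding the team name and its link.
$td = $xpath->query("//table/tr[contains(@class,'sr')]/td[contains(@class,'c')]")->item(1);

// The <td> itself carries no href, so getAttribute() returns an empty string.
$tdHref = $td->getAttribute('href');

// Query relative to the <td> for the nested <a>, then read its href.
$a = $xpath->query('.//a', $td)->item(0);
$aHref = $a !== null ? $a->getAttribute('href') : '';

echo "td href: '$tdHref'\n";
echo "a href:  '$aHref'\n";
```

Passing the td as the second argument to query() makes the expression relative to that node, so the outer query can keep selecting every td while the link is looked up only where it is needed.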

Maybe this example is helpful:

//array to store the results
$res = array();

//loop over all <tr> elements of the table.srPoolPosition
//($path is the DOMXPath instance, called $xpath in the question)
foreach ($path->query("//table[contains(@class,'srPoolPosition')]/tr") as $row) {

    //new array to store results in each row
    $rowRes = array();

    //get the <td> elements in current <tr>
    $fields = $path->query('td', $row);
    //skip rows with fewer than 12 fields
    if ($fields->length < 12) {
        continue;
    }
    //loop over those
    foreach ($fields as $field) {
        //store the textcontent in the current rows array
        $rowRes[] = utf8_decode($field->textContent);
    }

    //query for the first link anywhere inside the current row
    //(a plain "a" would only match direct children of the <tr>)
    $linkNode = $path->query('.//a', $row)->item(0);
    //store the link under its own key so it does not depend on the field count
    $rowRes['href'] = $linkNode ? $linkNode->getAttribute('href') : '';

    //then add it to the results
    $res[] = $rowRes;
}

//example loop over the results
foreach ($res as $tableRow) {
    echo sprintf(
         '<a href="%s">%s</a>: %s - %s<br>',
          $tableRow['href'],  //link href
          $tableRow[1],       //name
          $tableRow[7],       //score 1
          $tableRow[9]        //score 2
    );
}
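The example above assumes a DOMXPath object already exists. A self-contained sketch of the whole flow, including that setup, could look like the following. The three-column sample table and its /team/... URLs are invented for the demonstration (the real table has at least 12 cells per row), and the 'cells' and 'href' keys are just one possible way to shape the result.

```php
<?php
// Hypothetical sample; the real table has at least 12 cells per row.
$html = '<table class="srPoolPosition">'
      . '<tr><td>1</td><td><a href="/team/1">Team A</a></td><td>12</td></tr>'
      . '<tr><td>2</td><td><a href="/team/2">Team B</a></td><td>9</td></tr>'
      . '</table>';

// Collect parser warnings internally; real-world HTML is often not quite valid.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$path = new DOMXPath($doc);

$res = array();
foreach ($path->query("//table[contains(@class,'srPoolPosition')]/tr") as $row) {
    // Text of every <td> in this row, in document order.
    $cells = array();
    foreach ($path->query('td', $row) as $field) {
        $cells[] = trim($field->textContent);
    }

    // First <a> anywhere below the row; may be missing in header rows.
    $linkNode = $path->query('.//a', $row)->item(0);

    $res[] = array(
        'cells' => $cells,
        'href'  => $linkNode !== null ? $linkNode->getAttribute('href') : '',
    );
}

// Print team name and link for each row.
foreach ($res as $tableRow) {
    printf("%s -> %s\n", $tableRow['cells'][1], $tableRow['href']);
}
```

Keeping the link under a named key instead of appending it to the numeric cell list means the output loop does not need to know how many cells a row has.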

