
Get H2 text and href values from inside all H2 tags on the page using xpath?

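I know nothing about xpath or the DOM. Zero.

In the end, I need the href value and the span contents from each of the 12 H2 tags on the page. I have worked out how to get each item individually, but no matter how much I read, grabbing them all in one go just isn't clicking. A little help?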

<h2 class="make-it-pretty">
    <a class="more-pretty" href="some-file-somewhere">
        <span class="another-class">Product Name</span>
    </a>
</h2>

Here is what I am using to get them individually.

    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $htext = $xpath->query('//h2[contains(@class, "make-it-pretty")]')->item(0);
    echo $htext->textContent;

I would probably use $doc->loadHTMLFile instead, but:

<?php
$html = '<html lang="en"><head><meta charset="UTF-8" /><title>Title Here</title></head>
  <body>
    <h2 class="make-it-pretty"><a class="more-pretty" href="some-file-somewhere"><span class="another-class">Product Name</span></a></h2>
  </body></html>';
$doc = new DOMDocument();
@$doc->loadHTML($html); // @ belongs on loadHTML, where libxml emits warnings for imperfect markup
function getElementsByClassName($className, $withinNode = null){
  global $doc;
  $d = $withinNode ?? $doc;
  $r = []; $a = $d->getElementsByTagName('*');
  foreach($a as $n){
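    // Compare against each space-separated class name, so class="a b" still matches 'a'
    if(in_array($className, preg_split('/\s+/', trim($n->getAttribute('class'))), true))$r[] = $n;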
  }
  return $r;
}
$anotherClass = getElementsByClassName('another-class');
// getElementsByClassName('make-it-pretty'); works as well, in this case
echo $anotherClass[0]->textContent;
?>

Try this without XPath:

<?php
$html ='<h2 class="make-it-pretty"> <a class="more-pretty" href="some-file-somewhere"> <span class="another-class">Product Name</span> </a> </h2><h2 class="make-it-pretty"> <a class="more-pretty" href="some-file-somewhere"> <span class="another-class">Product Name</span> </a> </h2><h2 class="make-it-pretty"> <a class="more-pretty" href="some-file-somewhere"> <span class="another-class">Product Name</span> </a> </h2>';
$dom = new DOMDocument("1.0", "utf-8");
if($dom->loadHTML($html, LIBXML_NOWARNING)){
    $h2s = $dom->getElementsByTagName('h2');
    echo '<pre>';
    foreach ($h2s as $h2) {
        foreach($h2->getElementsByTagName('a') as $a){
            echo 'link :'.$a->getAttribute('href')."\n";
            // Read the spans inside this link's own loop, so each span is paired
            // with its href and no stale $spans carries over from a previous <h2>
            foreach($a->getElementsByTagName('span') as $span){
                echo 'content :'.$span->nodeValue."\n";
            }
        }
    }
    echo '</pre>';
}
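
Since the question asked about XPath specifically, here is a minimal sketch of the same loop done with DOMXPath. The key point is that query() returns every match, so you iterate over the result instead of calling item(0). The sample markup, the href values (first-file, second-file) and the $results array are made up for illustration; the class names are the ones from the question.

```php
<?php
// Hypothetical sample markup; the real page would contain 12 such <h2> blocks.
$html = '<h2 class="make-it-pretty"><a class="more-pretty" href="first-file"><span class="another-class">First Product</span></a></h2>'
      . '<h2 class="make-it-pretty"><a class="more-pretty" href="second-file"><span class="another-class">Second Product</span></a></h2>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$results = [];
// Select the <a> child of every matching <h2>, then loop over all matches.
foreach ($xpath->query('//h2[contains(@class, "make-it-pretty")]/a[@href]') as $a) {
    $results[] = [
        'href' => $a->getAttribute('href'),
        // Relative expression, evaluated with $a as the context node:
        // string(span) is the text of the first <span> child of this link.
        'text' => trim($xpath->evaluate('string(span)', $a)),
    ];
}

print_r($results);
```

Passing $a as the second argument to evaluate() keeps each span paired with its own link, which is the part that is easy to miss when moving from a single item(0) to the whole list.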

