How do I get the text inside of the <a>This Text</a> tags using cURL?

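With the code below I get this error: "Fatal error: Call to undefined method DOMText::getAttribute()". I want to capture the text of each link rather than its target URL (I don't know what the proper name for that is). Can someone explain my mistake or show me another way? Here is my code: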

<?php

$target_url = "SITE I WANT";
$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';

// make the cURL request to $target_url
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$html= curl_exec($ch);
if (!$html) {
    echo "<br />cURL error number:" .curl_errno($ch);
    echo "<br />cURL error:" . curl_error($ch);
    exit;
}

// parse the html into a DOMDocument
$dom = new DOMDocument();
@$dom->loadHTML($html);

// grab all the links on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a/text()");

for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    storeLink($url,$target_url);
    echo "<br />Link stored: $url";
}
$id = "12";
$query = "DELETE FROM links WHERE id<=$id";
if (!mysql_query($query)) {
    echo "DELETE failed: $query<br />" . mysql_error() . "<br /><br />";
}
?>

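Here you go. Your XPath expression ends in text(), so it returns DOMText nodes, and DOMText has no getAttribute() method. Select the <a> elements themselves and read both the text and the href from each one: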

$document = new DOMDocument();
$document->loadHTML($html);
$selector = new DOMXPath($document);

// select the <a> elements themselves, not their text() nodes
$anchors = $selector->query('/html/body//a');

foreach ($anchors as $a) {
    $text = $a->nodeValue;             // the link text
    $href = $a->getAttribute('href');  // the link target
    echo $text . ' : ' . $href . '<br />';
}
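
To try this without making a cURL request, the same idea can be sketched as a small self-contained script. The helper name extractLinks() and the sample markup are made up for illustration and are not part of the original post. The sketch also assumes you would rather collect libxml parse warnings with libxml_use_internal_errors() than suppress them with @.

```php
<?php
// Hypothetical helper (not from the original post): returns the text and
// href of every <a> element found in an HTML string.
function extractLinks(string $html): array
{
    $document = new DOMDocument();

    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $selector = new DOMXPath($document);
    $links = [];

    // Query the <a> elements (DOMElement), not their text() children (DOMText).
    foreach ($selector->query('//a') as $a) {
        $links[] = [
            'text' => trim($a->textContent),
            'href' => $a->getAttribute('href'),
        ];
    }

    return $links;
}

// Sample markup standing in for the cURL response body.
$sample = '<html><body><p><a href="/one">First <b>link</b></a> and '
        . '<a href="https://example.com/two">Second</a></p></body></html>';

foreach (extractLinks($sample) as $link) {
    echo $link['text'] . ' : ' . $link['href'] . PHP_EOL;
}
```

Because textContent concatenates the text of all descendants, a link such as <a href="/one">First <b>link</b></a> yields "First link" rather than only the first text fragment.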
