Getting href values from content on a page
I am getting data from a page that is formatted like this:
<span id="RANDOMINFO">
<a href="/DEMO/RANDOMDATA">+</a>
<span title="1">DATA I WANT HERE</span>
<a href="https://URL.COM/">CLICK</a>
<a href="https://URL.COM/">MORE RANDOM DATA</a>
</span>
<span id="RANDOMINFO">
<a href="/DEMO/RANDOMDATA">+</a>
<span title="2">DATA I WANT HERE</span>
<a href="https://URL.COM/RANDOM">CLICK</a>
<a href="https://URL.COM/RANDOM">MORE RANDOM DATA</a>
</span>
How can I get the href values from the page?
Here is the code I have to get the data from the span id, but I don't know how to do the same for the href, as the links have no name or id.
$doc = new DOMDocument();
@$doc->loadHTML($html2);
foreach ($doc->getElementsByTagName('span') as $element) {
    if (!empty($element->attributes->getNamedItem('id')->value)) {
        $filename = 'newpks/'.$f.'.txt';
        $file = fopen($filename, "a");
        $data = $element->attributes->getNamedItem('id')->value.PHP_EOL;
        fwrite($file, $data);
        fclose($file);
        $i++;
        $end = $start;
    }
}
I assume you're only interested in links with an href attribute, and we know those tags will be a elements. This should be sufficient (I haven't been able to test the code, though).
I optimized the code a bit: the nodes returned by getElementsByTagName are DOMElement objects (DOMElement extends DOMNode), so you can use hasAttribute and getAttribute instead.
foreach ($doc->getElementsByTagName('a') as $element) {
    if ($element->hasAttribute('href')) {
        $href = $element->getAttribute('href');
        // Do your work here
    }
}
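If you also need to know which outer span each link belongs to, the same calls can be nested. The following is only a sketch under a few assumptions: the sample markup is copied from the question, but the ids RANDOMINFO1 and RANDOMINFO2 are made-up unique values so the two blocks can be told apart, and $linksBySpan is a hypothetical variable name. It relies on getElementsByTagName also being available on a DOMElement, where it searches only that element's descendants.

```php
<?php
// Sample markup modelled on the question. The ids are hypothetical unique
// values (the question used the same placeholder id for both blocks).
$html = <<<HTML
<span id="RANDOMINFO1">
<a href="/DEMO/RANDOMDATA">+</a>
<span title="1">DATA I WANT HERE</span>
<a href="https://URL.COM/">CLICK</a>
<a href="https://URL.COM/">MORE RANDOM DATA</a>
</span>
<span id="RANDOMINFO2">
<a href="/DEMO/RANDOMDATA">+</a>
<span title="2">DATA I WANT HERE</span>
<a href="https://URL.COM/RANDOM">CLICK</a>
<a href="https://URL.COM/RANDOM">MORE RANDOM DATA</a>
</span>
HTML;

// Collect libxml parse warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$linksBySpan = [];
foreach ($doc->getElementsByTagName('span') as $span) {
    // Skip inner spans such as <span title="1">, which carry no id.
    if (!$span->hasAttribute('id')) {
        continue;
    }

    $hrefs = [];
    // Called on an element, this only looks at that element's descendants.
    foreach ($span->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $hrefs[] = $a->getAttribute('href');
        }
    }

    $linksBySpan[$span->getAttribute('id')] = $hrefs;
}

print_r($linksBySpan);
```

Each key of $linksBySpan is a span id and each value is the list of href strings found inside that span, in document order. If the real page repeats the same id, a plain numeric index would probably be a safer key than the id.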