
Fetching invalidly structured attributes with Simple HTML DOM

I am trying to fetch an anchor tag whose attributes are malformed: there is no space between the title attribute and href. Something like this:

<a  target='_blank' title='Some'href='somelink.html'>Link 1</a>

I tried to read the attributes with:

foreach($html->find('a') as $links)
 {
     var_dump($links->attr);
 }

The var_dump output shows that the href attribute is not listed among the other attributes.

How do I find the anchor?

One option is to parse the markup with PHP's built-in DOMDocument, whose parser accepts the missing space before href:

$html = "<a  target='_blank' title='Some'href='somelink.html'>Link 1</a>";
// or: $html = file_get_contents('example.html');

// Collect parser warnings about the malformed markup instead of silencing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$nodes = array();
foreach ($dom->getElementsByTagName('a') as $a) {
    // getAttribute() returns an empty string when the attribute is absent.
    $nodes[] = array(
        'target' => $a->getAttribute('target'),
        'title'  => $a->getAttribute('title'),
        'href'   => $a->getAttribute('href'),
    );
}
print_r($nodes);
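
If only the link targets are needed, the DOMDocument approach can be combined with an XPath query. This is a minimal sketch, assuming the same sample markup as in the question. The variable name $hrefs is illustrative and not part of the original answer.

```php
<?php
// Same malformed markup as in the question (no space before href).
$html = "<a  target='_blank' title='Some'href='somelink.html'>Link 1</a>";

// Keep libxml's warnings about the malformed markup out of the output.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Select the href attribute node of every anchor that has one.
$xpath = new DOMXPath($dom);
$hrefs = array();
foreach ($xpath->query('//a/@href') as $attr) {
    $hrefs[] = $attr->nodeValue;
}
print_r($hrefs);
```

Anchors without an href are skipped by the query, so no empty strings need to be filtered out afterwards.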


 