Get HTML-tags by namespace in PHP XPath Query

Let's say I have the following HTML snippet:

<div abc:section="section1">
  <p>Content...</p>
</div>
<div abc:section="section2">
  <p>Another section</p>
</div>

How can I get a DOMNodeList (in PHP) containing a DOMNode for each <div> that has the abc:section attribute set?

Currently I have the following code:

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('abc', 'http://xml.example.com/AbcDocument');

The following XPath queries don't work:

$xpath->query('//@abc:section');
$xpath->query('//*[@abc:section]');

The loaded HTML is always just a snippet. I transform it using the DOMDocument functions and feed the result to the template.

The loadHTML method triggers the HTML parser module of libxml. As far as I know, the resulting HTML tree does not contain namespaces, so querying for them with XPath won't work here. You can do

$dom = new DOMDocument();
$dom->loadHTML($html);
// The HTML parser keeps "abc:section" as a plain attribute name,
// so it can be read directly without any namespace handling.
foreach ($dom->getElementsByTagName('div') as $node) {
    echo $node->getAttribute('abc:section');
}
echo $dom->saveHTML();

As an alternative, you can use //div/@* to fetch all attributes of the <div> elements, which includes the prefixed ones. You cannot use a colon in the query, though, because that requires the namespace prefix to be registered and, as pointed out above, that doesn't work for an HTML tree.
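
A minimal sketch of that approach, assuming the snippet from the question plus a third <div> without the attribute (added here purely for illustration). It selects every attribute with //div/@* and filters by name in PHP rather than in the query. The variable $sections and the libxml error suppression are assumptions, not part of the original code:

```php
<?php
// Markup from the question, plus one hypothetical <div> without the attribute.
$html = '<div abc:section="section1"><p>Content...</p></div>'
      . '<div abc:section="section2"><p>Another section</p></div>'
      . '<div class="plain"><p>No section</p></div>';

// Collect parser notices internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// No colon in the query, so no namespace prefix has to be registered.
$sections = [];
foreach ($xpath->query('//div/@*') as $attr) {
    // In the HTML tree the attribute name is the literal string "abc:section".
    if ($attr->nodeName === 'abc:section') {
        $sections[] = $attr->nodeValue;
    }
}

print_r($sections);
```

Filtering in PHP keeps the query trivial, at the cost of visiting attributes you don't care about.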

Yet another alternative would be to use //@*[starts-with(name(), "abc:section")], which matches the attribute by its literal name and therefore needs no registered namespace.
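
Since the question asks for the <div> elements rather than the attribute nodes, the same name() test can probably be moved into a predicate on the element. The following is a sketch under that assumption. The exact query //div[@*[name() = "abc:section"]] is a variation of the one above and not taken from the original answer, and the sample markup is the snippet from the question:

```php
<?php
// Markup from the question.
$html = '<div abc:section="section1"><p>Content...</p></div>'
      . '<div abc:section="section2"><p>Another section</p></div>';

// Collect parser notices internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select <div> elements that carry an attribute literally named "abc:section".
// name() compares the raw attribute name, so no prefix needs to be registered.
$divs = $xpath->query('//div[@*[name() = "abc:section"]]');

foreach ($divs as $div) {
    echo $div->getAttribute('abc:section'), PHP_EOL;
}
```

Here $divs is a DOMNodeList of DOMElement nodes, so each item can be modified in place before calling $dom->saveHTML().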