Get HTML tags by namespace in a PHP XPath query

Question

Let's say I have the following HTML snippet:

<div abc:section="section1">
  <p>Content...</p>
</div>
<div abc:section="section2">
  <p>Another section</p>
</div>

How can I get a DOMNodeList (in PHP) containing a DOMNode for each <div> that has the abc:section attribute set?

Currently I have the following code:

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('abc', 'http://xml.example.com/AbcDocument');

The following XPath queries don't work:

$xpath->query('//@abc:section');
$xpath->query('//*[@abc:section]');

The loaded HTML is always just a snippet; I'm transforming it using the DOMDocument functions and feeding the result to the template.

Answer 1

The loadHTML method triggers the HTML Parser module of libxml. AFAIK, the resulting HTML tree will not contain namespaces, so querying them with XPath won't work here. You can do this instead:

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Read the attribute by its literal name, colon included.
foreach ($dom->getElementsByTagName('div') as $node) {
    echo $node->getAttribute('abc:section');
}
echo $dom->saveHTML();

As an alternative, you can use //div/@* to fetch all attributes, which would include the namespaced attributes. You cannot have a colon in the query though, because that requires the namespace prefix to be registered, and as pointed out above, that doesn't work for an HTML tree.

Yet another alternative would be to use //@*[starts-with(name(), "abc:section")].
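To get the <div> elements themselves rather than the attribute nodes, the same name() test can go inside a predicate. The following is only a sketch of that idea, not part of the original answer: the sample markup is copied from the question, the exact-match predicate name() = "abc:section" is my variation on the starts-with() query above, and the libxml_use_internal_errors() call is an optional assumption about how you want parser messages handled.

```php
<?php
// Sample markup copied from the question.
$html = <<<HTML
<div abc:section="section1">
  <p>Content...</p>
</div>
<div abc:section="section2">
  <p>Another section</p>
</div>
HTML;

// Collect any libxml parser messages internally instead of emitting PHP warnings (optional).
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Compare the literal attribute name; no namespace prefix is registered or needed.
$divs = $xpath->query('//div[@*[name() = "abc:section"]]');

foreach ($divs as $div) {
    echo $div->getAttribute('abc:section'), PHP_EOL;
}
```

Here $divs is a DOMNodeList, so each entry can be modified in place before calling saveHTML(), which seems to be what the question is after.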

Answer 1: accepted, 2011-04-05 13:26:47
