Find top-most following elements with XPath
In XPath, I know I can select all following elements with /following::*. However, I'd like to avoid also selecting the children contained within any of those following elements.
For example, given this document:
<body>
<div id="div1">
<p id="p1">...</p>
<p id="p2">
<span id="span1"></span>
<span id="span2"><i id="i1">...</i></span>
</p>
<p id="p3">...</p>
</div>
<div id="div2">
<p id="p4">...</p>
<p id="p5">...</p>
</div>
</body>
If I have span1 selected, I would like to select span2 (but not i1), p3, and div2 (but not p4 or p5).
In Python, my code might look something like:
>>> lxml.html.fromstring(document).xpath('//*[@id="span1"]/following::*')
[<Element span at 0x1082bd680>,
<Element i at 0x1082bd4f0>,
<Element p at 0x1082bd770>,
<Element div at 0x1082bd360>,
<Element p at 0x1082bd7c0>,
<Element p at 0x1082bdef0>]
But what I'd like to have returned is:
[<Element span at 0x1082bd680>,
<Element p at 0x1082bd770>,
<Element div at 0x1082bd360>]
EDIT: @kjhughes' answer got me 90% of the way there. Because a real-life example might not have an ID that I can easily use to match, I ended up writing code like:
find_following = lxml.html.etree.XPath(
"following::*[not(../preceding::*[. = node()])]"
)
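For readers who want an ID-free variant they can run, here is a minimal sketch against the sample document above. It is not the asker's exact code. It assumes lxml accepts an element as the value of an XPath variable (here named $node, a name chosen for illustration) and tests node identity with count(. | $node) = 1, which holds only when both sides are the same node. The second expression is a different, variable-free formulation: the top-most following elements are the following siblings of the context node and of each of its ancestors.

```python
from lxml import etree

# Sample document from the question, parsed as XML for simplicity.
root = etree.fromstring(
    '<body><div id="div1"><p id="p1">...</p>'
    '<p id="p2"><span id="span1"/>'
    '<span id="span2"><i id="i1">...</i></span></p>'
    '<p id="p3">...</p></div>'
    '<div id="div2"><p id="p4">...</p><p id="p5">...</p></div></body>'
)
span1 = root.xpath('//*[@id="span1"]')[0]

# Identity test instead of an @id test: count(. | $node) = 1 is true
# only when "." and $node are one and the same node.
find_following = etree.XPath(
    "following::*[not(../preceding::*[count(. | $node) = 1])]"
)
by_identity = [el.get("id") for el in find_following(span1, node=span1)]

# Variable-free alternative (an assumption of this sketch, not taken
# from the thread): following siblings of the node and of its ancestors.
find_siblings = etree.XPath("ancestor-or-self::*/following-sibling::*")
by_siblings = [el.get("id") for el in find_siblings(span1)]

print(by_identity)
print(by_siblings)
```

For the sample document both expressions select span2, p3 and div2, and neither depends on the context element carrying an id attribute.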
This XPath,
//*[@id="span1"]/following::*[not(../preceding::*[@id="span1"])]
selects the elements following the targeted span element whose parents do not have the targeted span element on their preceding axis,
<span id="span2"><i id="i1">...</i></span>
<p id="p3">...</p>
<div id="div2"> <p id="p4">...</p> <p id="p5">...</p> </div>
as requested.
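The behaviour can be checked with a short script. This is a minimal sketch that assumes lxml is installed and parses the sample document as XML with lxml.etree (the question used lxml.html, which should behave the same way for this query). It compares the plain following::* axis with the filtered expression.

```python
from lxml import etree

DOCUMENT = """
<body>
  <div id="div1">
    <p id="p1">...</p>
    <p id="p2">
      <span id="span1"></span>
      <span id="span2"><i id="i1">...</i></span>
    </p>
    <p id="p3">...</p>
  </div>
  <div id="div2">
    <p id="p4">...</p>
    <p id="p5">...</p>
  </div>
</body>
"""

root = etree.fromstring(DOCUMENT)

# Every element after span1 in document order, nested ones included.
all_following = root.xpath('//*[@id="span1"]/following::*')

# Only those whose parent is not itself preceded by span1.
top_most = root.xpath(
    '//*[@id="span1"]/following::*[not(../preceding::*[@id="span1"])]'
)

following_ids = [el.get("id") for el in all_following]
top_most_ids = [el.get("id") for el in top_most]

print(following_ids)  # ['span2', 'i1', 'p3', 'div2', 'p4', 'p5']
print(top_most_ids)   # ['span2', 'p3', 'div2']
```

The predicate works because any element nested inside a following element has a parent that comes after span1, whereas the parent of a top-most following element is an ancestor of span1 and so is never on the preceding axis.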
XPath 3.1 has the function outermost(): outermost(following::*) selects all following elements excluding any that are descendants of another element in the node-set.
XPath 2.0 allows following::* except following::*/descendant::*.
In XPath 1.0 you can express ($A except $B) as $A[count(.|$B)!=count($B)]. The form with = in place of != selects the intersection instead, because count(.|$B) equals count($B) exactly when the current node is already a member of $B. (Though this isn't all that useful because there's no way within XPath itself of binding a variable.)
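Although XPath 1.0 cannot bind variables itself, a host language can. The following is a sketch under the assumption that lxml accepts lists of elements as XPath variable values. The variable names A and B and the sample document are illustrative. It evaluates the count-based set idiom, where != gives the difference and = gives the intersection.

```python
from lxml import etree

root = etree.fromstring(
    '<body><div id="div1"><p id="p1">...</p>'
    '<p id="p2"><span id="span1"/>'
    '<span id="span2"><i id="i1">...</i></span></p>'
    '<p id="p3">...</p></div>'
    '<div id="div2"><p id="p4">...</p><p id="p5">...</p></div></body>'
)
span1 = root.xpath('//*[@id="span1"]')[0]

# $A: everything after span1; $B: everything nested inside those elements.
a = span1.xpath("following::*")
b = span1.xpath("following::*/descendant::*")

# Difference ($A except $B): keep nodes whose union with $B grows the set.
difference = span1.xpath("$A[count(. | $B) != count($B)]", A=a, B=b)

# Intersection: keep nodes that are already members of $B.
intersection = span1.xpath("$A[count(. | $B) = count($B)]", A=a, B=b)

difference_ids = [el.get("id") for el in difference]
intersection_ids = [el.get("id") for el in intersection]

print(difference_ids)
print(intersection_ids)
```

For the sample document the difference contains span2, p3 and div2, while the intersection contains the nested elements i1, p4 and p5.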