Finding the topmost elements using XPath

Question

In XPath, I know I can select all following elements with /following::*, but I'd like to avoid also selecting the children contained within any of those following elements.

For example, given this document:

<body>
    <div id="div1">
        <p id="p1">...</p>
        <p id="p2">
            <span id="span1"></span>
            <span id="span2"><i id="i1">...</i></span>
        </p>
        <p id="p3">...</p>
    </div>
    <div id="div2">
        <p id="p4">...</p>
        <p id="p5">...</p>
    </div>
</body>

If I have span1 selected, I would like to select span2 (but not i1), p3, and div2 (but not p4 or p5).

In Python, my code might look something like:

>>> lxml.html.fromstring(document).xpath('//*[@id="span1"]/following::*')
[<Element span at 0x1082bd680>,
 <Element i at 0x1082bd4f0>,
 <Element p at 0x1082bd770>,
 <Element div at 0x1082bd360>,
 <Element p at 0x1082bd7c0>,
 <Element p at 0x1082bdef0>]

But what I'd like to have returned is:

[<Element span at 0x1082bd680>,
 <Element p at 0x1082bd770>,
 <Element div at 0x1082bd360>]

EDIT: @kjhughes's answer got me 90% of the way there. Because the real-life example might not have an ID that I can easily match on, I ended up writing code like:

find_following = lxml.html.etree.XPath(
    "following::*[not(../preceding::*[. = node()])]"
)
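
A minimal sketch of an ID-free variant, assuming lxml's support for XPath variables passed as keyword arguments. Note that . = node() compares string values rather than node identity, so this sketch tests identity with count(. | $ctx) = 1 instead. The variable name $ctx is made up for illustration and is not part of the original code.

```python
from lxml import etree

# $ctx is a hypothetical variable name; lxml binds it from the keyword
# argument of the same name when the compiled expression is called.
find_following = etree.XPath(
    "following::*[not(../preceding::*[count(. | $ctx) = 1])]"
)

root = etree.fromstring(
    '<body><div id="div1"><p id="p1">...</p>'
    '<p id="p2"><span id="span1"/>'
    '<span id="span2"><i id="i1">...</i></span></p>'
    '<p id="p3">...</p></div>'
    '<div id="div2"><p id="p4">...</p><p id="p5">...</p></div></body>'
)

# No ID lookup needed: any element object can serve as the starting point.
start = root.find(".//span")  # first span in document order (span1)
result_ids = [el.get("id") for el in find_following(start, ctx=start)]
print(result_ids)
```

The starting element is passed twice on purpose: once as the context node and once as the value of $ctx, so the inner predicate can recognise it.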

Answer 1

This XPath,

//*[@id="span1"]/following::*[not(../preceding::*[@id="span1"])]

selects the elements following the targeted span element whose parents do not have the targeted span element on their preceding axis,

<span id="span2"><i id="i1">...</i></span>
<p id="p3">...</p>
<div id="div2"> <p id="p4">...</p> <p id="p5">...</p> </div>

as requested.
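
As a quick check, the expression can be run against the question's document. This is a sketch that assumes lxml and parses the sample as XML with lxml.etree rather than lxml.html; the variable names are illustrative.

```python
from lxml import etree

DOCUMENT = """
<body>
    <div id="div1">
        <p id="p1">...</p>
        <p id="p2">
            <span id="span1"></span>
            <span id="span2"><i id="i1">...</i></span>
        </p>
        <p id="p3">...</p>
    </div>
    <div id="div2">
        <p id="p4">...</p>
        <p id="p5">...</p>
    </div>
</body>
"""

root = etree.fromstring(DOCUMENT)

# Every element after span1 in document order, nested ones included.
all_following = root.xpath('//*[@id="span1"]/following::*')

# Only the following elements whose parent does not itself follow span1.
outermost = root.xpath(
    '//*[@id="span1"]/following::*[not(../preceding::*[@id="span1"])]'
)

all_ids = [el.get("id") for el in all_following]
outermost_ids = [el.get("id") for el in outermost]
print(all_ids)
print(outermost_ids)
```

The filter works because an element's parent lies on the following axis of span1 exactly when span1 lies on the preceding axis of that parent.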

Answer 2

XPath 3.1 has the function outermost(): outermost(following::*) selects all following elements, excluding any that are descendants of another element in the node-set.

XPath 2.0 allows following::* except following::*/descendant::*.

In XPath 1.0 you can express ($A except $B) as $A[count(.|$B)!=count($B)]. (With = in place of != the same pattern gives the intersection of $A and $B.) This isn't all that useful on its own, though, because there's no way within XPath 1.0 itself of binding a variable.
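
When the host language can bind variables, the XPath 1.0 idiom becomes usable. The following is a minimal sketch, assuming lxml, which accepts XPath variables (including lists of elements) as keyword arguments to xpath(). The helper name outermost_following and the variable $B are hypothetical. The set difference is written with !=, since the same pattern with = would keep only the nodes that are in $B.

```python
from lxml import etree


def outermost_following(context):
    """Return following elements of `context` not nested in another one."""
    # $B: every element that sits inside some following element.
    nested = context.xpath("following::*/descendant::*")
    # Set difference A except B: keep nodes of A that are not members of $B.
    return context.xpath(
        "following::*[count(. | $B) != count($B)]", B=nested
    )


root = etree.fromstring(
    '<body><div id="div1"><p id="p1">...</p>'
    '<p id="p2"><span id="span1"/>'
    '<span id="span2"><i id="i1">...</i></span></p>'
    '<p id="p3">...</p></div>'
    '<div id="div2"><p id="p4">...</p><p id="p5">...</p></div></body>'
)

span1 = root.xpath('//*[@id="span1"]')[0]
ids = [el.get("id") for el in outermost_following(span1)]
print(ids)
```

Computing $B in a separate call costs an extra traversal, but it keeps the expression a direct transcription of following::* except following::*/descendant::*.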

2 answers

Answer 1: 1, accepted, 2021-12-22 21:00:24
Answer 2: 0, 2021-12-22 23:10:02
