What is the difference between xpath //*[.="Foo"] and xpath //*["Foo"], where the predicate contains only a string?

Question

I was doing some testing in IPython with a new (to me) way of selecting nodes based on their text in an XPath. (Irrelevant lines omitted for clarity.)

In [26]: from lxml import etree

In [41]: string = '''
    ...: <outer>
    ...:    <mid>
    ...:       <inner>Foo</inner>
    ...:    </mid>
    ...: </outer>
    ...: '''

In [43]: root = etree.fromstring(string)

In [44]: root.xpath('//inner[text()="Foo"]')
Out[44]: [<Element inner at 0x10a0387c0>]

In [45]: root.xpath('//inner[.="Foo"]')
Out[45]: [<Element inner at 0x10a0387c0>]

In [47]: root.xpath('//inner["Foo"]')
Out[47]: [<Element inner at 0x10a0387c0>]

That all makes sense to me so far. However:

In [48]: root.xpath('//*[text()="Foo"]')
Out[48]: [<Element inner at 0x10a0387c0>]

In [49]: root.xpath('//*[.="Foo"]')
Out[49]: [<Element inner at 0x10a0387c0>]

In [50]: root.xpath('//*["Foo"]')
Out[50]: 
[<Element outer at 0x10a188200>,
 <Element mid at 0x10a01d6c0>,
 <Element inner at 0x10a0387c0>]

I had expected the second and third XPaths to produce the same result by matching all three nodes. Can anyone explain why they're different?

Answer 1

The spec says:

A PredicateExpr is evaluated by evaluating the Expr and converting the result to a boolean. If the result is a number, the result will be converted to true if the number is equal to the context position and will be converted to false otherwise; if the result is not a number, then the result will be converted as if by a call to the boolean function. Thus a location path para[3] is equivalent to para[position()=3].

(emphasis mine). Thus

root.xpath('//*["Foo"]')

is equivalent to

root.xpath('//*["Lemon Pie"]')

It does not test the content of your <inner> node; in fact, because "Foo" is a truthy literal (a non-empty string), AFAIK it is also equivalent to

root.xpath('//*')
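The equivalence can be checked with a short sketch. It assumes the same document as in the question; the helper name tags is made up for illustration. A non-empty string literal converts to true, so the predicate keeps every element, while an empty string literal converts to false and keeps none.

```python
from lxml import etree

# Same document as in the question (an assumption of this sketch).
root = etree.fromstring(
    "<outer>\n   <mid>\n      <inner>Foo</inner>\n   </mid>\n</outer>"
)

def tags(expr):
    """Return the tag names of the elements selected by an XPath expression."""
    return [el.tag for el in root.xpath(expr)]

# Any non-empty string literal is true, so the predicate filters nothing out.
print(tags('//*["Foo"]'))        # ['outer', 'mid', 'inner']
print(tags('//*["Lemon Pie"]'))  # ['outer', 'mid', 'inner']
print(tags('//*'))               # ['outer', 'mid', 'inner']

# An empty string literal is false, so nothing is selected.
print(tags('//*[""]'))           # []

# boolean() applied directly to the literals gives the same answer.
print(root.xpath('boolean("Foo")'), root.xpath('boolean("")'))  # True False
```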

As Barmar said, the first and second expressions do not match nodes other than <inner> because of whitespace. To get all three, trim (or "normalize space", in XPath language):

root.xpath('//*[normalize-space()="Foo"]')
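To see where the whitespace comes from, the following sketch (again assuming the document from the question) prints the string value of each element before and after normalization. The string value of <outer> and <mid> includes the indentation around "Foo", which is why a plain comparison with "Foo" fails for them.

```python
from lxml import etree

# Same document as in the question (an assumption of this sketch).
root = etree.fromstring(
    "<outer>\n   <mid>\n      <inner>Foo</inner>\n   </mid>\n</outer>"
)

# The string value of an element is the concatenation of all descendant text,
# so <outer> and <mid> carry the indentation whitespace around "Foo".
for el in root.iter():
    raw = el.xpath('string(.)')
    trimmed = el.xpath('normalize-space(.)')
    print(el.tag, repr(raw), repr(trimmed))

# After normalization, all three elements compare equal to "Foo".
matched = [el.tag for el in root.xpath('//*[normalize-space()="Foo"]')]
print(matched)  # ['outer', 'mid', 'inner']
```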

Answer 2

To add to Amadan's answer, your first and second expressions are not equivalent. Both of them will match

<Inner>Foo</Inner>

but they give different results for constructs such as

<Inner><span>Foo</span></Inner>

or

<Inner>Foo<nbsp/>Bar</Inner>

As a general rule, 9 times out of 10 when someone uses text() they should change it to . (a single dot, the context node).
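The difference can be illustrated with a small sketch that applies both predicates to the three constructs above. The helper compare is hypothetical, and self::* is used only so that each predicate is tested against the root element of each snippet. text()="Foo" is true when any child text node equals "Foo", whereas .="Foo" compares the concatenated text of all descendants.

```python
from lxml import etree

def compare(xml):
    """Return (text() predicate matched, . predicate matched) for the root element."""
    el = etree.fromstring(xml)
    by_text = bool(el.xpath('self::*[text()="Foo"]'))
    by_dot = bool(el.xpath('self::*[.="Foo"]'))
    return by_text, by_dot

# A single text child: both predicates match.
print(compare('<Inner>Foo</Inner>'))               # (True, True)

# The text sits in a child element: <Inner> has no text node of its own.
print(compare('<Inner><span>Foo</span></Inner>'))  # (False, True)

# Two text children, "Foo" and "Bar": one equals "Foo", but the string value is "FooBar".
print(compare('<Inner>Foo<nbsp/>Bar</Inner>'))     # (True, False)
```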
