How to use XPath contains() for specific text?

Say we have an HTML table which basically looks like this:

2|1|28|9|
3|8|5|10|
18|9|8|0|

I want to select the cells which contain only 8 and nothing else, that is, only the 2nd cell of row 2 and the 3rd cell of row 3.

This is what I tried: //table//td[contains(.,'8')]. It gives me all cells which contain 8, so I get the unwanted values 28 and 18 as well.

How do I fix this?

EDIT: Here is a sample table if you want to try your XPath. Use the calendar on the left side: https://sfbay.craigslist.org/sfc/

Be careful of the contains() function.

It is a common mistake to use it to test whether an element contains a value. What it really does is test whether a string contains a substring. So, td[contains(.,'8')] takes the string-value of td (.) and tests whether it contains the substring '8'. This might be what you want, but often it is not.
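To make the substring-versus-value distinction concrete, here is a minimal Python sketch. It rebuilds the table from the question as a small XML fragment (the markup is an assumption, since the question only shows the cell values) and emulates both tests in plain Python instead of running a real XPath engine; `string_value` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The table from the question, rebuilt as a small XML fragment (assumed markup).
TABLE = """
<table>
  <tr><td>2</td><td>1</td><td>28</td><td>9</td></tr>
  <tr><td>3</td><td>8</td><td>5</td><td>10</td></tr>
  <tr><td>18</td><td>9</td><td>8</td><td>0</td></tr>
</table>
"""


def string_value(element):
    """Concatenate all descendant text, like the XPath string-value of an element."""
    return "".join(element.itertext())


values = [string_value(td) for td in ET.fromstring(TABLE).iter("td")]

# Emulates td[contains(.,'8')]: a substring test on the string-value.
substring_hits = [v for v in values if "8" in v]

# Emulates td[.='8']: the whole string-value must equal '8'.
exact_hits = [v for v in values if v == "8"]

print(substring_hits)  # ['28', '8', '18', '8']
print(exact_hits)      # ['8', '8']
```

The substring test picks up 28 and 18 because both strings contain the character 8, which is exactly the behaviour described in the question.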

This XPath,

//td[.='8']

will select all td elements whose string-value equals 8.

Alternatively, this XPath,

//td[normalize-space()='8']

will select all td elements whose normalize-space() string-value equals 8. (The normalize-space() XPath function strips leading and trailing whitespace and replaces each sequence of whitespace characters with a single space.)

Notes:

  • Both will work even if the 8 is inside another element such as a, b, span, div, etc.
  • Neither will match <td>gr8t</td>, <td>123456789</td>, etc.
  • Using normalize-space() will ignore leading or trailing whitespace surrounding the 8.
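The notes above can be checked with a short Python sketch. It assumes a made-up row of five cells and uses the standard library's `xml.etree.ElementTree`, whose limited XPath subset accepts the `[.='text']` predicate (Python 3.7+) but not normalize-space(), so that function is emulated here by a hypothetical `normalize_space` helper.

```python
import re
import xml.etree.ElementTree as ET

# A made-up row covering the cases from the notes (assumed markup).
ROW = ET.fromstring(
    "<tr>"
    "<td>8</td>"
    "<td><a>8</a></td>"
    "<td>\n  8  </td>"
    "<td>gr8t</td>"
    "<td>123456789</td>"
    "</tr>"
)


def normalize_space(text):
    """Mimic XPath normalize-space(): collapse whitespace runs, then trim the ends."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


def string_value(element):
    """Concatenate all descendant text, like the XPath string-value of an element."""
    return "".join(element.itertext())


cells = list(ROW.iter("td"))

# ElementTree evaluates [.='8'] against the full text content of each td.
exact = [cells.index(td) for td in ROW.findall("./td[.='8']")]

# Emulates td[normalize-space()='8'].
normalized = [i for i, td in enumerate(cells)
              if normalize_space(string_value(td)) == "8"]

print(exact)       # [0, 1]
print(normalized)  # [0, 1, 2]
```

The cell padded with whitespace is matched only by the normalized comparison, while gr8t and 123456789 are rejected by both.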

Try the following XPath, which compares the whole text content rather than matching a substring:

//table//td[text()='8']

Edit: Your example HTML has a elements inside the td elements, so the following will work:

//table//td/a[text()="8"]
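Why the first expression misses cells whose 8 sits inside a link can be sketched in Python. This is only an emulation under assumptions: the row markup is invented, and since `xml.etree.ElementTree` does not support text(), a hypothetical `text_nodes` helper collects an element's direct text-node children instead.

```python
import xml.etree.ElementTree as ET

# Invented row: a bare 8, an 8 inside a link, and an 18 inside a link.
ROW = ET.fromstring("<tr><td>8</td><td><a>8</a></td><td><a>18</a></td></tr>")


def text_nodes(element):
    """Direct text-node children of an element, roughly what XPath text() selects."""
    nodes = [element.text] + [child.tail for child in element]
    return [t for t in nodes if t]


cells = list(ROW.iter("td"))

# Emulates td[text()='8']: some direct text node of the td equals '8'.
td_text = [i for i, td in enumerate(cells) if "8" in text_nodes(td)]

# Emulates td/a[text()='8'], reporting the index of the enclosing cell
# (the XPath itself would return the a elements, not the td elements).
a_text = [i for i, td in enumerate(cells)
          if any("8" in text_nodes(a) for a in td.findall("a"))]

print(td_text)  # [0]
print(a_text)   # [1]
```

Because text() only looks at an element's own text nodes, each expression finds the 8 at one nesting level only; neither matches the 18.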

See an example in PHP here: https://3v4l.org/56SBn
