Using jSoup to find a node with some particular text

How can I find this snippet of HTML in a node using jSoup:

<span style="font-weight: bold">Party Date:</span> 14.08.2012<br>

I'd like to extract the date from the HTML snippet. The problem is that this snippet can occur anywhere within an Element, so I need to match it using the contained text.

If you are still looking for a jsoup selector query, this works for me:

    String html = "<span style=\"font-weight: bold\">Party Date:</span> 14.08.2012<br>";

    System.out.println("Date " + Jsoup.parse(html).select("span:matchesOwn(Party Date:)").first().nextSibling().toString());

As you have tagged the question "xpath", I am going to assume that you will accept an XPath solution. In the absence of information to the contrary, I will make some reasonable assumptions. Please let us know if you want to correct or refine these assumptions.

Assumptions

  1. There is exactly one span element in the document with text value 'Party Date:'.
  2. The 'Party Date:' text is exactly as shown, never with leading or trailing white-space and never with variation in case.
  3. The text node following the said span contains the target value.
  4. The said span element can occur anywhere in the document.
  5. The style attribute is immaterial to the question.

XPath expression

The following XPath expression...

//span[.='Party Date:'][1]/following-sibling::text()

...returns...

' 14.08.2012'

Note: This works in both XPath 1.0 and XPath 2.0.
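
To try the expression above without any extra libraries, here is a minimal sketch using the XPath 1.0 engine that ships with the JDK (`javax.xml.xpath`). It is not part of the original answer and rests on a few assumptions: the input must be well-formed XML, so `<br>` is written as `<br/>`, and the fragment is wrapped in a hypothetical `<div>` root. The class name `PartyDateXPath` and the method `textAfterLabel` are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PartyDateXPath {

    // The XPath expression from the answer above.
    public static final String EXPR =
            "//span[.='Party Date:'][1]/following-sibling::text()";

    /** Returns the text node that follows the matching span, untrimmed. */
    public static String textAfterLabel(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // With XPathConstants.STRING the result is the string value of the
            // first node in the selected node-set (empty string if no match).
            return (String) xpath.evaluate(EXPR, doc, XPathConstants.STRING);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Assumption: well-formed XML input with a hypothetical <div> root.
        String xml = "<div><span style=\"font-weight: bold\">Party Date:</span>"
                + " 14.08.2012<br/></div>";
        System.out.println("[" + textAfterLabel(xml) + "]");        // prints "[ 14.08.2012]"
        System.out.println("[" + textAfterLabel(xml).trim() + "]"); // prints "[14.08.2012]"
    }
}
```

The leading space in the raw result matches the `' 14.08.2012'` shown above, so calling `trim()` is probably what you want before parsing the date. Real-world HTML is often not well-formed XML, in which case this JDK parser will reject it and an HTML-aware parser is needed first.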
