
How do I select this element in JSOUP?

This is the HTML structure:

[Image: screenshot of the HTML structure]

Element link = doc.select("div.subtabs p").first();

That does not seem to work. How do I select that p?

The div with class="subtabs" is not the parent of the p element; it is a sibling of p. To retrieve the p, you first need to go through the parent div, the one with id="content":

Element link = doc.select("div#content > p").first();

Additionally, you'll need the > symbol to indicate that you're selecting a direct child of div#content.

parent > child: child elements that descend directly from parent, e.g. div.content > p finds p elements; and body > * finds the direct children of the body tag
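The difference between the two selectors can be checked with a small sketch. The original screenshot is not available, so the markup below is a hypothetical reconstruction based only on the description above (a div#content that contains div.subtabs and a sibling p); the class name ChildSelectorDemo and the helper firstText are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ChildSelectorDemo {
    // Assumed markup: div.subtabs and p are siblings inside div#content.
    static final String HTML =
        "<div id=\"content\">"
        + "<div class=\"subtabs\"><a href=\"#\">Tab</a></div>"
        + "<p>Target paragraph</p>"
        + "</div>";

    // Returns the text of the first element matching cssQuery,
    // or null when the selector matches nothing.
    static String firstText(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        Element el = doc.select(cssQuery).first(); // first() is null on an empty result
        return el == null ? null : el.text();
    }

    public static void main(String[] args) {
        // No p lives inside div.subtabs, so this selector matches nothing.
        System.out.println(firstText("div.subtabs p"));   // prints "null"
        // The p is a direct child of div#content.
        System.out.println(firstText("div#content > p")); // prints "Target paragraph"
    }
}
```

Because first() returns null when nothing matches, checking for null before using the element avoids a NullPointerException with a selector like the one in the question.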

If you get stuck on a JSOUP CSS selector in the future, check out the JSOUP Selector Syntax cookbook, which has some nice examples and explanations.

Use div#content p. The p is not a child of .subtabs.

The p tag you are trying to extract is not a child of the div; it is a sibling. The parent div's id is content, and the p tag you want is the first p tag within that parent. So use doc.select("div#content > p").first();

The # means id, and > means the right-hand side is a direct child of the left-hand side. So the statement means: get the first paragraph that is a child of the div whose id is content.

The Chrome SelectorGadget is very helpful for constructing CSS selectors for jSoup by simply pointing and clicking. It has saved me hours of development time when trying to target specific fields.

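Element link = doc.select("div.subtabs + p").first();

The + combinator finds a sibling element immediately preceded by another; here it matches a p that directly follows div.subtabs.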
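As a rough illustration of the sibling combinators, the sketch below compares + (adjacent sibling) with ~ (any following sibling). The markup is assumed rather than taken from the original page, and SiblingSelectorDemo, count, and firstText are hypothetical names used only for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SiblingSelectorDemo {
    // Assumed markup: two paragraphs follow div.subtabs inside div#content.
    static final String HTML =
        "<div id=\"content\">"
        + "<div class=\"subtabs\"></div>"
        + "<p>First paragraph</p>"
        + "<p>Second paragraph</p>"
        + "</div>";

    // Number of elements matched by cssQuery.
    static int count(String cssQuery) {
        Elements matches = Jsoup.parse(HTML).select(cssQuery);
        return matches.size();
    }

    // Text of the first match, or null when nothing matches.
    static String firstText(String cssQuery) {
        Element el = Jsoup.parse(HTML).select(cssQuery).first();
        return el == null ? null : el.text();
    }

    public static void main(String[] args) {
        // "+" matches only the p immediately after div.subtabs.
        System.out.println(count("div.subtabs + p"));     // prints 1
        System.out.println(firstText("div.subtabs + p")); // prints "First paragraph"
        // "~" matches every p sibling that follows div.subtabs.
        System.out.println(count("div.subtabs ~ p"));     // prints 2
    }
}
```

If other elements may sit between div.subtabs and the p in the real page, the + form stops matching, so div#content > p is the more forgiving choice.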

Try this:

Element link = doc.select("div.subtabs > p").first();
