
Jsoup: getting specific data between different tags without class or tags

I'm trying to get the data of <b> #8 (together with the random text that follows it).

Right now I have this code, which gives me the title that I need:

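// URL omitted in the original post; connect() only builds the request, get() fetches and parses it
Document doc = Jsoup.connect("").get();
Elements title = doc.select("div.column.two-third");
Element k = title.select("b").get(8); // zero-based index, so this is <b> #8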

But I also need the text. Is that possible?


Sample of the code of the website I'm trying to get data from:

<div class="column two-third"> 

<div style="color:#000000">                                   

<b>Title I dont want:</b> random text // "b" #0  </br>

<b>Title i dont want</b> random text // "b" #1  <br>

<b>Title i dont want:</b> random text // "b" #7  <br>

**<b>TITLE I WANT :</b> random text // "b" #8  <br>**

<div align="justify"> <br> <br>   <b style="color:#000000">text i dont want</b><br>

As you can see, the text that I want is in a div without a class or ID. Also, the text is not wrapped in any tag (such as "p"). I just need #8. Is that possible?

Yes, it is possible: the text after the <b> tag is in a text node. After grabbing the correct <b> tag, jsoup lets you select a specific node in the parent element with .parent().childNode(int index) ( https://jsoup.org/apidocs/org/jsoup/nodes/Node.html#childNode-int- ) and gives you the index of the targeted <b> element among its parent's child nodes with .siblingIndex(), so just get the childNode at that index incremented by 1.
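To see why the index is incremented by 1, it can help to list the parent's child nodes. The following is only a sketch: it assumes jsoup is on the classpath and uses a shortened, hypothetical version of the sample HTML with two <b> tags. The class name ChildNodeDump and the method describe are made up for this illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;

class ChildNodeDump {
    // Hypothetical, shortened stand-in for the page in the question.
    static final String HTML =
        "<div class=\"column two-third\"><div>"
        + "<b>First:</b> alpha<br>"
        + "<b>Second:</b> beta<br>"
        + "</div></div>";

    // Returns one line per child node: its sibling index and its node name.
    static String describe(Element parent) {
        StringBuilder sb = new StringBuilder();
        for (Node child : parent.childNodes()) {
            sb.append(child.siblingIndex()).append(' ')
              .append(child.nodeName()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        Element inner = doc.selectFirst("div.column.two-third > div");
        // Each <b> is followed immediately by a "#text" node, then a <br>.
        System.out.print(describe(inner));
    }
}
```

In this listing every <b> is directly followed by a #text node, which is the node that childNode(siblingIndex() + 1) picks up.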

Example Code

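// htmlDocument is the already parsed org.jsoup.nodes.Document
// Select every <b> inside the inner div that has no class or ID
Elements bTags = htmlDocument.select("div.column.two-third > div b");

// Guard against pages with fewer than nine <b> tags
if (bTags.size() > 8) {
    Element title = bTags.get(8); // zero-based, so this is <b> #8
    // The text node sits directly after the <b> among the parent's child nodes
    String text = title.parent().childNode(title.siblingIndex() + 1).toString();
    System.out.println(title.text() + "\n" + text);
}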

Output

TITLE I WANT :
random text
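A possible variation, not part of the answer above: Node.nextSibling() returns the node directly after the <b>, and checking that it is a TextNode avoids an exception when the <b> is the last child or is followed by another element. This is a minimal sketch assuming jsoup is on the classpath; the class name TextAfterBold, the method textAfterBold, and the shortened three-entry HTML are hypothetical, so position 2 stands in for #8.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.Elements;

class TextAfterBold {
    // Returns the trimmed text directly following the <b> at the given
    // zero-based position, or null if that <b> or the text is missing.
    static String textAfterBold(Document doc, int position) {
        Elements bTags = doc.select("div.column.two-third > div b");
        if (bTags.size() <= position) {
            return null;
        }
        // nextSibling() is null when the <b> is the last child node
        Node next = bTags.get(position).nextSibling();
        if (next instanceof TextNode) {
            return ((TextNode) next).text().trim();
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened sample with only three <b> tags
        String html = "<div class=\"column two-third\"><div style=\"color:#000000\">"
            + "<b>Title A:</b> text a<br>"
            + "<b>Title B:</b> text b<br>"
            + "<b>TITLE I WANT :</b> random text<br>"
            + "</div></div>";
        Document doc = Jsoup.parse(html);
        System.out.println(textAfterBold(doc, 2));
    }
}
```

Unlike toString(), TextNode.text() returns the node's text with whitespace normalized rather than its HTML form, which may be preferable when the text contains characters that would be escaped.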

 