I'm trying to get the data that follows <b> #8 (the random text).
Right now I have this code, which gives me the TITLE that I need:
Document doc = Jsoup.connect("").get();
Elements title = doc.select("div.column.two-third");
Element k = title.select("b").get(8);
but I also need the text. Is it possible?
Sample of the website's code I'm trying to get data from:
<div class="column two-third">
<div style="color:#000000">
<b>Title I dont want:</b> random text // "b" #0 </br>
<b>Title i dont want</b> random text // "b" #1 <br>
<b>Title i dont want:</b> random text // "b" #7 <br>
**<b>TITLE I WANT :</b> random text // "b" #8 <br>**
<div align="justify"> <br> <br> <b style="color:#000000">text i dont want</b><br>
As you can see, the text that I want is in a div without a class or ID, and the text itself is not wrapped in any tag (such as "p"). I just need #8. Is it possible?
Yes, it is possible: the text after the <b> tag is in a text node. After grabbing the correct <b> element, jsoup lets you select a specific node of the parent element with .parent().childNode(int index) ( https://jsoup.org/apidocs/org/jsoup/nodes/Node.html#childNode-int- ), and gives you the index of the targeted <b> element among its parent's child nodes with .siblingIndex(). So just get the child node at that index incremented by 1.
Example Code
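// htmlDocument is the already parsed org.jsoup.nodes.Document
Elements bTags = htmlDocument.select("div.column.two-third > div b");
if (bTags.size() > 8) {
    Element title = bTags.get(8);
    // The node right after the <b> in its parent's child list is the text node
    String text = title.parent().childNode(title.siblingIndex() + 1).toString();
    System.out.println(title.text() + "\n" + text);
}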
Output
TITLE I WANT :
random text
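A variant of the same idea, as a minimal sketch: instead of computing the index by hand, it asks the <b> element for its next sibling and checks that it really is a text node before reading it. The class name TextAfterBold, the helper textAfter, and the inline HTML string are made up for illustration (the real page has more <b> tags, so the index there would be 8 rather than 2); nextSibling() and TextNode.text() are standard jsoup calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.Elements;

public class TextAfterBold {

    // Hypothetical helper: returns the untagged text that directly follows
    // the given element, or an empty string if the next node is not text.
    static String textAfter(Element b) {
        Node next = b.nextSibling();
        return (next instanceof TextNode) ? ((TextNode) next).text().trim() : "";
    }

    public static void main(String[] args) {
        // Assumed, shortened stand-in for the page from the question.
        String html = "<div class=\"column two-third\"><div style=\"color:#000000\">"
                + "<b>Title A:</b> first text<br>"
                + "<b>Title B:</b> second text<br>"
                + "<b>TITLE I WANT :</b> random text<br>"
                + "</div></div>";

        Document doc = Jsoup.parse(html);
        Elements bTags = doc.select("div.column.two-third > div b");

        int wanted = 2; // index of the <b> in this shortened sample
        if (bTags.size() > wanted) {
            Element title = bTags.get(wanted);
            System.out.println(title.text());
            System.out.println(textAfter(title));
        }
    }
}
```

The instanceof check avoids an exception when the <b> is the last child or is followed directly by another element, and TextNode.text() returns the decoded text rather than the node's HTML representation.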