I am trying to get the text from the following HTML document:
<p>
<b>1</b>First Text
<b>2</b><br>Second Text
<b>3</b>Third Text
.
.
.
.
</p>
Line 3 of the HTML is where things get stuck.
I tried the code below, but it produces an error.
Elements elements = doc.body().select("p").select("b");
for (int i = 0; i < elements.size(); i++)
{
    Element val = elements.get(i);
    if ((val.nextSibling().toString().trim()).equals(""))
        System.out.println(val.nextSibling().toString().select("br").first().text() + "\n");
    else
        System.out.println(val.nextSibling().toString() + "\n");
}
The question is unclear, but it seems (from the title) that you only want the text outside the <b> tags, after the <br>. For this you can use ownText():
Elements elements = doc.select("p");
for (Element p : elements) {
    System.out.println(p.ownText()); // Prints text that is in <p> but not in <b>
}
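Note that ownText() returns all of the paragraph's own text as a single string. If you need each piece separately, one per <b>, walking the sibling nodes is another option. In the code from the question, val.nextSibling().toString() returns a plain String, which has no select() method, so that line cannot compile; also, for the second <b> the next sibling is the <br> element itself, so its string form is "<br>" rather than an empty string. Below is a minimal sketch, assuming each <b> is a direct child of the <p> and that the wanted text is the first non-blank text node before the next <b>. The class name TextAfterBold and the method textAfterEachBold are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class TextAfterBold {

    // Hypothetical helper: for each <b> that is a direct child of a <p>,
    // collect the first non-blank text node that follows it, skipping
    // intervening elements such as <br>.
    static List<String> textAfterEachBold(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element b : doc.select("p > b")) {
            Node next = b.nextSibling();
            // Stop at the end of the parent or when the next <b> is reached.
            while (next != null && !"b".equals(next.nodeName())) {
                if (next instanceof TextNode && !((TextNode) next).isBlank()) {
                    result.add(((TextNode) next).text().trim());
                    break;
                }
                next = next.nextSibling();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input modelled on the HTML in the question (assumed, simplified).
        String html = "<p><b>1</b>First Text<b>2</b><br>Second Text<b>3</b>Third Text</p>";
        for (String text : textAfterEachBold(html)) {
            System.out.println(text);
        }
    }
}
```

With the sample string in main, this should print the three text fragments on separate lines. A <b> that has no text before the next <b> is simply skipped, so adjust the loop if you need a placeholder for those.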