Get text after an HTML tag using Jsoup and display the results

I am trying to get the text from this HTML document:

<p>
   <b>1</b>First Text
   <b>2</b><br>Second Text
   <b>3</b>Third Text
   .
   .
   .
   .
</p>

Line 3, where a <br> sits between the <b> and its text, is where things get stuck.

I tried the code below, but it produces an error.

Elements elements = doc.body().select("p").select("b");
for (int i = 0; i < elements.size(); i++) {
    Element val = elements.get(i);

    if ((val.nextSibling().toString().trim()).equals(""))
        // Error: toString() returns a String, and String has no select() method
        System.out.println(val.nextSibling().toString().select("br").first().text() + "\n");
    else
        System.out.println(val.nextSibling().toString() + "\n");
}

The question is unclear, but it seems (from the title) that you only want the text that lies outside the <b> tags, such as the text after the <br>. For this you can use ownText():

Elements elements = doc.select("p");
for(Element p: elements) {
    System.out.println(p.ownText()); // Prints text that is in <p> but not in <b>
}
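
Note that ownText() returns the whole paragraph's own text as a single string. If you need one piece of text per <b> instead, one option is to walk the siblings that follow each <b> and collect only the text nodes, so an intervening <br> is skipped. This is a minimal sketch, assuming jsoup is on the classpath; the class name TextAfterBold and the helper textAfterEachBold are hypothetical names made up for illustration, and the sample HTML is a shortened version of the one in the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

import java.util.ArrayList;
import java.util.List;

public class TextAfterBold {

    // Hypothetical helper: for every <b> that is a direct child of a <p>,
    // gather the text nodes that follow it until the next <b> (or the end of the <p>).
    static List<String> textAfterEachBold(String html) {
        Document doc = Jsoup.parse(html);
        List<String> results = new ArrayList<>();

        for (Element b : doc.select("p > b")) {
            StringBuilder sb = new StringBuilder();
            for (Node n = b.nextSibling(); n != null; n = n.nextSibling()) {
                // Stop when the next <b> starts
                if (n instanceof Element && ((Element) n).tagName().equals("b")) {
                    break;
                }
                // Keep only text; other elements such as <br> are skipped
                if (n instanceof TextNode) {
                    sb.append(((TextNode) n).text());
                }
            }
            results.add(sb.toString().trim());
        }
        return results;
    }

    public static void main(String[] args) {
        String html = "<p><b>1</b>First Text<b>2</b><br>Second Text<b>3</b>Third Text</p>";
        for (String text : textAfterEachBold(html)) {
            System.out.println(text);
        }
    }
}
```

The instanceof checks avoid calling toString() on a node, which is what led to the error in the question: the <br> is an Element, so it is simply passed over and the text node after it is picked up.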
