How to extract the next tag elements in jsoup

I am using jsoup to parse an HTML document. I need the text of the P tag that comes immediately after each SPAN tag whose id attribute contains "midArticle".

I am trying the following code:

    Elements spanList = body.select("span");
    if (spanList != null) {
        for (Element element1 : spanList) {
            if (element1.attr("id").contains("midArticle")) {
                Element element = element1.after("<p>");  // This line is wrong 
                if (element != null) {
                    String text = element.text();
                    if (text != null && !text.isEmpty()) {
                        out.println(text);
                    }
                }
            }
        }
    }

Sample HTML:

<span id="midArticle_9"></span>
<p>"The Director owes it to the American people to immediately provide the full details of what he is now examining," Podesta said in a statement. "We are confident this will not produce any conclusions different from the one the FBI reached in July." </p>
<span id="midArticle_10"></span>
<p>Clinton has repeatedly apologized for using the private email server in her home instead of a government email account for her work as secretary of state from 2009 to 2013. She has said she did not knowingly send or receive classified information.</p>

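I hope this resolves your issue. Element.after(String) inserts new HTML after the element and returns that same element, so it cannot be used to read the tag that follows. Use nextElementSibling() instead: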

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public static void main(String[] args) {
        String html = "<span id=\"midArticle_9\"></span><p>\"The Director owes it to the American people to immediately provide the full details of what he is now examining,\" Podesta said in a statement. \"We are confident this will not produce any conclusions different from the one the FBI reached in July.\" </p><span id=\"midArticle_10\"></span><p>Clinton has repeatedly apologized for using the private email server in her home instead of a government email account for her work as secretary of state from 2009 to 2013. She has said she did not knowingly send or receive classified information.</p>";
        Document document = Jsoup.parse(html);
        // Only consider spans whose id contains "midArticle", as in the question
        for (Element span : document.select("span[id*=midArticle]")) {
            // nextElementSibling() returns null when the span is the last child
            Element next = span.nextElementSibling();
            if (next != null && "p".equals(next.tagName())) {
                System.out.println(next.text());
            }
        }
}
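
As an alternative to walking the siblings by hand, the same lookup can probably be written as a single CSS selector, since jsoup supports the adjacent sibling combinator (E + F). The following is only a minimal sketch: the class name NextParagraphDemo, the helper paragraphsAfterMidArticleSpans, and the shortened sample HTML are made up for illustration, and it assumes a jsoup version recent enough to provide Elements.eachText().

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class NextParagraphDemo {

    // Hypothetical helper (not part of jsoup): returns the text of every <p>
    // that immediately follows a <span> whose id contains "midArticle".
    static List<String> paragraphsAfterMidArticleSpans(String html) {
        Document document = Jsoup.parse(html);
        // "E + F" matches an F element immediately preceded by sibling element E.
        // eachText() skips matched elements that have no text.
        return document.select("span[id*=midArticle] + p").eachText();
    }

    public static void main(String[] args) {
        // Shortened, made-up sample in the same shape as the HTML in the question.
        String html = "<span id=\"midArticle_9\"></span><p>First paragraph.</p>"
                + "<span id=\"other\"></span><p>Skipped paragraph.</p>"
                + "<span id=\"midArticle_10\"></span><p>Second paragraph.</p>";
        for (String text : paragraphsAfterMidArticleSpans(html)) {
            System.out.println(text);
        }
    }
}
```

With this sample input the loop should print "First paragraph." and "Second paragraph."; the paragraph after the span with id "other" is not matched. The selector form avoids the explicit null check, at the cost of being less obvious to readers who do not know CSS combinators.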

