
How to extract particular own text with jsoup

String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
Document doc = Jsoup.parse(html);
Element link = doc.select("a").first();

String text = doc.body().text(); // "An example link."
String linkHref = link.attr("href"); // "http://example.com/"
String linkText = link.text(); // "example"

String linkOuterH = link.outerHtml();
    // "<a href="http://example.com/"><b>example</b></a>"
String linkInnerH = link.html(); // "<b>example</b>"

In this example I want to extract just "An" and "link.", that is, the text before the <a> node ("An") and the text after it ("link.").

To get the text that belongs to the paragraph itself, without the text of its child elements, select it and ask for its own text:

Element p = doc.select("p").first();
System.out.println(p.ownText()); // "An link."

If you want the separate parts that make up that text, you can traverse the child nodes of that element and keep only those that are TextNode instances:

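// Direct children of <p>: TextNode "An ", Element <a>, TextNode " link."
for (Node node : p.childNodes()) {
    if (node instanceof TextNode) {
        // text() keeps the surrounding space: prints "An " and then " link."
        System.out.println(((TextNode) node).text());
    }
}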
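
Putting the pieces together, here is a minimal, self-contained sketch of both approaches. The class name OwnTextParts and the helpers ownTextParts, textBefore and textAfter are made up for illustration and are not part of jsoup; the sketch assumes the jsoup jar is on the classpath. The first helper collects the trimmed text nodes of an element, and the other two look only at the node directly next to the link, using Node.previousSibling() and Node.nextSibling().

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class OwnTextParts {

    // Hypothetical helper: trimmed text of each direct TextNode child,
    // skipping nodes that contain only whitespace.
    static List<String> ownTextParts(Element element) {
        List<String> parts = new ArrayList<>();
        for (Node node : element.childNodes()) {
            if (node instanceof TextNode) {
                String text = ((TextNode) node).text().trim();
                if (!text.isEmpty()) {
                    parts.add(text);
                }
            }
        }
        return parts;
    }

    // Hypothetical helper: text immediately before the element,
    // or "" if the previous sibling is missing or is not a text node.
    static String textBefore(Element element) {
        Node previous = element.previousSibling();
        return previous instanceof TextNode ? ((TextNode) previous).text().trim() : "";
    }

    // Hypothetical helper: text immediately after the element,
    // or "" if the next sibling is missing or is not a text node.
    static String textAfter(Element element) {
        Node next = element.nextSibling();
        return next instanceof TextNode ? ((TextNode) next).text().trim() : "";
    }

    public static void main(String[] args) {
        String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
        Document doc = Jsoup.parse(html);
        Element p = doc.select("p").first();
        Element link = doc.select("a").first();

        System.out.println(ownTextParts(p));   // [An, link.]
        System.out.println(textBefore(link));  // An
        System.out.println(textAfter(link));   // link.
    }
}
```

The sibling-based helpers are probably the closer fit when you already hold the link element and only care about the text around it. The child-node loop is more useful when the paragraph contains several elements and you want every piece of its own text.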

