How to get elements by tag using JSoup? - Java

How do I get elements by tag using JSoup ( http://jsoup.org/ )?

I have the following input and need the following output, but I am not getting the text inside the <source>...</source> tags:

[in:]

<html>
  <something>
    <source>foo bar bar</source>
  </something>
  <source>foo foo bar</source>
</html>

[desired out:]

foo bar bar
foo foo bar

I have tried this:

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class HelloJsoup {
    public static void main(String[] args) throws IOException {

        String br = "<html><source>foo bar bar</source></html>";
        Document doc = Jsoup.parse(br);
        //System.out.println(doc);
        for (Element sentence : doc.getElementsByTag("source"))
            System.out.print(sentence);

    }
}

But it outputs:

<source></source>

The default HTML parser treats <source> as an empty (void) element, so the text does not end up inside it. Use Parser.xmlParser() instead, which you can pass to the parse() method:

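// Requires: import org.jsoup.parser.Parser;
String br = "<html><source>foo bar bar</source></html>";
// Parse as XML so that <source> is handled as an ordinary element with content
Document doc = Jsoup.parse(br, "", Parser.xmlParser());

for (Element sentence : doc.getElementsByTag("source"))
    System.out.println(sentence.text());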

More on this in the docs: http://jsoup.org/apidocs/org/jsoup/parser/Parser.html#xmlParser()
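
For completeness, here is a minimal, self-contained sketch of the same approach applied to the full input from the question. The class name SourceTextExtractor and the helper sourceTexts are made up for illustration, and the sample string assumes the second <something> in the question's input was meant to be a closing tag. It needs only the jsoup jar on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class SourceTextExtractor {

    // Hypothetical helper: collects the text of every <source> element, in document order.
    public static List<String> sourceTexts(String markup) {
        // Empty base URI plus the XML parser, so tag names carry no HTML-specific meaning.
        Document doc = Jsoup.parse(markup, "", Parser.xmlParser());
        List<String> texts = new ArrayList<>();
        for (Element source : doc.getElementsByTag("source")) {
            texts.add(source.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Sample input from the question, with the <something> element closed (assumption).
        String input =
              "<html>"
            + "<something><source>foo bar bar</source></something>"
            + "<source>foo foo bar</source>"
            + "</html>";
        for (String text : sourceTexts(input)) {
            System.out.println(text);
        }
    }
}
```

Run against that input, this should print the two lines from [desired out:]. Returning a list rather than printing inside the loop is just a choice that makes the result easier to reuse or check.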
