
JSoup does not parse elements

I'm trying to get the text between the quotation marks (info.net, test.com, etc.) in the list below with jsoup and add each value to an ArrayList.

<?xml version="1.0" encoding="UTF-8"?>
<supported-filehosts>
<host url="info.net"/>
<host url="test.com"/>
<host url="app.to"/>
</supported-filehosts>

It seems easy, but I can't get it right with the code below:

Document doc = Jsoup.parse(HOSTS);
Elements links = doc.select("host[url]");

for (Element link : links) {
    hostUrls.add(link.text());
}

Could somebody have a look?

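link.text() returns the text between an element's opening and closing tags. The host elements are empty, so it returns an empty string. The values you want are stored in the url attribute, so you need to read that attribute from each tag: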

hostUrls.add(link.attr("url"));

Example code:

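import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Main {

    public static void main(String[] args) {

        final String HOSTS = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
            "<supported-filehosts>" +
            "<host url=\"info.net\"/>" +
            "<host url=\"test.com\"/>" +
            "<host url=\"app.to\"/>" +
            "</supported-filehosts>";
        List<String> hostUrls = new ArrayList<String>();

        Document doc = Jsoup.parse(HOSTS);
        // Select every <host> element that has a url attribute.
        Elements links = doc.select("host[url]");
        for (Element link : links) {
            // Read the attribute value; text() would be empty for these tags.
            hostUrls.add(link.attr("url"));
        }
        System.out.println(hostUrls.toString());
    }
}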

Output:

[info.net, test.com, app.to]
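The difference between an element's text and its attributes is not specific to jsoup. As a rough sketch, assuming you want to avoid the jsoup dependency, the same extraction can be done with the JDK's built-in DOM parser. The class name HostUrlsDom and the method extractUrls are illustrative names, not part of the original question or answer.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class for illustration only.
class HostUrlsDom {

    static final String HOSTS = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
        "<supported-filehosts>" +
        "<host url=\"info.net\"/>" +
        "<host url=\"test.com\"/>" +
        "<host url=\"app.to\"/>" +
        "</supported-filehosts>";

    // Parses the XML string and returns the url attribute of every <host> element.
    static List<String> extractUrls(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        NodeList hosts = doc.getElementsByTagName("host");
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < hosts.getLength(); i++) {
            Element host = (Element) hosts.item(i);
            // getTextContent() would be empty here, because <host/> has no child text;
            // the value lives in the attribute.
            urls.add(host.getAttribute("url"));
        }
        return urls;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extractUrls(HOSTS)); // prints [info.net, test.com, app.to]
    }
}
```

If you stay with jsoup, note that Jsoup.parse(HOSTS) uses the HTML parser by default. jsoup also provides Parser.xmlParser(), which can be passed to Jsoup.parse(HOSTS, "", Parser.xmlParser()) when the input is XML rather than HTML. Either way, the value has to be read with attr("url").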
