Jsoup .select returns empty value but element does contain text

Question

I'm trying to get the text of the "link" elements in this XML: http://www.istana.gov.sg/latestupdate/rss.xml

I have written the following code to get the first article.

        URL = getResources().getString(R.string.istana_home_page_rss_xml);
        // URL = "http://www.istana.gov.sg/latestupdate/rss.xml";

        try {
            doc = Jsoup.connect(URL).ignoreContentType(true).get();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        // retrieve the link of the article
        links = doc.select("link");

        // retrieve the publish date of the article
        dates = doc.select("pubDate");

        // retrieve the title of the article
        titles = doc.select("title");

        String[] article1 = new String[3];
        article1[0] = links.get(1).text();
        article1[1] = titles.get(1).text();
        article1[2] = dates.get(0).text();

The title and date of the article come out fine, but the link is returned as "" (every "link" element returns ""). Each link tag contains a URL as its text. Does anyone know why it returns ""?

Answer 1

It looks like the default HTML parser treats <link> as a void element (in HTML, <link> never has content), so it automatically closes it as <link />, which means the content of this tag is empty.

To solve this problem, use the XML parser instead of the HTML parser. It applies no HTML-specific rules to tag names.

        // needs: import org.jsoup.parser.Parser;
        doc = Jsoup.connect(URL)
              .ignoreContentType(true)
              .parser(Parser.xmlParser()) // <-- add this: parse the response as XML, not HTML
              .get();
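
The point that an XML parser keeps the text inside <link> can be checked without jsoup or a network connection. The following is only a sketch that swaps in the JDK's built-in DOM parser (not jsoup), and it assumes the feed has the usual RSS shape: one channel-level <link> and <title>, followed by the items. The class name RssLinkDemo, the helper textOf, and the SAMPLE_RSS feed are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RssLinkDemo {
    // Made-up feed (an assumption, not the real Istana feed).
    static final String SAMPLE_RSS =
        "<rss version=\"2.0\"><channel>"
        + "<title>Sample feed</title>"
        + "<link>http://example.com/</link>"
        + "<item>"
        + "<title>First article</title>"
        + "<link>http://example.com/first</link>"
        + "<pubDate>Tue, 30 Dec 2014 15:00:00 +0800</pubDate>"
        + "</item>"
        + "</channel></rss>";

    // Returns the text of the index-th <tag> element, counted in document order.
    static String textOf(String xml, String tag, int index) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.item(index).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Index 1 skips the channel-level <link> and <title>, as in the question.
        System.out.println(textOf(SAMPLE_RSS, "link", 1));
        System.out.println(textOf(SAMPLE_RSS, "title", 1));
        System.out.println(textOf(SAMPLE_RSS, "pubDate", 0));
    }
}
```

Because an XML parser has no notion of HTML void elements, the <link> element keeps its text here, which is the same effect the Parser.xmlParser() call is meant to achieve in jsoup.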

1 answer

Answer 1: 3 ACCEPTED 2014-12-30 15:48:02
