
Handling CDATA when parsing an RSS feed with Java

I followed Vogella's tutorial for parsing an RSS feed using Java. The code is straightforward and I was able to get it to work. The problem is that some of the nodes I'm parsing contain CDATA, and I'm getting empty strings for them (because of the way the parser is implemented).

In short, my question is, is there an easy way to modify this implementation to handle CDATA?

Vogella RSS Parser

The parser does handle CDATA. Unfortunately, it returns the value after reading only the first chunk of character data, so in cases like this:

<description>
  <![CDATA[
  Lorem ipsum..
  ]]>
</description>

it does not read through to the end of the element. You should change the RSSFeedParser.getCharacterData method to something like this:

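private String getCharacterData(XMLEvent event, XMLEventReader eventReader)
        throws XMLStreamException {
    StringBuilder result = new StringBuilder();
    // Keep reading until the closing tag of the current element,
    // instead of stopping after the first event.
    while (!(event = eventReader.nextEvent()).isEndElement()) {
        // Plain text, whitespace, and CDATA sections all arrive as Characters events.
        if (event instanceof Characters) {
            result.append(event.asCharacters().getData());
        }
    }
    return result.toString();
}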

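Now the content of the description tag will be the CDATA text together with its surrounding whitespace, roughly "\nLorem ipsum..\n". Call trim() on the result if you do not want the whitespace.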
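To try this outside the tutorial's classes, here is a minimal self-contained sketch that uses the same loop. It assumes the StAX implementation bundled with the JDK; the class name CdataDemo and the helper readDescription are made up for this illustration and are not part of Vogella's code.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.XMLEvent;

public class CdataDemo {

    // Collects every Characters event (plain text, whitespace, CDATA)
    // until the next end tag is reached.
    public static String getCharacterData(XMLEventReader eventReader)
            throws XMLStreamException {
        StringBuilder result = new StringBuilder();
        XMLEvent event;
        while (eventReader.hasNext()
                && !(event = eventReader.nextEvent()).isEndElement()) {
            if (event instanceof Characters) {
                result.append(event.asCharacters().getData());
            }
        }
        return result.toString();
    }

    // Hypothetical helper: returns the trimmed text of the first
    // <description> element, or "" if there is none.
    public static String readDescription(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && "description".equals(
                            event.asStartElement().getName().getLocalPart())) {
                return getCharacterData(reader).trim();
            }
        }
        return "";
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<description>\n  <![CDATA[\n  Lorem ipsum..\n  ]]>\n</description>";
        System.out.println("[" + readDescription(xml) + "]"); // prints [Lorem ipsum..]
    }
}
```

Like the method above, the loop stops at the first end tag it meets, so it is only suitable for elements that contain text and no nested child elements, which is the usual case for RSS fields such as title and description.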
