
JAVA/SAX - Loss of characters using XML Parser

I'm using a SAX parser to parse the XML of RSS feeds in an Android app, and sometimes the pubDate of an item is parsed incompletely (characters are missing).

Example:

Actual pubDate: Thu, 02 Apr 2015 12:23:41 +0000

pubDate after parsing: Thu,

Here is the code that I'm using in the parser handler:

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        if ("item".equalsIgnoreCase(localName)) {
            currentItem = new RssItem(url);
        } else if ("title".equalsIgnoreCase(localName)) {
            parsingTitle = true;
        } else if ("link".equalsIgnoreCase(localName)) {
            parsingLink = true;
        } else if ("pubDate".equalsIgnoreCase(localName)) {
            parsingDate = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("item".equalsIgnoreCase(localName)) {
            rssItems.add(currentItem);
            currentItem = null;
        } else if ("title".equalsIgnoreCase(localName)) {
            parsingTitle = false;
        } else if ("link".equalsIgnoreCase(localName)) {
            parsingLink = false;
        } else if ("pubDate".equalsIgnoreCase(localName)) {
            parsingDate = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (parsingTitle) {
            if (currentItem != null) {
                currentItem.setTitle(new String(ch, start, length));
                parsingTitle = false;
            }
        } else if (parsingLink) {
            if (currentItem != null) {
                currentItem.setLink(new String(ch, start, length));
                parsingLink = false;
            }
        } else if (parsingDate) {
            if (currentItem != null) {
                currentItem.setDate(new String(ch, start, length));
                parsingDate = false;
            }
        }
    }

The loss of characters is fairly random: it happens in different XML items every time I run the app.

You are assuming that there is exactly one characters() call per element. That is not a safe assumption: a SAX parser may deliver the text of a single element in several chunks. Build up your string over one or more calls to characters(), then apply it in endElement().
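As a rough illustration of that advice, here is a minimal sketch of a handler that buffers text in a StringBuilder and only assigns it in endElement(). The class name AccumulatingRssHandler, the nested Item class (a simplified stand-in for the question's RssItem), and the parse() helper are assumptions made for this example, not part of the original code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration; Item stands in for the question's RssItem.
class AccumulatingRssHandler extends DefaultHandler {
    static class Item {
        String title;
        String link;
        String date;
    }

    private final List<Item> items = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Item currentItem;

    // Without namespace awareness some parsers leave localName empty,
    // so fall back to qName in that case.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("item".equalsIgnoreCase(name(localName, qName))) {
            currentItem = new Item();
        }
        text.setLength(0); // start a fresh buffer for this element's text
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append instead of assigning.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String element = name(localName, qName);
        if (currentItem != null) {
            String value = text.toString().trim();
            if ("title".equalsIgnoreCase(element)) {
                currentItem.title = value;
            } else if ("link".equalsIgnoreCase(element)) {
                currentItem.link = value;
            } else if ("pubDate".equalsIgnoreCase(element)) {
                currentItem.date = value;
            } else if ("item".equalsIgnoreCase(element)) {
                items.add(currentItem);
                currentItem = null;
            }
        }
        text.setLength(0);
    }

    List<Item> getItems() {
        return items;
    }

    // Convenience helper (an assumption of this sketch): parse an RSS document held in a String.
    static List<Item> parse(String xml) throws Exception {
        AccumulatingRssHandler handler = new AccumulatingRssHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getItems();
    }
}
```

Because the buffer is cleared at every startElement(), this sketch only suits leaf elements such as title, link and pubDate. Text mixed with child elements would need a different buffering scheme.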

Or, better yet, use one of the many existing RSS parser libraries.


 