
Self-closing tag in XML parsing using Java

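I am trying to parse an XML file to get news from an RSS feed, but empty, self-closing tags are not parsed correctly. Some of the items have no description.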

My XML looks like this:

<rss version="2.0">
  <channel>
    <title>Tata Group</title>
    <link>http://www.tata.com</link>
    <description>
    Tata is a rapidly growing business group based in India with significant international operations. The business operations of the Tata Group currently encompass seven business sectors: communications and information technology, engineering, materials, services, energy, consumer products and chemicals.
    </description>
    <copyright>Copyright (C) 2014 Tata Sons Ltd</copyright>
    <item>
      <title>Tata Power commissions waste water recovery plant at power house #6, Jamshedpur
      </title>
      <link>
        http://www.tata.com/rssread.aspx?artid=tiEeXsbwZ54=
      </link>
      <description>Jamshedpur: Tata Power, India's largest integrated power company, has adopted several innovative technological solutions to improve the plant processes at its generation faciliti...
      </description>
      <pubDate>22 Jul 2014 12:00:00 GMT</pubDate>
    </item>
    </channel>
    </rss>

Currently I can see every item that has a non-empty description tag, while every item with an empty, self-closing description tag is skipped.

try {
        XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        XmlPullParser xpp = factory.newPullParser();
        FileReader xmlReader = new FileReader(destination);
        xpp.setInput(xmlReader);
        int eventType = xpp.getEventType();
        String NodeValue;
        while (eventType != XmlPullParser.END_DOCUMENT) {
            switch (eventType) {
            case XmlPullParser.START_DOCUMENT:
                break;
            case XmlPullParser.START_TAG:
                NodeValue = xpp.getName();// Start of a Node
                if (NodeValue.equalsIgnoreCase("item")) {
                    flagItem = true;
                } else if (NodeValue.equalsIgnoreCase("title") && flagItem) {
                    eventType = xpp.next();
                    if (eventType == XmlPullParser.TEXT) {
                        message.setTitle(xpp.getText());
                    }
                } else if (NodeValue.equalsIgnoreCase("description/") && flagItem) {
                    message.setDescription("Description not available..");
                    Log.out(logFlag, logTag, "Reaching the critical point...........  self closing tag reached!!!");
                    flagItem = false;
                    list.add(message);
                    message = null;
                    message = new Message();

                } else if (NodeValue.equalsIgnoreCase("description") && flagItem) {
                    eventType = xpp.next();
                    if (eventType == XmlPullParser.TEXT) {
                        message.setDescription(xpp.getText());
                        flagItem = false;
                        list.add(message);
                        message = null;
                        message = new Message();
                    }
                }
                break;
            }
            eventType = xpp.next();
            Log.out(logFlag, logTag, "xml file downloaded : "+list);

case XmlPullParser.END_TAG:
    // ...
    break;

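For the general case, you will probably have to keep track of the "open" element. But since you are only interested in <description>, you can probably get away with a flag that is set when you see the START_TAG and cleared when you see the END_TAG. An empty element (such as <description/>) is reported as a START_TAG followed by an END_TAG.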
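The flag approach can be sketched as follows. This is a minimal illustration, not the asker's code: it uses the JDK's StAX pull parser (javax.xml.stream) in place of XmlPullParser so that it runs without Android or kXML, and StAX likewise reports an empty element as a start event followed by an end event. The class name, method name, and placeholder text are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: StAX stands in for XmlPullParser here.
final class DescriptionFlagSketch {

    // Assumed placeholder for items whose description is empty.
    static final String PLACEHOLDER = "Description not available";

    /** Returns one description per <item>, using the placeholder for empty ones. */
    static List<String> readDescriptions(String xml) {
        List<String> result = new ArrayList<>();
        boolean inItem = false;
        boolean inDescription = false;          // set on start tag, cleared on end tag
        StringBuilder text = new StringBuilder();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("item")) {
                        inItem = true;
                    } else if (name.equals("description") && inItem) {
                        inDescription = true;
                        text.setLength(0);      // start collecting text afresh
                    }
                } else if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    if (inDescription) {
                        text.append(reader.getText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("description") && inDescription) {
                        // Reached for both <description>..</description> and <description/>.
                        inDescription = false;
                        String value = text.toString().trim();
                        result.add(value.isEmpty() ? PLACEHOLDER : value);
                    } else if (name.equals("item")) {
                        inItem = false;
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<rss><channel>"
                + "<item><title>A</title><description>First</description></item>"
                + "<item><title>B</title><description/></item>"
                + "</channel></rss>";
        System.out.println(readDescriptions(xml));
    }
}
```

Because the description is recorded on the end event rather than on a text event, an item is no longer skipped when no text event occurs between the two tags. With XmlPullParser the same structure should apply, using START_TAG, TEXT, and END_TAG with getName() and getText().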

This is definitely incorrect, because a "/" is never part of the local name:

NodeValue.equalsIgnoreCase("description/")

Omit the "/" from the string.
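To see which names a pull parser actually reports for a self-closing element, one can trace the events. The sketch below is an assumption-laden illustration: it uses the JDK's StAX reader rather than XmlPullParser, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: records start and end events with the reported local name.
final class EventTraceSketch {

    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    events.add("START " + reader.getLocalName());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    events.add("END " + reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        // prints [START item, START description, END description, END item]
        System.out.println(trace("<item><description/></item>"));
    }
}
```

The reported name is "description" in both events; the "/" belongs to the tag syntax, not to the name, so a comparison against "description/" can never match.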

Later I also noticed the equalsIgnoreCase method call. Are you aware that XML is case-sensitive with respect to element names?
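A small sketch of why this matters: <Description> and <description> are two different elements, so a case-insensitive comparison would treat them as the same one. As before, this uses the JDK's StAX reader instead of XmlPullParser, and the class name, method name, and sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: counts start tags matching a name, exactly or ignoring case.
final class CaseSensitiveNameSketch {

    static int countStartTags(String xml, String wanted, boolean ignoreCase) {
        int count = 0;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    boolean match = ignoreCase
                            ? name.equalsIgnoreCase(wanted)
                            : name.equals(wanted);
                    if (match) {
                        count++;
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<item><Description>x</Description><description>y</description></item>";
        System.out.println(countStartTags(xml, "description", false)); // prints 1
        System.out.println(countStartTags(xml, "description", true));  // prints 2
    }
}
```

For an RSS feed, where the element names are lowercase, comparing with equals is the stricter and arguably safer choice.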
