
Mismatched tag exception when parsing XML

This has become a real pain in my backside.

The URL I'm trying to parse is http://torrentz.eu/feed_verifiedP?q=ubuntu

Here's a shortened version of the XML:

<?xml version="1.0"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
 <channel>
  <title>Torrentz - ubuntu</title>
  <link>http://torrentz.eu/verified?q=ubuntu</link>
  <description>ubuntu search</description>
  <language>en-us</language>
  <atom:link href="http://torrentz.eu/feed_verifiedP?q=ubuntu" rel="self" type="application/rss+xml" />
  <item>
     <title>ubuntu 11 10 desktop i386 iso</title>
     <link>http://torrentz.eu/8ac3731ad4b039c05393b5404afa6e7397810b41</link>
     <guid>http://torrentz.eu/8ac3731ad4b039c05393b5404afa6e7397810b41</guid>
     <pubDate>Thu, 13 Oct 2011 15:02:06 +0000</pubDate>
     <category>apps linux applications os software</category>
     <description>Size: 695 MB Seeds: 4,613 Peers: 161 Hash: 8ac3731ad4b039c05393b5404afa6e7397810b41</description>
  </item>
 </channel>
</rss>

My code:

    SAXParserFactory spf = SAXParserFactory.newInstance();
    SAXParser sp = spf.newSAXParser();
    XMLReader xr = sp.getXMLReader();

    //Get Torrents
    XMLTorrentsRSSHandler torrentsHandler = new XMLTorrentsRSSHandler();
    xr.setContentHandler(torrentsHandler);
    InputStream in = url.openStream();
    xr.parse(new InputSource(in));
    XMLTorrentsRSSParsedDataSet parsedTorrentsDataSet = torrentsHandler.getParsedData();

I keep getting this exception:

org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 53: mismatched tag

Why the flip does it torment me like this!?

EDIT: This method was working fine until today. Perhaps the website changed, but where is this flippin' mismatched tag?
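
One way to test the "website changed" theory: the parser reports the error at line 1, column 53, but the first line of the excerpt above, `<?xml version="1.0"?>`, is only 21 characters long. That suggests (it does not prove) that the parser received something other than the XML shown, for example an HTML error or redirect page. A minimal sketch for looking at the first bytes of the response before parsing follows. The class name `ResponsePreview`, the method `preview`, and the sample HTML body are made up for illustration and are not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ResponsePreview {
    /** Reads at most maxBytes from the stream and returns them decoded as UTF-8. */
    static String preview(InputStream in, int maxBytes) throws IOException {
        byte[] buffer = new byte[maxBytes];
        int total = 0;
        // read() may return fewer bytes than requested, so loop until the
        // buffer is full or the stream ends.
        while (total < maxBytes) {
            int n = in.read(buffer, total, maxBytes - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return new String(buffer, 0, total, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for url.openStream(); this HTML body is invented for the demo.
        byte[] fake = "<html><head><title>Moved</title></head><body></body></html>"
                .getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(fake)) {
            System.out.println(preview(in, 60));
        }
    }
}
```

In the real code this would be called on `url.openStream()` and the result logged. Because `preview` consumes the bytes it reads, a fresh stream has to be opened for the actual parse.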

Why do you have Harmony on your build path? Your code works fine with the built-in SAXParser in Oracle's JDK 7u3. Unless you have a reason to use the Harmony implementation, you should revert to the standard one.
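
To see which implementation `SAXParserFactory.newInstance()` actually resolves to at runtime, you can print the class names involved. This is only a sketch, and the class name `WhichParser` and the method `implementationNames` are made up for illustration. Names beginning with `org.apache.harmony`, like the `org.apache.harmony.xml.ExpatParser` in the stack trace, point to the Harmony implementation.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;

class WhichParser {
    /** Returns the concrete class names of the factory, the parser and the XMLReader. */
    static String[] implementationNames() throws ParserConfigurationException, SAXException {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser sp = spf.newSAXParser();
        return new String[] {
            spf.getClass().getName(),
            sp.getClass().getName(),
            sp.getXMLReader().getClass().getName()
        };
    }

    public static void main(String[] args) throws ParserConfigurationException, SAXException {
        // The output depends on the runtime and on what is on the classpath.
        for (String name : implementationNames()) {
            System.out.println(name);
        }
    }
}
```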

In test case form:

import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class Scratch {
    public static void main(String[] args) throws IOException, SAXException, ParserConfigurationException {
        final String document = "<?xml version=\"1.0\"?>\n" +
                "<rss version=\"2.0\" xmlns:atom=\"http://www.w3.org/2005/Atom\">\n" +
                " <channel>\n" +
                "  <title>Torrentz - ubuntu</title>\n" +
                "  <link>http://torrentz.eu/verified?q=ubuntu</link>\n" +
                "  <description>ubuntu search</description>\n" +
                "  <language>en-us</language>\n" +
                "  <atom:link href=\"http://torrentz.eu/feed_verifiedP?q=ubuntu\" rel=\"self\" type=\"application/rss+xml\" />\n" +
                "  <item>\n" +
                "     <title>ubuntu 11 10 desktop i386 iso</title>\n" +
                "     <link>http://torrentz.eu/8ac3731ad4b039c05393b5404afa6e7397810b41</link>\n" +
                "     <guid>http://torrentz.eu/8ac3731ad4b039c05393b5404afa6e7397810b41</guid>\n" +
                "     <pubDate>Thu, 13 Oct 2011 15:02:06 +0000</pubDate>\n" +
                "     <category>apps linux applications os software</category>\n" +
                "     <description>Size: 695 MB Seeds: 4,613 Peers: 161 Hash: 8ac3731ad4b039c05393b5404afa6e7397810b41</description>\n" +
                "  </item>\n" +
                " </channel>\n" +
                "</rss>\n";

        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser sp = spf.newSAXParser();
        XMLReader xr = sp.getXMLReader();

        ContentHandler torrentsHandler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
                System.out.printf("%s / %s / %s\n", uri, localName, qName);
            }
        };
        xr.setContentHandler(torrentsHandler);
        xr.parse(new InputSource(new StringReader(document)));
    }
}
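
If the error still shows up with the real response, the position of the offending tag can be read from the `SAXParseException` itself via `getLineNumber()` and `getColumnNumber()`. The sketch below assumes the parser reports well-formedness errors as `SAXParseException`, which the JDK's built-in parser does. The class name `LocateParseError`, the method `firstError`, and the two sample documents are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class LocateParseError {
    /** Returns "line:column: message" for the first fatal error, or null if the input parses. */
    static String firstError(String xml)
            throws IOException, SAXException, ParserConfigurationException {
        try {
            // DefaultHandler.fatalError() rethrows the exception, so it ends up in the catch below.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return e.getLineNumber() + ":" + e.getColumnNumber() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Well-formed input: no error to report.
        System.out.println(firstError("<a><b></b></a>"));
        // <b> is never closed, so the parser stops at </a>.
        System.out.println(firstError("<a><b></a>"));
    }
}
```

The exact message text and column differ between parser implementations, so only the line and column are worth comparing across runtimes.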

