Mismatched tag exception when parsing XML
This has become a real pain in my backside.
The URL I am trying to parse is http://torrentz.eu/feed_verifiedP?q=ubuntu
Here is a shortened version of the XML:
<?xml version="1.0"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Torrentz - ubuntu</title>
<link>http://torrentz.eu/verified?q=ubuntu</link>
<description>ubuntu search</description>
<language>en-us</language>
<atom:link href="http://torrentz.eu/feed_verifiedP?q=ubuntu" rel="self" type="application/rss+xml" />
<item>
<title>ubuntu 11 10 desktop i386 iso</title>
<link>http://torrentz.eu/8ac3731ad4b039c05393b5404afa6e7397810b41</link>
<guid>http://torrentz.eu/8ac3731ad4b039c05393b5404afa6e7397810b41</guid>
<pubDate>Thu, 13 Oct 2011 15:02:06 +0000</pubDate>
<category>apps linux applications os software</category>
<description>Size: 695 MB Seeds: 4,613 Peers: 161 Hash: 8ac3731ad4b039c05393b5404afa6e7397810b41</description>
</item>
</channel>
</rss>
My code:
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
//Get Torrents
XMLTorrentsRSSHandler torrentsHandler = new XMLTorrentsRSSHandler();
xr.setContentHandler(torrentsHandler);
InputStream in = url.openStream();
xr.parse(new InputSource(in));
XMLTorrentsRSSParsedDataSet parsedTorrentsDataSet = torrentsHandler.getParsedData();
I keep getting this exception:
org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 53: mismatched tag
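Why on earth is this tormenting me like this!?
Edit: This approach worked fine until today. Maybe the site changed, but where is the mismatched tag it is complaining about?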
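Why is the Harmony implementation on your build path? Your code works fine with the built-in SAXParser in Oracle JDK7u3. If there is no reason to use the Harmony implementation, you should revert to the standard one.
Here is your code as a self-contained test case: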
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
class Scratch {
    public static void main(String[] args) throws IOException, SAXException, ParserConfigurationException {
        final String document = "<?xml version=\"1.0\"?>\n" +
            "<rss version=\"2.0\" xmlns:atom=\"http://www.w3.org/2005/Atom\">\n" +
            " <channel>\n" +
            " <title>Torrentz - ubuntu</title>\n" +
            " <link>http://torrentz.eu/verified?q=ubuntu</link>\n" +
            " <description>ubuntu search</description>\n" +
            " <language>en-us</language>\n" +
            " <atom:link href=\"http://torrentz.eu/feed_verifiedP?q=ubuntu\" rel=\"self\" type=\"application/rss+xml\" />\n" +
            " <item>\n" +
            " <title>ubuntu 11 10 desktop i386 iso</title>\n" +
            " <link>http://torrentz.eu/8ac3731ad4b039c05393b5404afa6e7397810b41</link>\n" +
            " <guid>http://torrentz.eu/8ac3731ad4b039c05393b5404afa6e7397810b41</guid>\n" +
            " <pubDate>Thu, 13 Oct 2011 15:02:06 +0000</pubDate>\n" +
            " <category>apps linux applications os software</category>\n" +
            " <description>Size: 695 MB Seeds: 4,613 Peers: 161 Hash: 8ac3731ad4b039c05393b5404afa6e7397810b41</description>\n" +
            " </item>\n" +
            " </channel>\n" +
            "</rss>\n";
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser sp = spf.newSAXParser();
        XMLReader xr = sp.getXMLReader();
        ContentHandler torrentsHandler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
                System.out.printf("%s / %s / %s\n", uri, localName, qName);
            }
        };
        xr.setContentHandler(torrentsHandler);
        xr.parse(new InputSource(new StringReader(document)));
    }
}
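To check which SAX implementation is actually being picked up at runtime, a minimal sketch along these lines may help. The class name WhichParser and the helper describe() are made up for illustration; only the standard javax.xml.parsers calls already used above are assumed.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

class WhichParser {
    // Hypothetical helper: returns the concrete factory and reader class names
    // that JAXP resolved on this runtime, separated by " / ".
    static String describe() throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser sp = spf.newSAXParser();
        return spf.getClass().getName() + " / " + sp.getXMLReader().getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        // Print the implementation classes so they can be compared across environments.
        System.out.println(describe());
    }
}
```

On a stock Oracle JDK the names typically sit under com.sun.org.apache.xerces.internal. If they sit under org.apache.harmony.xml instead, the Harmony (Expat-based) parser is the one in use, which may simply be the platform default on that runtime rather than something added to the build path.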