
Reliable Parsing for RSS/XML feed in Java/Android

I am using XmlPullParser in Android. I am able to get all the information I want from multiple feeds. The issue is figuring out how to reliably parse the descriptions from the feeds. The description tags from two of the feeds are as follows:

'The "malnourished" singer says her mom is trying to help her eat better <img src="http://feeds.feedburner.com/~r/people/headlines/~4/of8YOoAtLaA" height="1" width="1"/>'

AND

'<img src="http://images.eonline.com/resize/66/66/eol_images/Entire_Site/2011515//300.garfield.lc.061511.jpg" height="66" width="66" border="0" alt="Andrew Gardield, Garrett Hedlund, Kate Mara" align="left" hspace="5" /> After the huge hullabaloo he caused by hitting the town with his On the Road costar Kristen Stewart, cutie-pie Garrett Hedlund apparently decided to keep a low profile in Hawaii with a less ...<br clear="all" /> <p><a href="http://feedads.g.doubleclick.net/~at/kAgHF8uSo-kBC708djx7vWq7S5Y/0/da"><img src="http://feedads.g.doubleclick.net/~at/kAgHF8uSo-kBC708djx7vWq7S5Y/0/di" border="0" ismap="true"></img></a><br/><a href="http://feedads.g.doubleclick.net/~at/kAgHF8uSo-kBC708djx7vWq7S5Y/1/da"><img src="http://feedads.g.doubleclick.net/~at/kAgHF8uSo-kBC708djx7vWq7S5Y/1/di" border="0" ismap="true"></img></a></p><div class="feedflare"><a href="http://feeds.eonline.com/~ff/eonline/topstories?a=oSTZWu5LPBA:XlROC-V1kVA:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/eonline/topstories?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.eonline.com/~ff/eonline/topstories?a=oSTZWu5LPBA:XlROC-V1kVA:7Q72WNTAKBA"><img src="http://feeds.feedburner.com/~ff/eonline/topstories?d=7Q72WNTAKBA" border="0"></img></a> <a href="http://feeds.eonline.com/~ff/eonline/topstories?a=oSTZWu5LPBA:XlROC-V1kVA:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/eonline/topstories?i=oSTZWu5LPBA:XlROC-V1kVA:V_sGLiPBpWU" border="0"></img></a> <a href="http://feeds.eonline.com/~ff/eonline/topstories?a=oSTZWu5LPBA:XlROC-V1kVA:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/eonline/topstories?d=qj6IDK7rITs" border="0"></img></a></div><img src="http://feeds.feedburner.com/~r/eonline/topstories/~4/oSTZWu5LPBA" height="1" width="1"/>'

As you can see, the two are quite different. I have bolded the information that I want. There is no CDATA in the description, so that option is not viable. I can parse the information I want out of each of them individually, but I would like an approach that works for almost all situations. I am not sure that this is possible, but I have seen a number of RSS readers and podcast readers that have managed to do this relatively successfully. Any suggestions?

That's HTML embedded in the description. To show it correctly, you could use a WebView.
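As a rough sketch of the WebView route: the description is only an HTML fragment, so it can help to wrap it in a minimal document before handing it to the view. The class name `DescriptionHtml`, the method `wrapFragment`, and the inline CSS are illustrative assumptions, not something from the question or from any feed library.

```java
// Hypothetical helper: wraps a feed description (an HTML fragment)
// in a minimal HTML document suitable for a browser-like view.
final class DescriptionHtml {
    private DescriptionHtml() {}

    static String wrapFragment(String fragment) {
        // Treat a missing description as an empty body.
        String body = (fragment == null) ? "" : fragment;
        return "<!DOCTYPE html><html><head><meta charset=\"utf-8\">"
                + "<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">"
                // Assumed styling: keep wide feed images inside the view.
                + "<style>img{max-width:100%;height:auto}</style>"
                + "</head><body>" + body + "</body></html>";
    }

    public static void main(String[] args) {
        // Placeholder fragment, not taken from a real feed.
        System.out.println(wrapFragment("Some text <br/> more text"));
    }
}
```

On Android the resulting string could then be passed to something like `webView.loadDataWithBaseURL(null, html, "text/html", "UTF-8", null)`. Note that the 1x1 tracking images and ad links in the feed would still be loaded and shown this way.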

If you really just want to get the text description, you should look at the solutions proposed here . 如果您真的只想获取文字说明,则应查看此处提出的解决方案。
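For the text-only case, one simple approach (not necessarily what the linked solutions propose) is to strip the tags, decode a few common entities, and collapse whitespace. The sketch below is plain Java with a hypothetical `DescriptionText.toPlainText` helper. It is regex-based, so it assumes reasonably well-formed markup and would mishandle a `>` inside an attribute value. On Android, `Html.fromHtml(description).toString()` or an HTML parser such as jsoup would likely be more robust.

```java
import java.util.regex.Pattern;

// Hypothetical helper: reduces an HTML description to plain text.
final class DescriptionText {
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private DescriptionText() {}

    static String toPlainText(String html) {
        if (html == null) {
            return "";
        }
        // Replace every tag with a space so adjacent words do not merge.
        String text = TAG.matcher(html).replaceAll(" ");
        // Decode a handful of common entities (far from a complete list).
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&#39;", "'")
                   .replace("&amp;", "&"); // last, so "&amp;lt;" is not decoded twice
        // Collapse runs of whitespace left behind by removed tags.
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String first = "The \"malnourished\" singer says her mom is trying to help her eat better "
                + "<img src=\"http://feeds.feedburner.com/~r/people/headlines/~4/of8YOoAtLaA\" height=\"1\" width=\"1\"/>";
        System.out.println(toPlainText(first));
    }
}
```

Applied to the two descriptions in the question, this keeps only the sentence text, since the images, ad links, and feedflare block consist entirely of tags with no text content.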
