Unable to parse a few XML nodes. What protection is applied?
I have an XML feed like this:
<item><title>Left hopes BJP surge will eat into Mamata’s votes </title><link>http://timesofindia.feedsportal.com/c/33039/f/533916/s/39439a29/sc/7/l/0Ltimesofindia0Bindiatimes0N0Cindia0CLeft0Ehopes0EBJP0Esurge0Ewill0Eeat0Einto0EMamatas0Evotes0Chome0Clok0Csabha0Celections0C20A140Cnews0CLeft0Ehopes0EBJP0Esurge0Ewill0Eeat0Einto0EMamatas0Evotes0Carticleshow0C336252890Bcms/story01.htm</link><description>At times sworn enemies can be of help to each other, albeit indirectly. In the current political winds of West Bengal, no one knows it better than the Left.<img width='1' height='1' src='http://timesofindia.feedsportal.com/c/33039/f/533916/s/39439a29/sc/7/mf.gif' border='0'/><br clear='all'/><br/><br/><a href="http://da.feedsportal.com/r/194480044196/u/409/f/533916/c/33039/s/39439a29/sc/7/rc/1/rc.htm" rel="nofollow"><img src="http://da.feedsportal.com/r/194480044196/u/409/f/533916/c/33039/s/39439a29/sc/7/rc/1/rc.img" border="0"/></a><br/><a href="http://da.feedsportal.com/r/194480044196/u/409/f/533916/c/33039/s/39439a29/sc/7/rc/2/rc.htm" rel="nofollow"><img src="http://da.feedsportal.com/r/194480044196/u/409/f/533916/c/33039/s/39439a29/sc/7/rc/2/rc.img" border="0"/></a><br/><a href="http://da.feedsportal.com/r/194480044196/u/409/f/533916/c/33039/s/39439a29/sc/7/rc/3/rc.htm" rel="nofollow"><img src="http://da.feedsportal.com/r/194480044196/u/409/f/533916/c/33039/s/39439a29/sc/7/rc/3/rc.img" border="0"/></a><br/><br/><a href="http://da.feedsportal.com/r/194480044196/u/409/f/533916/c/33039/s/39439a29/sc/7/a2.htm"><img src="http://da.feedsportal.com/r/194480044196/u/409/f/533916/c/33039/s/39439a29/sc/7/a2.img" border="0"/></a><img width="1" height="1" src="http://pi.feedsportal.com/r/194480044196/u/409/f/533916/c/33039/s/39439a29/sc/7/a2t.img" border="0"/></description><pubDate>Fri, 11 Apr 2014 19:26:07 GMT</pubDate><guid 
isPermaLink="false">http://timesofindia.indiatimes.com/india/Left-hopes-BJP-surge-will-eat-into-Mamatas-votes/home/lok/sabha/elections/2014/news/Left-hopes-BJP-surge-will-eat-into-Mamatas-votes/articleshow/33625289.cms</guid></item>
I am using the Jaunt API to scrape the news headlines and links from this feed.
// agent is a Jaunt UserAgent created earlier
agent.visit("http://timesofindia.feedsportal.com/c/33039/f/533916/index.rss");
Elements items = agent.doc.findEach("<item>");
for (Element item : items) {
    String headline = item.findFirst("<title>").getText();
    String link = item.findFirst("<link>").getText();  // this is the value that comes back null
    System.out.println("headline:" + headline + "\nlink:" + link + "\n");
}
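I now get all the headlines, but every link is null. The same thing happened when I scraped another newspaper's feed. Is there something special about the link node (an encoding, perhaps) that produces null, or am I doing something wrong?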
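I'm not sure, but findFirst may not handle <link> properly, because findFirst seems to be geared more toward HTML-style markup. In HTML, <link> is an empty element with no text content, which could explain the null. Does getFirst with an appropriate query work?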
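If the Jaunt lookup keeps returning null, one way to check whether the feed itself is the problem is to read it with a strict XML parser instead. The following is only a minimal sketch under stated assumptions: it uses the JDK's built-in DOM parser rather than Jaunt, the names RssLinkExtractor, extractItems, and childText are hypothetical, and the sample feed in main is made up for illustration rather than taken from the real site.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Jaunt): reads RSS text with the JDK DOM parser.
final class RssLinkExtractor {

    // Returns one {title, link} pair per <item> element found in the given XML text.
    static List<String[]> extractItems(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        List<String[]> result = new ArrayList<>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            result.add(new String[] { childText(item, "title"), childText(item, "link") });
        }
        return result;
    }

    // Trimmed text of the first descendant with the given tag name, or null if there is none.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample feed; a real run would pass in the downloaded RSS text instead.
        String sample = "<rss><channel><item>"
                + "<title>Sample headline</title>"
                + "<link>http://example.com/story01.htm</link>"
                + "</item></channel></rss>";
        for (String[] pair : extractItems(sample)) {
            System.out.println("headline:" + pair[0] + "\nlink:" + pair[1] + "\n");
        }
    }
}
```

Because an XML parser treats <link> like any other element, the URL is read as its text content. If this sketch returns the links for the real feed, that would suggest the null comes from how the HTML-oriented lookup interprets <link>, not from any protection on the feed.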