
Consuming data from WikiNews

I have been scouring the net, but I can't find any examples of consuming data from WikiNews. They have an RSS feed with links to individual stories as HTML, but I would like to get the data in a structured format such as XML.

By "structured format" I mean an XML file for each story, with a defined XML schema (XSD) file. See: http://www.w3schools.com/schema/schema_intro.asp

Has anyone written a program that consumes stories from WikiNews? Do they have a documented API?

I would like to use C# to collect selected stories and store them in SQL Server 2008.

The software they use (MediaWiki) has an API, but I am not sure whether WikiNews supports it.
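If WikiNews does expose that API, a request for one story might be built as in the sketch below. This is a minimal illustration, not a confirmed recipe: it assumes the standard MediaWiki endpoint lives at `https://en.wikinews.org/w/api.php`, and the helper name `build_story_query` is made up for this example. The parameters (`action=query`, `prop=revisions`, `rvprop=content`) are standard MediaWiki Action API parameters as I understand them; check the API documentation before relying on them. It is written in Python rather than C# only to keep it short.

```python
from urllib.parse import urlencode

# Assumption: Wikinews serves the standard MediaWiki Action API at this path.
API_ENDPOINT = "https://en.wikinews.org/w/api.php"


def build_story_query(title, fmt="xml"):
    """Return a URL that asks the API for the latest wikitext of one page.

    Hypothetical helper for illustration; it only builds the URL and
    performs no network request.
    """
    params = {
        "action": "query",      # generic query module
        "prop": "revisions",    # ask for page revisions
        "rvprop": "content",    # include the revision text
        "titles": title,        # page title, e.g. a story headline
        "format": fmt,          # response format: "xml" or "json"
    }
    return API_ENDPOINT + "?" + urlencode(params)


url = build_story_query("Lufthansa pilots begin strike")
print(url)
```

Note that a response from this query would carry the story as wikitext wrapped in XML, not as a per-story document validated against an XSD.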

Their feed: http://feeds.feedburner.com/WikinewsLatestNews

If you open that in your browser and view the source, you'll see that it is XML. The XML contains the title, description, a link, and so on. Only the description is in HTML.

Here is the beginning of the response:

<rss xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0"> 
         <channel> 
             <title>Wikinews</title> 
             <description>Wikinews RSS feed</description> 
             <language>en</language> 
             <link>http://en.wikinews.org</link> 
             <copyright>Creative Commons Attribution 2.5 (unless otherwise noted)</copyright> 
             <generator>Wikinews Fetch</generator> 
             <ttl>180</ttl> 
             <docs>http://none</docs> 
                 <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/WikinewsLatestNews" /><feedburner:info uri="wikinewslatestnews" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><creativeCommons:license>http://creativecommons.org/licenses/by/2.5/</creativeCommons:license><feedburner:browserFriendly>This is an XML content feed. It is intended to be viewed in a newsreader or syndicated to another site.</feedburner:browserFriendly><item> 
                     <title>Lufthansa pilots begin strike</title> 
                     <link>http://feedproxy.google.com/~r/WikinewsLatestNews/~3/1K2xloPGlmI/Lufthansa_pilots_begin_strike</link> 
                     <description>&lt;p&gt;&lt;a href="http://en.wikinews.org/w/index.php?title=File:LocationGermany.png&amp;filetimestamp=20060604120306" class="image" title="A map showing the location of Germany"&gt;&lt;img alt="A map showing the location of Germany" src="http://upload.wikimedia.org/wikipedia/commons/thumb/d/de/LocationGermany.png/196px-LocationGermany.png" width="196" height="90" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;b class="published"&gt;&lt;span id="publishDate" class="value-title" title="2010-02-22"&gt;&lt;/span&gt;Monday, February 22, 2010&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;The pilot's union of &lt;a href="http://en.wikinews.org/wiki/Germany" title="Germany" class="mw-redirect"&gt;German&lt;/a&gt; airline &lt;a href="http://en.wikipedia.org/wiki/Lufthansa" class="extiw" title="w:Lufthansa"&gt;Lufthansa&lt;/a&gt; have begun a four-day strike over pay and job security. Operations at subsidiary airlines &lt;a href="http://en.wikipedia.org/wiki/Lufthansa_Cargo" class="extiw" title="w:Lufthansa Cargo"&gt;Lufthansa Cargo&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Germanwings" class="extiw" title="w:Germanwings"&gt;Germanwings&lt;/a&gt; are also affected by the strike.&lt;/p&gt;
&lt;em&gt;&lt;a href='http://en.wikinews.org/wiki/Lufthansa_pilots_begin_strike'&gt;More...&lt;/a&gt;&lt;/em&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/WikinewsLatestNews?a=1K2xloPGlmI:9SJI0YV04-M:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/WikinewsLatestNews?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/WikinewsLatestNews?a=1K2xloPGlmI:9SJI0YV04-M:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/WikinewsLatestNews?d=7Q72WNTAKBA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/WikinewsLatestNews?a=1K2xloPGlmI:9SJI0YV04-M:YwkR-u9nhCs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/WikinewsLatestNews?d=YwkR-u9nhCs" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;</description> 
<guid isPermaLink="false">http://en.wikinews.org/wiki/Lufthansa_pilots_begin_strike</guid> 

<feedburner:origLink>http://en.wikinews.org/wiki/Lufthansa_pilots_begin_strike</feedburner:origLink></item>
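Since the feed is already XML, the fields above can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the sample (the description is replaced by a placeholder). The function name `parse_items` and the dictionary keys are made up for this illustration. The same steps should map onto an XML parser in C#.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns:feedburner attribute in the feed.
FEEDBURNER_NS = "http://rssnamespace.org/feedburner/ext/1.0"

# Shortened stand-in for the real response; the description is a placeholder.
SAMPLE_FEED = """<rss xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
  <channel>
    <title>Wikinews</title>
    <item>
      <title>Lufthansa pilots begin strike</title>
      <link>http://feedproxy.google.com/~r/WikinewsLatestNews/~3/1K2xloPGlmI/Lufthansa_pilots_begin_strike</link>
      <description>&lt;p&gt;Story summary as escaped HTML&lt;/p&gt;</description>
      <guid isPermaLink="false">http://en.wikinews.org/wiki/Lufthansa_pilots_begin_strike</guid>
      <feedburner:origLink>http://en.wikinews.org/wiki/Lufthansa_pilots_begin_strike</feedburner:origLink>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return one dict per <item>, with the fields shown in the sample."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),  # FeedBurner redirect link
            # Namespaced elements are addressed as {namespace-uri}localname.
            "orig_link": item.findtext("{%s}origLink" % FEEDBURNER_NS),
            # The parser unescapes &lt;p&gt; back into literal HTML markup.
            "description_html": item.findtext("description"),
        })
    return items


stories = parse_items(SAMPLE_FEED)
for story in stories:
    print(story["title"], story["orig_link"])
```

The `feedburner:origLink` value points at the story on en.wikinews.org itself, so it is probably the better key to store than the FeedBurner redirect in `link`.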

Your question is really unclear! But I guess you want to format the WikiNews feeds so that they are readable in a friendlier way (as if you were reading them on WikiNews itself). Am I correct?

If so, note that RSS is XML with a standard format that is not specific to WikiNews, and you can transform any RSS feed for display in, say, HTML with XSLT.

If you need the story itself, you can use the link given in the feed and display it in a WebBrowser control (if you are developing a Windows application).

Do you need anything beyond what I have described?
