
Linq to XML querying an XDoc

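I want to be able to access the text inside a span with a given class while still being able to get information outside that span. For example:

Here is a sample of my XML document: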

    <item><title>Operations Applications - MDI Diagnostics</title><link>http://confidential-link.com</link>
<description><![CDATA[<div style="margin-top:5px"><link rel="stylesheet" type="text/css" href="confidential" />
<span class="srch-Icon"><a href="confidential" title="Operations Applications - MDI Diagnostics">
<img src="confidential" alt="Web Page" border="0" /></a></span>
<span class="psrch-Description"> THE INFORMATION I WANT</span>
<p class="srch-Metadata"><span class="srch-URL">
<a href="confidential" title="Operations Applications - MDI Diagnostics">confidential link</a>
        -
        66KB
          </span></p></div>]]></description>
<author>Bob Smith
</author><pubDate>Mon, 10 Mar 2014 18:53:49 GMT</pubDate><search:dotfileextension>.ASPX</search:dotfileextension><search:size>68076</search:size>
<search:hithighlightedsummary> SIMILAR TO THE INFORMATION I WANT, COULD BE OPTION 2  </search:hithighlightedsummary>
</item>

This is what I have so far:

    var feeds = from feed in xdoc.Descendants("item")
                select new RSSS
                {
                    Site = "TOPS",
                    URL = url,
                    Title = feed.Element("title").Value,
                    Link = feed.Element("link").Value,
                    Description = feed.Element("description").Value
                };

As expected, this returns the following for Description:

<div style="margin-top:5px"><link rel="stylesheet" type="text/css" href="confidential" />
<span class="srch-Icon"><a href="confidential" title="Operations Applications - MDI Diagnostics">
<img src="confidential" alt="Web Page" border="0" /></a></span>
<span class="psrch-Description"> THE INFORMATION I WANT</span>
<p class="srch-Metadata"><span class="srch-URL">
<a href="confidential" title="Operations Applications - MDI Diagnostics">confidential link</a>
        -
        66KB
          </span></p></div>

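So how do I specifically access the text inside <span class="psrch-Description"> while still being able to access the link, title, and so on?

To be clear, I am not looking for something like this: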

    var feeds = from feed in xDoc.Descendants("Show")
                where (string)feed.Attribute("Code") == "456"
                select new { EventDate = feed.Attribute("Date").Value };

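That does not let me get the other information.

If the content of <description> is HTML that is also well-formed XML, you can load it into another XDocument variable: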

XDocument description = XDocument.Parse(feed.Element("description").Value);

You can then run another LINQ-to-XML query against it to get whichever part of the <description> value you are interested in.
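
As a minimal sketch of that approach, the nested parse can be folded into the original query with a `let` clause, so the title and link remain available next to the span text. The feed below is a trimmed, hypothetical stand-in for the sample item, an anonymous type replaces the `RSSS` class, and the property name `Summary` is an assumption for illustration. It uses C# top-level statements and assumes the CDATA payload is well-formed XML with a single root element; otherwise `XDocument.Parse` throws.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical, trimmed-down feed modelled on the sample item in the question.
const string rss = @"<rss><channel><item>
<title>Operations Applications - MDI Diagnostics</title>
<link>http://confidential-link.com</link>
<description><![CDATA[<div style=""margin-top:5px"">
<span class=""srch-Icon""><a href=""confidential"">icon</a></span>
<span class=""psrch-Description""> THE INFORMATION I WANT</span>
</div>]]></description>
</item></channel></rss>";

XDocument xdoc = XDocument.Parse(rss);

var feeds = (from feed in xdoc.Descendants("item")
             // Parse the CDATA payload once per item.
             // Throws XmlException if the payload is not well-formed XML.
             let body = XDocument.Parse((string)feed.Element("description"))
             select new
             {
                 Title = (string)feed.Element("title"),
                 Link = (string)feed.Element("link"),
                 // First <span> whose class attribute matches, or null if none exists.
                 Summary = body.Descendants("span")
                               .Where(s => (string)s.Attribute("class") == "psrch-Description")
                               .Select(s => s.Value.Trim())
                               .FirstOrDefault()
             }).ToList();

foreach (var f in feeds)
    Console.WriteLine($"{f.Title} | {f.Link} | {f.Summary}");
```

Casting an element or attribute to `string` instead of reading `.Value` returns null when the node is missing rather than throwing, which is why the `class` comparison is safe for spans that have no `class` attribute.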

Otherwise, you will need a different type that can handle HTML, such as HtmlDocument from HtmlAgilityPack.
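
For the HTML case, a rough sketch with HtmlAgilityPack might look like the following. It assumes the HtmlAgilityPack NuGet package is referenced (it is not part of .NET), and the HTML fragment is a hypothetical example that is valid HTML but not well-formed XML, so `XDocument.Parse` would reject it.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

// Hypothetical fragment: the unclosed <img> makes it invalid as XML.
const string html = @"<div><img src=""confidential"">
<span class=""psrch-Description""> THE INFORMATION I WANT</span></div>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// SelectSingleNode returns null when nothing matches the XPath expression.
var span = doc.DocumentNode.SelectSingleNode("//span[@class='psrch-Description']");
var summary = span?.InnerText.Trim();

Console.WriteLine(summary);
```

Inside the LINQ query, the same lookup could be applied to `feed.Element("description").Value` for each item, with the title and link still read from the outer XDocument.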
