Linq to XML querying an XDoc

I want to read the text inside a span with a particular class, while still being able to read values outside that span.

Here is a sample of the XML document:

    <item><title>Operations Applications - MDI Diagnostics</title><link>http://confidential-link.com</link>
<description><![CDATA[<div style="margin-top:5px"><link rel="stylesheet" type="text/css" href="confidential" />
<span class="srch-Icon"><a href="confidential" title="Operations Applications - MDI Diagnostics">
<img src="confidential" alt="Web Page" border="0" /></a></span>
<span class="psrch-Description"> THE INFORMATION I WANT</span>
<p class="srch-Metadata"><span class="srch-URL">
<a href="confidential" title="Operations Applications - MDI Diagnostics">confidential link</a>
        -
        66KB
          </span></p></div>]]></description>
<author>Bob Smith
</author><pubDate>Mon, 10 Mar 2014 18:53:49 GMT</pubDate><search:dotfileextension>.ASPX</search:dotfileextension><search:size>68076</search:size>
<search:hithighlightedsummary> SIMILAR TO THE INFORMATION I WANT, COULD BE OPTION 2  </search:hithighlightedsummary>
</item>

Here is what I have now:

    var feeds = from feed in xdoc.Descendants("item")
                select new RSSS
                {
                    Site = "TOPS",
                    URL = url,
                    Title = feed.Element("title").Value,
                    Link = feed.Element("link").Value,
                    Description = feed.Element("description").Value
                };

This returns the "Description" as expected:

<div style="margin-top:5px"><link rel="stylesheet" type="text/css" href="confidential" />
<span class="srch-Icon"><a href="confidential" title="Operations Applications - MDI Diagnostics">
<img src="confidential" alt="Web Page" border="0" /></a></span>
<span class="psrch-Description"> THE INFORMATION I WANT</span>
<p class="srch-Metadata"><span class="srch-URL">
<a href="confidential" title="Operations Applications - MDI Diagnostics">confidential link</a>
        -
        66KB
          </span></p></div>

So how do I extract just the text inside <span class="psrch-Description"> while still being able to read the other values, such as Link and Title?

Note that I am not looking for something like the following, because it does not let me get the other information:

    var feeds = from feed in xDoc.Descendants("Show")
                where (string)feed.Attribute("Code") == "456"
                select new { EventDate = feed.Attribute("Date").Value };

If the <description> value contains HTML that is also well-formed XML, you can load it into another XDocument variable:

XDocument description = XDocument.Parse(feed.Element("description").Value);

Then you can run another LINQ to XML query against it to get whichever part of the <description> value you are interested in.
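As a minimal sketch of that second query, the program below parses the description text and picks out the span whose class is "psrch-Description", alongside Title and Link from the same item. The RssItem class, the ExtractSummary helper, and the shortened sample feed (including its rss/channel wrapper) are hypothetical stand-ins, not the asker's actual RSSS type or data. It assumes the description is well-formed XML with a single root element; XDocument.Parse throws an XmlException otherwise.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for the question's RSSS class.
public class RssItem
{
    public string Title { get; set; }
    public string Link { get; set; }
    public string Summary { get; set; }
}

public static class Program
{
    // Returns the trimmed text of <span class="psrch-Description">,
    // or null when no such span exists in the fragment.
    public static string ExtractSummary(string descriptionHtml)
    {
        // Only works if the HTML is also well-formed XML.
        XDocument fragment = XDocument.Parse(descriptionHtml);
        return fragment.Descendants("span")
            .Where(s => (string)s.Attribute("class") == "psrch-Description")
            .Select(s => s.Value.Trim())
            .FirstOrDefault();
    }

    public static void Main()
    {
        // Shortened, made-up feed modelled on the sample in the question.
        string rss = @"<rss><channel><item>
<title>Operations Applications - MDI Diagnostics</title>
<link>http://confidential-link.com</link>
<description><![CDATA[<div style=""margin-top:5px""><span class=""srch-Icon""><img src=""confidential"" alt=""Web Page"" border=""0"" /></span>
<span class=""psrch-Description""> THE INFORMATION I WANT</span></div>]]></description>
</item></channel></rss>";

        XDocument xdoc = XDocument.Parse(rss);

        // Title and Link come from the outer document; Summary comes from
        // the inner document parsed out of the description text.
        var feeds = from feed in xdoc.Descendants("item")
                    select new RssItem
                    {
                        Title = (string)feed.Element("title"),
                        Link = (string)feed.Element("link"),
                        Summary = ExtractSummary((string)feed.Element("description"))
                    };

        foreach (var f in feeds)
            Console.WriteLine($"{f.Title} | {f.Link} | {f.Summary}");
        // prints: Operations Applications - MDI Diagnostics | http://confidential-link.com | THE INFORMATION I WANT
    }
}
```

Casting an element to string instead of reading .Value gives null for a missing element rather than a NullReferenceException, which may be a safer fit for feeds where some items lack a title or link.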

Otherwise, you'll need a type that can handle HTML, such as HtmlDocument from HtmlAgilityPack.
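For HTML that is not well-formed XML, a rough sketch with HtmlAgilityPack could look like the following. It assumes the HtmlAgilityPack NuGet package is referenced in the project; the HtmlSummary class, the ExtractSummary helper, and the sample string are hypothetical.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET itself

public static class HtmlSummary
{
    // Returns the trimmed text of the span whose class attribute is exactly
    // "psrch-Description", or null when no such span is found.
    public static string ExtractSummary(string descriptionHtml)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(descriptionHtml);

        // SelectSingleNode takes an XPath expression and returns null on no match.
        HtmlNode span = doc.DocumentNode.SelectSingleNode("//span[@class='psrch-Description']");
        return span?.InnerText.Trim();
    }

    public static void Main()
    {
        // The <br> is never closed, so XDocument.Parse would reject this input.
        string html = "<div><br><span class=\"psrch-Description\"> THE INFORMATION I WANT</span></div>";
        Console.WriteLine(ExtractSummary(html));
    }
}
```

The helper takes the same input as feed.Element("description").Value, so it can be called from inside the existing select new RSSS { ... } initializer without changing how Title and Link are read.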
