Parsing HTML string in WP7

I need to parse an HTML string that I receive from a server.

 <html>
 <head/>
 <body style="margin: 0;padding: 0">
    <a href="http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewSoftware?id=319737742&amp;mt=8&amp;uo=6" style="margin: 0;padding: 0"><img src="https://s3.amazonaws.com/sportschatter/postcard.jpg" style="margin: 0;padding: 0"/></a>
 </body>
 </html>

This is the response I get from the server. I need to retrieve the img URL (https://s3.amazonaws.com/sportschatter/postcard.jpg) as well as the href value. I have the HTML Agility Pack for WP7, but I don't know how to write the query to get this information. I tried something like this:

HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
document.LoadHtml(htmlString);

var value = document.DocumentNode.Descendants("img src")
                    .Select(x => x.InnerText);

This does not return any value. I also tried a regex:

    string parseString = htmlString;
    Regex expression = new Regex(@".*img src=(\d+).*$");
    Match match = expression.Match(parseString);
    MessageBox.Show(match.Groups[1].Value);

but this does not work either. Please let me know what I am doing wrong.

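You've misunderstood how the LINQ to XML style syntax is meant to be used (it is what you use instead of XPath, since XPath isn't supported on Windows Phone). Descendants takes an element name such as "img", not a name plus an attribute, and the attribute is then read from the node it returns.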

You need to do something like this instead:

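var image = document.DocumentNode.Descendants("img").First();   // needs using System.Linq
var source = image.GetAttributeValue("src", "");                // "" if src is missing
var link = document.DocumentNode.Descendants("a").First();
var href = link.GetAttributeValue("href", "");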
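
As a slightly fuller sketch of the same approach, the lookup can be wrapped so that a response without an a or img element does not throw. The class name LinkExtractor and the method Extract are hypothetical names chosen for illustration; only Descendants, FirstOrDefault and GetAttributeValue come from the libraries themselves.

```csharp
using System.Linq;
using HtmlAgilityPack;

static class LinkExtractor
{
    // Hypothetical helper (not part of HtmlAgilityPack): reads the first <a href>
    // and the first <img src>, yielding "" when the element or attribute is absent.
    public static void Extract(string html, out string href, out string src)
    {
        var document = new HtmlAgilityPack.HtmlDocument();
        document.LoadHtml(html);

        // Descendants matches on element name only, so "a" and "img" are queried separately.
        HtmlNode anchor = document.DocumentNode.Descendants("a").FirstOrDefault();
        HtmlNode image = document.DocumentNode.Descendants("img").FirstOrDefault();

        // FirstOrDefault returns null when nothing matched, unlike First, which throws.
        href = anchor == null ? "" : anchor.GetAttributeValue("href", "");
        src = image == null ? "" : image.GetAttributeValue("src", "");
    }
}
```

Called as LinkExtractor.Extract(htmlString, out href, out src), this should leave the postcard.jpg URL in src. Depending on the library version, the href may still contain the entity &amp; from the markup; HtmlEntity.DeEntitize can be used to decode it.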

Use HtmlAgilityPack - do not use regex.

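The string passed to Descendants is an element name such as "img", not an XPath expression or a CSS-like selector, so "img src" matches nothing. In builds of the library that support XPath, XPath queries go through SelectNodes or SelectSingleNode instead.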

Here's an example: http://htmlagilitypack.codeplex.com/wikipage?title=Examples

Here's some info about XPath: http://msdn.microsoft.com/en-us/library/ms256086.aspx
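
For builds of HtmlAgilityPack that do expose XPath (the desktop .NET build, for instance; per the other answer, the Windows Phone build does not), the same lookup might be sketched with SelectSingleNode. The class name XPathImageFinder and the method FirstImageSource are hypothetical names used only for this illustration.

```csharp
using HtmlAgilityPack;

static class XPathImageFinder
{
    // Hypothetical helper; assumes an HtmlAgilityPack build with XPath support.
    public static string FirstImageSource(string html)
    {
        var document = new HtmlAgilityPack.HtmlDocument();
        document.LoadHtml(html);

        // "//img[@src]" selects img elements anywhere in the document that carry a src attribute.
        HtmlNode image = document.DocumentNode.SelectSingleNode("//img[@src]");

        // SelectSingleNode returns null when nothing matches.
        return image == null ? "" : image.GetAttributeValue("src", "");
    }
}
```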


 