Parsing HTML string in WP7

Question

I need to parse an HTML string that I receive from a server.

<html>
<head/>
<body style="margin: 0;padding: 0">
    <a href="http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewSoftware?id=319737742&amp;mt=8&amp;uo=6" style="margin: 0;padding: 0"><img src="https://s3.amazonaws.com/sportschatter/postcard.jpg" style="margin: 0;padding: 0"/></a>
</body>
</html>

This is the response I get from the server. I need to retrieve the img src URL, https://s3.amazonaws.com/sportschatter/postcard.jpg, as well as the href of the surrounding link. I have HTML Agility Pack for WP7, but I don't know how to write the query to get this information. I tried something like this:

HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
document.LoadHtml(htmlString);

var value = document.DocumentNode.Descendants("img src")
                    .Select(x => x.InnerText);

This does not give me any value. I also tried a Regex:

string parseString = htmlString;
Regex expression = new Regex(@".*img src=(\d+).*$");
Match match = expression.Match(parseString);
MessageBox.Show(match.Groups[1].Value);

but this does not work either. Please let me know what I am doing wrong.

Answer 1

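You've misunderstood how the LINQ to XML-style syntax is meant to be used (there is no XPath here, since XPath isn't supported on Windows Phone). Descendants takes an element name such as "img", not an element name plus an attribute, so "img src" matches nothing. The attribute is read from the node afterwards.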

You need to do something like this instead:

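// First() needs "using System.Linq;" and throws if the document has no img element.
var image = document.DocumentNode.Descendants("img").First();
// The second argument is the default returned when the attribute is missing.
var source = image.GetAttributeValue("src", "");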
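The question also asks for the href, which can be read the same way. Below is a minimal sketch, assuming the HtmlAgilityPack build for the platform exposes Descendants and GetAttributeValue as the desktop build does. The class and method names (`PostcardParser`, `FirstAttribute`) are made up for illustration and are not part of the library.

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class PostcardParser
{
    // Hypothetical helper: returns the value of `attribute` on the first
    // <tag> element in `html`, or null when no such element exists.
    public static string FirstAttribute(string html, string tag, string attribute)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Descendants takes an element name only; FirstOrDefault avoids the
        // exception First() would throw on an empty sequence.
        HtmlNode node = document.DocumentNode.Descendants(tag).FirstOrDefault();
        if (node == null)
        {
            return null;
        }

        // Empty string is the fallback when the attribute is absent.
        return node.GetAttributeValue(attribute, "");
    }
}
```

With the response from the question in `htmlString`, `PostcardParser.FirstAttribute(htmlString, "img", "src")` gives the image URL and `PostcardParser.FirstAttribute(htmlString, "a", "href")` gives the link target. Depending on the library version and settings, the `&amp;` entities in the href may come back still encoded; `HtmlEntity.DeEntitize` can decode them if needed.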

Answer 2

Use HtmlAgilityPack - do not use regex.

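The string passed to Descendants is an element name, not a CSS-like selector, so "img src" will not match anything. XPath expressions are passed to SelectNodes or SelectSingleNode instead, on builds where XPath is available.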

Here's an example: http://htmlagilitypack.codeplex.com/wikipage?title=Examples

Here's some info about XPath: http://msdn.microsoft.com/en-us/library/ms256086.aspx
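For comparison, here is a rough sketch of the XPath route. It assumes a build of HtmlAgilityPack where SelectSingleNode is available, such as desktop .NET; Answer 1 points out that XPath isn't supported on Windows Phone, so this may not compile there. The class and method names are illustrative only.

```csharp
using HtmlAgilityPack;

public static class XPathSketch
{
    // Hypothetical helper: returns the src of the first <img> that has one,
    // or null when the document contains no such element.
    public static string FirstImageSource(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // SelectSingleNode takes an XPath expression and returns null
        // when nothing matches.
        HtmlNode image = document.DocumentNode.SelectSingleNode("//img[@src]");
        if (image == null)
        {
            return null;
        }

        return image.GetAttributeValue("src", "");
    }
}
```

The XPath `//img[@src]` selects img elements anywhere in the document that carry a src attribute, which is what the "img src" string in the question was trying to express.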

2 answers

solution1
2 ACCEPTED 2011-11-08 12:19:26

solution2
-1 2011-11-08 11:16:11
