Windows Phone web scraping

Question

I'm trying to scrape data from a web page. Using HtmlAgilityPack, I can load the particular div that I want to display, but this div contains other child nodes. How can I extract the InnerHtml of each child node? Here's what I've done so far:

var webget = new HtmlWeb();
var doc = webget.Load("http://www.dmp.gov.bd/application/index/pressdetails/press_159");

HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='span8 inner_mess']");

Here I'm pointing at one specific web page. The URL won't always be the same, but the div is guaranteed to be the same, and it will contain different child nodes depending on the URL.

If I could find out in code which child nodes are present in that div, I could probably work out the rest.

Answer 1

Do you want to traverse the nodes recursively? (I can't tell whether the output is right, because I only read English.) You can add indentation and line breaks to tidy up the output.

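// Requires: using HtmlAgilityPack;
private void button1_Click(object sender, EventArgs e)
{
    var webget = new HtmlWeb();
    var doc = webget.Load("http://www.dmp.gov.bd/application/index/pressdetails/press_159");

    HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='span8 inner_mess']");

    // SelectSingleNode returns null when nothing matches the XPath.
    if (node != null)
    {
        TraverseNodes(node.ChildNodes);
    }
}

// Walks the tree depth-first and appends the text of each text node once.
private void TraverseNodes(HtmlNodeCollection nodes)
{
    foreach (HtmlNode node in nodes)
    {
        // An element's InnerText already includes the text of all its
        // descendants, so appending it at every level would repeat text.
        // Appending only text nodes emits each piece of text exactly once.
        if (node.NodeType == HtmlNodeType.Text)
        {
            textBox1.Text += node.InnerText;
        }

        TraverseNodes(node.ChildNodes);
    }
}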
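
If what you need is the InnerHtml of each direct child rather than the flattened text, a rough sketch along the following lines might help. It is only an illustration under a few assumptions: the class name ChildNodeLister, the method DescribeChildren, and the sample markup are made up for this example, and the HTML is loaded from a string with HtmlDocument.LoadHtml instead of being downloaded, so that the snippet stands on its own.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class ChildNodeLister
{
    // Returns one "name: innerHtml" entry for each direct child element
    // of the first node that matches the given XPath.
    public static List<string> DescribeChildren(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string>();
        HtmlNode container = doc.DocumentNode.SelectSingleNode(xpath);
        if (container == null)
        {
            return result; // the XPath matched nothing
        }

        foreach (HtmlNode child in container.ChildNodes)
        {
            // Skip text nodes (often just whitespace) and comments.
            if (child.NodeType != HtmlNodeType.Element)
            {
                continue;
            }
            result.Add(child.Name + ": " + child.InnerHtml);
        }
        return result;
    }

    public static void Main()
    {
        // Made-up sample markup standing in for the downloaded page.
        string html = "<div class='span8 inner_mess'><h3>Title</h3><div>Body <b>text</b></div></div>";
        foreach (string line in DescribeChildren(html, "//div[@class='span8 inner_mess']"))
        {
            Console.WriteLine(line);
        }
    }
}
```

Since child.Name tells you which tag each child is, you can branch on it (for example, handle h3 differently from div) once you know what a given URL returns.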

solution1
3 ACCEPTED 2013-12-14 05:25:23