
Scraping row items of a table using HtmlAgilityPack

I am writing a program that shows the text from a table. The page contains two tables, and I want to get the text from the second one.

[Screenshot: page structure]

My table data looks like this:

[Screenshot: table data]

I want to show the first 3 columns of each row of the second table. This is what I tried:

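HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load("http://www.banglaeye.com/baby-names/index.php");
HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@class='col_box']/table[2]/tr/td");

// Declarations assumed from context; they were not shown in the original snippet.
int k = 0;
string link;
List<string> my_link = new List<string>();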

try
{
    foreach (HtmlNode n in nodes)
    {
        if (k != 0)
        {
            link = n.InnerHtml;
            my_link.Add(link);
            MessageBox.Show(link);               
        }
        k++;
    }
}
catch (NullReferenceException)
{
    MessageBox.Show("No link found");
}

I have found that this page submits its search with an HTTP POST request, but as far as I can tell HtmlAgilityPack doesn't provide a way to make one. How can I achieve my goal?

The table whose contents you want to grab isn't populated until the user clicks the "Search" button in the browser. Someone navigating to that URL normally wouldn't see any entries in the table until the button is pressed. That is why HtmlAgilityPack only sees the first row. When clicked, the button performs an HTTP POST with this form body:

letter=All&gender_id=0&origin_id=0&submit=search

Your program must perform this request itself, then load the response body into an HtmlDocument using doc.LoadHtml() rather than doc.Load().
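As a rough illustration of those two steps, here is a minimal sketch. It assumes System.Net.Http.HttpClient is available for sending the POST (any HTTP client would do), that the form is posted back to the same index.php URL, and that the question's XPath still matches the response. The names BabyNameScraper, ExtractRows and FetchRowsAsync are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

public static class BabyNameScraper
{
    // Parses already-downloaded HTML and returns the text of the first
    // `columnCount` cells of every row in the second table inside div.col_box.
    public static List<string[]> ExtractRows(string html, int columnCount = 3)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // LoadHtml takes the markup itself, not a path or URL

        var result = new List<string[]>();
        HtmlNodeCollection rows =
            doc.DocumentNode.SelectNodes("//div[@class='col_box']/table[2]//tr");
        if (rows == null) return result; // SelectNodes returns null when nothing matches

        foreach (HtmlNode row in rows)
        {
            HtmlNodeCollection cells = row.SelectNodes("./td");
            if (cells == null) continue; // skip rows without <td>, e.g. header rows

            result.Add(cells.Take(columnCount)
                            .Select(c => HtmlEntity.DeEntitize(c.InnerText).Trim())
                            .ToArray());
        }
        return result;
    }

    // Sends the same form fields the "Search" button submits, then parses the reply.
    // Assumption: the form posts back to the same URL the question loads.
    public static async Task<List<string[]>> FetchRowsAsync(HttpClient client)
    {
        var form = new FormUrlEncodedContent(new Dictionary<string, string>
        {
            ["letter"] = "All",
            ["gender_id"] = "0",
            ["origin_id"] = "0",
            ["submit"] = "search",
        });

        HttpResponseMessage response = await client.PostAsync(
            "http://www.banglaeye.com/baby-names/index.php", form);
        response.EnsureSuccessStatusCode();

        string html = await response.Content.ReadAsStringAsync();
        return ExtractRows(html);
    }
}
```

Calling `await BabyNameScraper.FetchRowsAsync(new HttpClient())` would then give one string array per data row. Parsing is kept separate from the network call so that it can be tried against saved HTML first.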

Here are some other Stack Overflow questions you can refer to in order to complete those two tasks:


 