
Parsing HTML in C# that is updating constantly

I have a webpage that displays some data using AJAX queries. I need to parse some of this data in a C# program.

The problem is that when I look at the source code of the webpage, the data does not show up, because it is generated by an AJAX script that modifies the DOM after the page has loaded.

If I select everything on the webpage and use "Inspect Element" in Chrome, I get the full HTML, including the data I want to extract, which sits in various tables.

What I've tried is calling webBrowser1.Navigate("www.site.com"), and then in my webBrowser1_DocumentCompleted() event handler I do this:

var name = webBrowser1.Document.GetElementById("table_1_r_7_c_2");

The problem is that webBrowser1 does not return the full HTML, because some of it is generated by the AJAX queries.

Does anyone know how I could achieve this in C#?

The DocumentCompleted event is a bit misleading because it can fire more than once for a single page (for example, once for each frame that finishes loading), not just once for the top-level document. You can do something like this to check whether the event is for the actual page, or use a variant of it to look for specific requests.

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // Only react when the completed URL matches the browser's top-level URL.
        if (e.Url.AbsolutePath == webBrowser1.Url.AbsolutePath)
        {
            // page loaded
        }
    }
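
The check above only tells you that the top-level page has loaded, so the AJAX-generated table may still be missing at that moment. One possible workaround, which is a sketch under assumptions and not part of the original answer, is to poll the live DOM with a timer until the element appears. The class name PollingForm, the 500 ms interval, and the retry limit are illustrative choices. The element ID and URL are the placeholders from the question.

```csharp
using System;
using System.Windows.Forms;

// Minimal sketch: wait for the top-level page, then poll the live DOM
// until the AJAX-generated cell exists. Names and timings are assumptions.
public class PollingForm : Form
{
    private readonly WebBrowser webBrowser1 =
        new WebBrowser { Dock = DockStyle.Fill, ScriptErrorsSuppressed = true };

    // System.Windows.Forms.Timer ticks on the UI thread, so the handler
    // can touch webBrowser1.Document safely.
    private readonly Timer pollTimer = new Timer { Interval = 500 }; // assumed interval

    private int attemptsLeft = 40; // assumed limit, roughly 20 seconds

    public PollingForm()
    {
        Controls.Add(webBrowser1);
        webBrowser1.DocumentCompleted += OnDocumentCompleted;
        pollTimer.Tick += OnPollTick;
        webBrowser1.Navigate("www.site.com"); // placeholder URL from the question
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // Start polling only once the top-level document has loaded.
        if (e.Url.AbsolutePath == webBrowser1.Url.AbsolutePath)
        {
            pollTimer.Start();
        }
    }

    private void OnPollTick(object sender, EventArgs e)
    {
        HtmlDocument document = webBrowser1.Document;
        HtmlElement cell = document == null
            ? null
            : document.GetElementById("table_1_r_7_c_2");

        if (cell != null)
        {
            // The script has inserted the element, so its text can be read.
            pollTimer.Stop();
            Console.WriteLine(cell.InnerText);
        }
        else if (--attemptsLeft <= 0)
        {
            // Give up instead of polling forever.
            pollTimer.Stop();
            Console.WriteLine("Element did not appear before the retry limit.");
        }
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new PollingForm());
    }
}
```

Polling is crude but needs no knowledge of how the page's scripts work. If the data takes longer to arrive, the interval and retry limit would need adjusting.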


 