
Suspend thread until WebBrowser has finished loading

I'm trying to navigate across a website and do some work on the pages programmatically using a WebBrowser control in a Windows Form. I found this approach while looking for a way to block my thread until the WebBrowser's DocumentCompleted event is triggered. Given that, here's my current code:

public partial class Form1 : Form
{
    // Starts unsignaled; the DocumentCompleted handler signals it
    private AutoResetEvent autoResetEvent = new AutoResetEvent(false);

    public Form1()
    {
        InitializeComponent();
    }

    private void button1_Click(object sender, EventArgs e)
    {
        Thread workerThread = new Thread(new ThreadStart(this.DoWork));
        workerThread.SetApartmentState(ApartmentState.STA);
        workerThread.Start();
    }

    private void DoWork()
    {
        WebBrowser browser = new WebBrowser();
        browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(browser_DocumentCompleted);
        browser.Navigate(login_page);
        autoResetEvent.WaitOne();
        // log in

        browser.Navigate(page_to_process);
        autoResetEvent.WaitOne();
        // process the page
    }

    private void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        autoResetEvent.Set();
    }
}

The thread doesn't look necessary, but it will be when I expand this code to accept requests over the network (the thread will listen for connections, then process the requests). Also, I can't just put the processing code inside the DocumentCompleted handler, since I have to navigate to several different pages and do different things on each one.

Now, from what I understand, the reason this doesn't work is that the DocumentCompleted event is raised on the same thread that WaitOne() is called on, so the event will not fire until WaitOne() returns (never, in this case).
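One way around the limitation described above is to stop blocking the thread altogether: let the worker thread run a message loop and resume the sequence of steps whenever DocumentCompleted fires. The following is only a sketch of that swapped-in technique, not the code from the question. It assumes .NET 4.5 or later (for async/await), and the names BrowserWorker, NavigateAsync and RunStepsAsync are hypothetical helpers introduced here for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical helper, not part of WinForms.
static class BrowserWorker
{
    // Starts a navigation and returns a task that completes on the next
    // DocumentCompleted event (pages with frames can raise it more than once).
    static Task NavigateAsync(WebBrowser browser, string url)
    {
        var done = new TaskCompletionSource<bool>();
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (sender, e) =>
        {
            browser.DocumentCompleted -= handler; // one-shot subscription
            done.TrySetResult(true);
        };
        browser.DocumentCompleted += handler;
        browser.Navigate(url);
        return done.Task;
    }

    static async Task RunStepsAsync(string loginPage, string pageToProcess)
    {
        using (var browser = new WebBrowser())
        {
            await NavigateAsync(browser, loginPage);
            // log in

            await NavigateAsync(browser, pageToProcess);
            // process the page
        }
    }

    public static Thread Start(string loginPage, string pageToProcess)
    {
        var thread = new Thread(() =>
        {
            // Make awaits resume on this thread through its message loop.
            SynchronizationContext.SetSynchronizationContext(
                new WindowsFormsSynchronizationContext());

            // Runs up to the first await, then returns; the loop below
            // delivers DocumentCompleted and the continuations.
            RunStepsAsync(loginPage, pageToProcess)
                .ContinueWith(t => Application.ExitThread(),
                              TaskScheduler.FromCurrentSynchronizationContext());

            Application.Run(); // message loop without a visible form
        });
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        return thread;
    }
}
```

Because nothing ever blocks in WaitOne(), the thread stays free to dispatch the event, and no visible WebBrowser is needed on the form. Error handling is left out of the sketch.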

What's interesting is that if I add a WebBrowser control to the form from the toolbox (drag-and-drop), then navigate using that, this code works perfectly (with no changes other than putting the call to Navigate inside a call to Invoke - see below). But if I manually add a WebBrowser control to the Designer file, it doesn't work. And I don't really want a visible WebBrowser on my form, I just want to report the results.

public delegate void NavigateDelegate(string address);
browser.Invoke(new NavigateDelegate(this.browser.Navigate), new string[] { login_page });

My question, then, is: What's the best way to suspend the thread until the browser's DocumentCompleted event fires?

Chris,

Here is a possible implementation that solves the problem, but please have a look at the comments below on the issues I had to face and fix before everything worked as I expected. Here is an example of a method doing some work on a page in a webBrowser (note that in my case the webBrowser is part of a Form):

    internal ActionResponse CheckMessages() //ActionResponse is a custom class of mine that stores data coming from pages
        {
        //go to messages
        HtmlDocument doc = WbLink.Document; //WbLink is a reference to a webBrowser instance
        HtmlElement ele = doc.GetElementById("message_alert_box");
        if (ele == null)
            return new ActionResponse(false);

        object obj = ele.DomElement;
        System.Reflection.MethodInfo mi = obj.GetType().GetMethod("click");
        mi.Invoke(obj, new object[0]);

        semaphoreForDocCompletedEvent = WaitForDocumentCompleted();  //This is a WaitOne-like statement (1)
        if (!semaphoreForDocCompletedEvent)
            throw new Exception("Sequencing of DocumentCompleted events failed.");

        //get the list
        doc = WbLink.Document;
        ele = doc.GetElementById("mailz");
        //WaitForAvailability is an extension method, so calling it is safe even when ele is still null
        if (!ele.WaitForAvailability("mailz", Program.BrowsingSystem.Document, 10000)) //This is a WaitOne-like statement (2)
            return new ActionResponse(false); //the tag never showed up
        ele = doc.GetElementById("mailz");

        //this contains a tbody
        HtmlElement tbody = ele.FirstChild;

        //count how many elements are espionage reports; these are inline, so each one counts double together with its wrapper on top of it.
        int spioCases = 0;
        foreach (HtmlElement trs in tbody.Children)
        {
            if (trs.GetAttribute("id").ToLower().Contains("spio"))
                spioCases++;
        }

        int nMessages = tbody.Children.Count - 2 - spioCases;

        //create an array of messages to store data
        GameMessage[] archive = new GameMessage[nMessages];

        for (int counterOfOpenMessages = 0; counterOfOpenMessages < nMessages; counterOfOpenMessages++)
        {

            //open first element
            WbLink.ScriptErrorsSuppressed = true;
            ele = doc.GetElementById("mailz");
            //this contains a tbody
            tbody = ele.FirstChild;

            HtmlElement mess1 = tbody.Children[1];
            int idMess1 = int.Parse(mess1.GetAttribute("id").Substring(0, mess1.GetAttribute("id").Length - 2));
            //check whether the next element is a spio report; if it is, the element must not be opened.
            HtmlElement mess1Sibling = mess1.NextSibling;
            if (mess1Sibling.GetAttribute("id").ToLower().Contains("spio"))
            {
                //this is a wrapper for spio report
                ReadSpioEntry(archive, counterOfOpenMessages, mess1, mess1Sibling);
                //delete first in line
                DeleteFirstMessageItem(doc, ref ele, ref obj, ref mi, ref tbody);
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6); //This is a WaitOne-like statement (3)

            }
            else
            {
                //It's a normal message
                OpenMessageEntry(ref obj, ref mi, tbody, idMess1); //This opens a modal dialog over the page, and it does not generate a DocumentCompleted event in the webBrowser

                //actually opening a message generates 2 DocumentCompleted events without any Navigating event being raised
                //Application.DoEvents();
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6);

                //read element
                ReadMessageEntry(archive, counterOfOpenMessages);

                //close current message
                CloseMessageEntry(ref ele, ref obj, ref mi);  //this closes a modal dialog therefore is not generating a documentCompleted after!
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6);
                //delete first in line
                DeleteFirstMessageItem(doc, ref ele, ref obj, ref mi, ref tbody); //this closes a modal dialog therefore is not generating a documentCompleted after!
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6);
            }
        }
        return new ActionResponse(true, archive);
    }

In practice this method takes a page of an MMORPG, reads the messages sent to the account by other players and stores them in the ActionResponse class via the method ReadMessageEntry.

Apart from the implementation and the logic of the code, which are really case dependent (and not useful for you), there are a few interesting elements that may be worth noting for your case. I put some comments in the code and highlighted 3 important points [with symbols (1), (2) and (3)].

The algorithm is:

1) Arrive at a page

2) Get the underlying Document from the webBrowser

3) Find an element to click to get to the messages page [done with: HtmlElement ele = doc.GetElementById("message_alert_box");]

4) Trigger the click on it via the MethodInfo instance and the reflection call [this loads another page, so a DocumentCompleted will arrive sooner or later]

5) Wait for DocumentCompleted to be raised and then proceed [done with: semaphoreForDocCompletedEvent = WaitForDocumentCompleted(); at point (1)]

6) Fetch the new Document from the webBrowser after the page has changed

7) Find a particular anchor on the page that marks where the messages I want to read are

8) Make sure that this tag is present in the page (as some AJAX may delay the content I want to read) [done with: ele.WaitForAvailability("mailz", Program.BrowsingSystem.Document, 10000), that is point (2)]

9) Run the whole loop for reading each message, which means opening a modal dialog that is on the same page and therefore does not generate a DocumentCompleted, reading it when ready, then closing it, and looping again. For this particular case I use an overload of (1): semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6); at point (3)

Now the three methods I use to pause, check and read:

(1) To wait until DocumentCompleted is raised without overloading the DocumentCompleted handler, which may serve more than one purpose (as in your case):

private bool WaitForDocumentCompleted()
        {
            Thread.SpinWait(1000);  //This is dirty but working
            while (Program.BrowsingSystem.IsBusy) //BrowsingSystem is another reference to the browser that is made public in my Form; IsBusy is just a bool set to true when the Navigating event is raised and set to false when DocumentCompleted is fired.
            {
                Application.DoEvents();
                Thread.SpinWait(1000);
            }

            if (Program.BrowsingSystem.IsInfoAvailable)  //IsInfoAvailable is just a get property that wraps webBrowser.Document inside a lock statement to protect it from concurrent access.
            {
                return true;
            }
            else
                return false;
        }

(2) Wait for a particular tag to be available in the page:

public static bool WaitForAvailability(this HtmlElement tag, string id, HtmlDocument documentToExtractFrom, long maxCycles)
        {
            bool cond = true;
            long counter = 0;
            while (cond)
            {
                Application.DoEvents(); //VERIFY trovare un modo per rimuovere questa porcheria (Italian: "find a way to get rid of this filth")
                tag = documentToExtractFrom.GetElementById(id);
                if (tag != null)
                    cond = false;
                Thread.Yield();
                Thread.SpinWait(100000);
                counter++;
                if (counter > maxCycles)
                    return false;
            }
            return true;
        }

(3) The dirty trick to wait for a DocumentCompleted that will never arrive, because no frames need to reload on the page!

private bool WaitForDocumentCompleted(int seconds)
    {
        int counter = 0;
        while (Program.BrowsingSystem.IsBusy)
        {
            Application.DoEvents();
            Thread.Sleep(1000);
            if (counter == seconds)
            {
                return true;
            }
            counter++;
        }
        return true;
    }
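The polling loops in (2) and (3) share one shape: pump messages, re-check a condition, give up after a limit. A minimal sketch of that shape with a wall-clock timeout instead of a cycle counter is shown below. Polling and WaitUntil are hypothetical names introduced here, and the pump callback stands in for Application.DoEvents() so the helper itself does not depend on WinForms.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, not taken from the answer's code base.
static class Polling
{
    // Re-checks condition() until it holds or the timeout elapses, calling
    // pump() between checks. Returns true if the condition became true,
    // false on timeout.
    public static bool WaitUntil(Func<bool> condition, TimeSpan timeout, Action pump)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (!condition())
        {
            if (clock.Elapsed >= timeout)
                return false;
            pump();            // e.g. Application.DoEvents in a WinForms app
            Thread.Sleep(10);  // back off instead of spinning the CPU
        }
        return true;
    }
}
```

In the scenario above it might be called as Polling.WaitUntil(() => doc.GetElementById("mailz") != null, TimeSpan.FromSeconds(10), Application.DoEvents). Unlike overload (3), the caller can tell a timeout from a success.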

Here are also the DocumentCompleted and Navigating handlers, to give you the whole picture of how I used them.

private void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            if (Program.BrowsingSystem.BrowserLink.ReadyState == WebBrowserReadyState.Complete)
            {
                lock (Program.BrowsingSystem.BrowserLocker)
                {
                    Program.BrowsingSystem.ActualPosition = Program.BrowsingSystem.UpdatePosition(Program.BrowsingSystem.Document);
                    Program.BrowsingSystem.CheckContentAvailability();
                    Program.BrowsingSystem.IsBusy = false;
                }
            }
        }

private void webBrowser_Navigating(object sender, WebBrowserNavigatingEventArgs e)
        {
            lock (Program.BrowsingSystem.BrowserLocker)
            {
                Program.BrowsingSystem.ActualPosition.PageName = OgamePages.OnChange;
                Program.BrowsingSystem.IsBusy = true;
            }
        }

Please have a look here to learn about the mess behind DoEvents() if you're not aware of the details that lie behind the implementation presented here (I hope it is not a problem to link other sites from Stack Overflow).

A small final note on the fact that you need to put the call to your Navigate method inside an Invoke when you use it from a Form instance: you clearly need an Invoke, because the methods that work on the webBrowser (or even just have it in scope as a referenced variable) need to be launched on the same thread as the webBrowser itself!

Moreover, if the WB is a child of some kind of Form container, the thread that instantiates it must be the same thread that created the Form, and by transitivity all the methods that work on the WB need to be called on the Form's thread (in your case the Invoke relocates your calls onto the Form's native thread). I hope this is useful for you (I left a //VERIFY comment in the code in my native language to let you know what I think about Application.DoEvents()).

Kind regards, Alex
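The thread-affinity rule described above is commonly handled with the InvokeRequired check that every WinForms Control exposes. The following is a small sketch of that pattern. BrowserCalls and NavigateSafe are hypothetical names, and it assumes the browser's window handle already exists (for example because the control sits on a form that has been shown).

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper illustrating the Invoke pattern.
static class BrowserCalls
{
    // Navigates from any thread by marshaling the call to the thread
    // that owns the WebBrowser control.
    public static void NavigateSafe(WebBrowser browser, string url)
    {
        if (browser.InvokeRequired)
        {
            // Called from a foreign thread: run Navigate on the owning
            // thread and block until it has been executed there.
            browser.Invoke(new Action(() => browser.Navigate(url)));
        }
        else
        {
            // Already on the owning thread: call directly.
            browser.Navigate(url);
        }
    }
}
```

Invoke only returns once the owning thread has run the delegate, so that thread must be pumping messages. If it is itself blocked in WaitOne(), the call will hang.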

HAH! I had the same question. You can do this with event handling. If you stop a thread midway through the page, it will need to wait until the page finishes. You can easily do this by attaching

 Page.LoadComplete += new EventHandler(triggerFunction);

In the triggerFunction you can do this

private void triggerFunction(object sender, EventArgs e)
{
     // Set() signals the event and releases the thread blocked in WaitOne()
     autoResetEvent.Set();
}

Let me know if this works. I ended up not using threads in mine and instead just putting the stuff into triggerFunction. Some syntax might not be 100% correct because I am answering off the top of my head.
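Doing the work inside the handler can still cover several pages if each DocumentCompleted simply runs the next queued step. Here is a minimal sketch of that idea, assuming a hypothetical PageSteps helper (not part of WinForms); the wiring to a real WebBrowser is shown only in the trailing comment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: runs one queued step per completed page load.
sealed class PageSteps
{
    private readonly Queue<Action> steps = new Queue<Action>();

    // Number of steps that have not run yet.
    public int Remaining { get { return steps.Count; } }

    // Queues a step and returns this instance so calls can be chained.
    public PageSteps Then(Action step)
    {
        steps.Enqueue(step);
        return this;
    }

    // Call from the DocumentCompleted handler.
    // Returns false when there is nothing left to run.
    public bool RunNext()
    {
        if (steps.Count == 0)
            return false;
        steps.Dequeue()();
        return true;
    }
}

// Possible wiring (browser, login_page and page_to_process as in the question):
//   var steps = new PageSteps()
//       .Then(() => { /* log in */ browser.Navigate(page_to_process); })
//       .Then(() => { /* process the page */ });
//   browser.DocumentCompleted += (s, e) => steps.RunNext();
//   browser.Navigate(login_page);
```

Each step performs its own work and then triggers the next navigation, so no thread ever has to be suspended.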

EDIT

Register the handler in the InitializeComponent method like this, instead of in the same method.

WebBrowser browser = new WebBrowser();
browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser_DocumentCompleted);

ReadyState will tell you the progress of the document loading when checked in the DocumentCompleted event.

void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (browser.ReadyState == WebBrowserReadyState.Complete)
    {
        // the document has finished loading
    }
}
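On pages with frames, DocumentCompleted can be raised once per frame, so a handler often also compares the event's URL with the browser's own URL before treating the load as finished. The sketch below shows that commonly used check; LoadWatcher is a hypothetical class name, and the URL comparison is a heuristic rather than a guarantee.

```csharp
using System.Windows.Forms;

// Hypothetical wrapper showing a stricter DocumentCompleted check.
class LoadWatcher
{
    public void Attach(WebBrowser browser)
    {
        browser.DocumentCompleted += OnDocumentCompleted;
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        var browser = (WebBrowser)sender;

        // Still loading: ignore this event.
        if (browser.ReadyState != WebBrowserReadyState.Complete)
            return;

        // Event raised for a frame rather than the top-level document:
        // its URL differs from the URL of the page being displayed.
        if (e.Url != browser.Url)
            return;

        // The top-level document has finished loading; work on it here.
    }
}
```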


 