简体   繁体   English

Selebiun Crawler超时问题C#

[英]Selebiun Crawler Timeout Issues C#

The following code I am running and during execution I receive an error. 我正在运行以下代码,并且在执行过程中收到错误消息。

  public void GetCategoriesSelenium() {
            string javascript = System.IO.File.ReadAllText(@"GetCategory.js");
            CrawlerWebSeleniumJS.ExecuteScript("var finished;");
            CrawlerWebSeleniumJS.ExecuteScript("var all_categories;");
            CrawlerWebSeleniumJS.ExecuteScript("finished = false;");
            CrawlerWebSeleniumJS.ExecuteScript("all_categories = [];");

            CrawlerWebSelenium.Manage().Timeouts().SetScriptTimeout(TimeSpan.FromDays(1));
            CrawlerWebSelenium.Manage().Timeouts().SetPageLoadTimeout(TimeSpan.FromDays(1));
            CrawlerWebSelenium.Manage().Timeouts().ImplicitlyWait(TimeSpan.FromDays(1));

            AddToConsole("CRAWLER: GET - Categories");

            try {
                CrawlerWebSeleniumJS.ExecuteScript(javascript);
                }
            catch {
                }

            int ready = 2;

            for (int i = 0; i < ready; i++) {
                try {
                    if (CrawlerWebSeleniumJS.ExecuteScript("return finished").ToString() == "True") {
                        i = i++ + ready++;
                        }
                    else {
                        ready++;
                        }
                    }
                catch {

                    }
                }
            AddToCatsTreeSelenium();
            }

$('.p-pstctgry-lnk-ctgry').each(function (i) {
    var idBits = this.id.split('_');
    var theId = idBits[1];
    var theTitle = this.text;
    var subcategories = [];
    //initiate ajax request for json results
    $.ajax({
        async: false,
        type: 'GET',
        dataType: 'json',
        url: 'URL REMOVED',
        data: {
            nodeType: 'cat',
            level1id: theId
        }
    }).done(function (theJSON1) {
        var thelength1 = Object.keys(theJSON1['items']).length;
        //loop through found subs
        for (var i = 0; i < thelength1; i++) {
            //start of next recursive block to copy and paste inside
            var subsubcategories = [];
            //initiate ajax request for sub json results
            $.ajax({
                async: false,
                type: 'GET',
                dataType: 'json',
                url: 'URL REMOVED',
                data: {
                    nodeType: 'cat',
                    level1id: theId,
                    level2id: theJSON1['items'][i]['id']
                }
            }).done(function (theJSON2) {
                var thelength2 = Object.keys(theJSON2['items']).length;
                for (var k = 0; k < thelength2; k++) {
                    //start of next recursive block to copy and paste inside
                    var subsubsubcategories = [];
                    //initiate ajax request for sub json results
                    if ((theJSON2['items'][k]['id'] != 'OFFER') && (theJSON2['items'][k]['id'] != 'WANTED')) {
                        $.ajax({
                            async: false,
                            type: 'GET',
                            dataType: 'json',
                            url: 'URL REMOVED',
                            data: {
                                nodeType: 'cat',
                                level1id: theId,
                                level2id: theJSON1['items'][i]['id'],
                                level3id: theJSON2['items'][k]['id']
                            }
                        }).done(function (theJSON3) {
                            var thelength3 = Object.keys(theJSON3['items']).length;
                            for (var l = 0; l < thelength3; l++) {
                                console.log('---' + theJSON3['items'][l]['value'] + ' ' + theJSON3['items'][l]['id']);
                                //store this subsub
                                subsubsubcategories.push({
                                    title: theJSON3['items'][l]['value'],
                                    id: theJSON3['items'][l]['id'],
                                    sub: ''
                                });
                            }
                            //end done theJSON
                        });
                    }
                    //end of next recursive block to copy and paste inside
                    console.log('--' + theJSON2['items'][k]['value'] + ' ' + theJSON2['items'][k]['id']);
                    //store this subsub
                    subsubcategories.push({
                        title: theJSON2['items'][k]['value'],
                        id: theJSON2['items'][k]['id'],
                        sub: subsubsubcategories
                    });
                }
                //end done theJSON
            });
            console.log('-' + theJSON1['items'][i]['value'] + ' ' + theJSON1['items'][i]['id']);
            //store this sub with -> subsub
            subcategories.push({
                title: theJSON1['items'][i]['value'],
                id: theJSON1['items'][i]['id'],
                sub: subsubcategories
            });
            //end of next recursive block to copy and paste inside

            //end sub loop
        }
        console.log('' + theTitle + ' ' + theId);
        //store this cat with -> sub -> subsub
        all_categories.push({
            title: theTitle,
            id: theId,
            sub: subcategories
        });
        console.log(all_categories);
        //end first json subcat loop
    });
    //end main cat scan loop
});
finished = true;

The above code is the method that I run and the code that is under it is pure javascript that is being run through selenium. 上面的代码是我运行的方法,它下面的代码是通过selenium运行的纯JavaScript。

So issue one, when the code is run selenium locks up. 因此,发出第一个问题,当代码运行时,硒会锁定。 Which I can understand. 我能理解的。 This process takes about 4mins. 此过程大约需要4分钟。 After 60secs it times out with the error 60秒后,它因错误而超时

The HTTP request to the remote WebDriver server for URL timed out after 60 seconds. 60秒后,到远程WebDriver服务器的URL的HTTP请求超时。

Which is really annoying and locks the system up. 这确实很烦人,并且将系统锁定。 I know a really quick and easy way to get this fixed. 我知道一种解决此问题的快速简便的方法。 (Thread.Sleep(300000) which is disgusting... (Thread.Sleep(300000)令人恶心...

My thoughts are, maybe it is running a javascript query and waiting for it to finish and I am constantly pounding Selenium with more javascript requests which time out as expected. 我的想法是,也许它正在运行一个javascript查询并等待其完成,并且我不断向Selenium投放更多的javascript请求,这些请求均按预期超时。

Any other thoughts? 还有其他想法吗?

The driver's constructor should have an overload that includes a TimeSpan indicating the timeout for the HTTP client used by the .NET bindings to communicate with the remote end. 驱动程序的构造函数应具有一个过载,其中包括一个TimeSpan ,该TimeSpan指示.NET绑定用于与远程端进行通信的HTTP客户端的超时。 Setting that to an appropriately large value should be sufficient to let the operation complete. 将其设置为适当的大值应该足以使操作完成。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM