
How can I recover from a timeout with Perl's WWW::Mechanize?

I'm using WWW::Mechanize to read a particular webpage in a loop that runs every few seconds. Occasionally, the 'GET' times out and the script stops running. How can I recover from such a timeout so that the script continues the loop and retries the 'GET' the next time around?

Use eval:

eval {
    my $resp = $mech->get($url);
    $resp->is_success or die $resp->status_line;
    # your code
};

if ($@) {
    print "Recovered from a GET error\n";    
}

The eval block will catch any error thrown while GETting the page.
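To tie this back to the question's setup, here is a minimal sketch of the eval guard inside a polling loop. The URL, the 10-second timeout, and the 5-second poll interval are assumptions for illustration; `timeout` is a standard LWP::UserAgent option that WWW::Mechanize passes through.

```perl
use strict;
use warnings;
use WWW::Mechanize ();

my $mech = WWW::Mechanize->new( timeout => 10 );   # give up on a GET after 10 seconds
my $url  = 'https://example.com/';                 # hypothetical target page

while (1) {
    eval {
        my $resp = $mech->get($url);
        $resp->is_success or die $resp->status_line;
        # process the page here
    };
    warn "Recovered from a GET error: $@" if $@;   # log and keep looping
    sleep 5;                                       # poll every few seconds
}
```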

One option would be to implement a method to handle timeout errors and hook it into the mech object at construction time as the onerror handler. See Constructor and Startup in the docs.
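A minimal sketch of that hook, assuming you just want to log and carry on (the default onerror handler is `\&Carp::croak`, which is what kills the script):

```perl
use strict;
use warnings;
use WWW::Mechanize ();

# Replace the default croak with a non-fatal handler; the handler
# receives the error message(s) as its arguments.
my $mech = WWW::Mechanize->new(
    timeout => 10,                              # seconds, per LWP::UserAgent
    onerror => sub { warn "GET failed: @_" },   # log instead of dying
);
```

With this in place a failed GET emits a warning and returns control to your loop instead of terminating the process.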

You could even ignore errors by setting a null error handler, for example:

my $mech = WWW::Mechanize->new( onerror => undef );

but I wouldn't recommend that - you'll just get weird problems later.

This solution will continue to attempt to load the page until it works:

do {
    eval {
        $mech->get($url);
    };
} while ($@ ne '');
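Note that the loop above retries forever and hammers the server back-to-back. A bounded variant (a sketch; the 5-attempt cap and 2-second back-off are arbitrary choices) might look like:

```perl
# Give up after 5 attempts total, pausing briefly between failures.
my $tries = 0;
do {
    eval { $mech->get($url) };
    sleep 2 if $@;                     # back off before retrying
} while ($@ && ++$tries < 5);

die "Giving up on $url: $@" if $@;     # still failing after 5 attempts
```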

For a more complete solution, you can use a module like Try::Tiny::Retry. It lets you specify a code block to run, catches any errors, and then retries that block a configurable number of times. The syntax is pretty clean.

use WWW::Mechanize();
use Try::Tiny::Retry ':all';

my $mech = WWW::Mechanize->new();
retry {
    $mech->get("https://stackoverflow.com/");
}
on_retry {
    warn("Failed. Retrying. Error was: $_");
}
delay {
    # max of 100 tries, sleeping 5 seconds between each failure
    return if $_[0] >= 100;
    sleep(5);
}; #don't forget this semicolon

# dump all the links found on the page
print join "\n", map {$_->text } $mech->links;
