Perl, Parallel::ForkManager - how to implement timeout for fork

Is it possible to implement some kind of timeout (time limit) for a fork using Parallel::ForkManager?

A basic Parallel::ForkManager script looks like this:

use Parallel::ForkManager;
my $pm = Parallel::ForkManager->new( 10 );
for ( 1 .. 1000 ) {
    $pm->start and next;
    # some job for fork
    $pm->finish;
}
$pm->wait_all_children();

I would like to limit the time for "# some job for fork". For example, if it has not finished within 90 seconds, then it (the fork) should be killed/terminated. I thought about using this, but I have to say that I don't know how to use it with Parallel::ForkManager.

EDIT

Thanks hobbs and ikegami. Both your suggestions worked... but only in this basic example, not in my actual script :(. These forks will be there forever and, to be honest, I don't know why. I have used this script for a couple of months and didn't change anything (although many things depend on outside variables). Every fork has to download a page from a website, parse it, and save the results to a file. It should not take more than 30 seconds per fork. The timeout is set to 180 seconds. The hanging forks are totally random, so it's very hard to trace the problem. That's why I came up with a temporary, simple solution: timeout and kill.

What could possibly disable (interrupt) your timeout methods in my code? I don't have any other alarm() anywhere in my code.

EDIT 2

One of the forks was hanging for 1h38m and returned "timeout PID", which is the message I pass to die() for alarm(). So the timeout works... but it is late by about 1h36.5m ;). Do you have any ideas?

Update

Sorry to update after the close, but I'd be remiss if I didn't point out that Parallel::ForkManager also supports a run_on_start callback. This can be used to install a "child registration" function that takes care of the time()-stamping of PIDs for you.

E.g.,

$pm->run_on_start(sub { my $pid = shift; $workers{$pid} = time(); });

The upshot is that, in conjunction with run_on_wait as described below, the main loop of a P::FM script doesn't have to do anything special. That is, it can remain a simple $pm->start and next, and the callbacks will take care of everything else.
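
To make the callback-only layout concrete, here is a minimal sketch under stated assumptions. The sub name run_with_timeout, the %started_at hash, and the sleep stand-in for the real job are illustrative only and not part of the original answer. It also registers a run_on_finish callback so that children which exit on their own are dropped from the bookkeeping.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical helper: run one child per entry in @durations (seconds of
# simulated work) and send TERM to any child older than $patience seconds.
# Returns how many children were ended by a signal.
sub run_with_timeout {
    my ($patience, @durations) = @_;
    my %started_at;    # pid => start time, kept in the parent only
    my $killed = 0;

    my $pm = Parallel::ForkManager->new(10);

    # Runs in the parent right after each fork: register the child.
    $pm->run_on_start(sub { my ($pid) = @_; $started_at{$pid} = time });

    # Runs in the parent once a child has been reaped: unregister it.
    $pm->run_on_finish(sub {
        my ($pid, $exit_code, $ident, $exit_signal) = @_;
        delete $started_at{$pid};
        $killed++ if $exit_signal;
    });

    # Runs in the parent about once per second while it waits for children.
    $pm->run_on_wait(sub {
        for my $pid (keys %started_at) {
            next unless time - $started_at{$pid} > $patience;
            kill TERM => $pid;
            delete $started_at{$pid};
        }
    }, 1);

    for my $seconds (@durations) {
        $pm->start and next;    # the main loop stays this simple
        sleep $seconds;         # stand-in for the real job
        $pm->finish;
    }
    $pm->wait_all_children;
    return $killed;
}
```

Calling run_with_timeout(90, ...) would mirror the 90-second limit from the question. Because time() has one-second resolution, a child may run slightly longer than the limit before it is signalled.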

Original Answer

Parallel::ForkManager's run_on_wait handler, plus a bit of bookkeeping, can force hanging and ALRM-proof children to terminate.

The callback registered by that function can be run periodically while the $pm waits for children to terminate.

use strict; use warnings;
use Parallel::ForkManager;

use constant PATIENCE => 90; # seconds

our %workers;

sub dismiss_hung_workers {
  while (my ($pid, $started_at) = each %workers) {
    next unless time() - $started_at > PATIENCE;
    kill TERM => $pid;
    delete $workers{$pid};
  }
}

...

sub main {
  my $pm = Parallel::ForkManager->new(10);
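  $pm->run_on_wait(\&dismiss_hung_workers, 1);  # 1 second between callback invocations
  # Forget children that exit on their own, so a stale or recycled PID is never signalled.
  $pm->run_on_finish(sub { my $pid = shift; delete $workers{$pid}; });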

  for (1 .. 1000) {
    if (my $pid = $pm->start) {
      $workers{$pid} = time();
      next;
    }
    # Here we are child.  Do some work.
    # (Maybe install a $SIG{TERM} handler for graceful shutdown!)
    ...
    $pm->finish;
  }

  $pm->wait_all_children;

}
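
The comment in the child branch mentions installing a $SIG{TERM} handler for graceful shutdown. The following is a rough sketch of that idea using only core Perl and a plain fork. The sub name term_exit_code and the pipe used for synchronisation are assumptions made for the demonstration; in a Parallel::ForkManager script only the $SIG{TERM} assignment would go into the child.

```perl
use strict;
use warnings;

# Hypothetical demo: fork a child that exits with $code when it receives
# SIGTERM, then return the exit code the parent observes.
sub term_exit_code {
    my ($code) = @_;
    pipe(my $reader, my $writer) or die "pipe: $!";

    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {                  # child
        close $reader;
        $SIG{TERM} = sub {
            # Clean up here: remove partial output, release locks, etc.
            exit $code;
        };
        syswrite $writer, "x";        # tell the parent the handler is installed
        sleep 60;                     # stand-in for a job that hangs
        exit 0;
    }

    close $writer;
    sysread $reader, my $ready, 1;    # wait until the child is ready
    kill TERM => $pid;
    waitpid $pid, 0;
    return $? >> 8;                   # high byte holds the exit code
}
```

A handler like this only helps if the child is still able to run Perl code. A child stuck inside a blocking system library call may not reach the handler promptly, in which case the parent could fall back to KILL.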

(As others suggest, it's better to have the children regulate themselves via alarm(), but that appears intermittently unworkable for you. You could also resort to wasteful, gross hacks like having each child itself fork() or exec('bash', '-c', 'sleep 90; kill -TERM $PPID').)

All you need is one line:

use Parallel::ForkManager;
my $pm = Parallel::ForkManager->new( 10 );
for ( 1 .. 1000 ) {
    $pm->start and next;
    alarm 90;             # <---
    # some job for fork
    $pm->finish;
}
$pm->wait_all_children();

You don't need to set up a signal handler, since you do mean for the process to die.

It even works if you exec in the child. It won't work on Windows, but using fork on Windows is questionable in the first place.
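
As a small check of both claims (no handler needed, and the alarm survives exec), here is a sketch that uses a plain fork rather than Parallel::ForkManager. The sub name signal_after_alarm is made up for this example, and it assumes a Unix-like system with a sleep command on the PATH.

```perl
use strict;
use warnings;
use POSIX qw(SIGALRM);

# Hypothetical demo: fork a child that arms alarm($limit) and then replaces
# itself with @cmd. Returns the number of the signal that ended the child,
# or 0 if it exited normally.
sub signal_after_alarm {
    my ($limit, @cmd) = @_;
    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {      # child
        alarm $limit;     # no handler: SIGALRM's default action terminates
        exec @cmd or die "exec: $!";
    }

    waitpid $pid, 0;
    return $? & 127;      # low 7 bits hold the terminating signal
}

# Example: signal_after_alarm(1, 'sleep', '30') should report SIGALRM.
```

In a Parallel::ForkManager script the same information reaches a run_on_finish callback as its fourth argument (the exit signal), which is one way to log which jobs timed out.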

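Do what the answer you linked to does, in the child process (i.e. between the $pm->start and next and the end of the loop). There's nothing special you need to do to make it interact with Parallel::ForkManager, other than making sure you don't accidentally kill the parent instead :)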

