How to manage a long running process in a Catalyst App?

This is my first Catalyst app and I'm not sure how to solve the following problem.

The user enters some data in a form and selects a file (up to 100MB) for uploading. After submitting the form, the actual computation takes up to 5 minutes and the results are stored in a DB.

What I want to do is run this process (and maybe also the file upload) in the background to avoid a server timeout. There should be some kind of feedback to the user (like a message "Job has been started" or a progress bar). The form should be blocked while the job is still running. A result page should be displayed once the job has finished.

In hours of reading I stumbled upon concepts like asynchronous requests, job queues, daemons, Gearman, or Catalyst::Plugin::RunAfterRequest.

How would you do it? Thanks for helping a web dev novice!

PS: In my current local app the work is done in parallel with Parallel::ForkManager. For the real app, would it be advisable to use a cloud computing service like Amazon EC2? Or just find a hoster who offers multi-core servers?

Put the job in a queue and do it in a different process, outside of the web application. While your Catalyst process is busy, even if it is using Catalyst::Plugin::RunAfterRequest, it cannot be used to process other web requests.

There are very simple queuing systems, like File::Queue. Basically, you assign a job ID to the document and put it in the queue. Another process checks the queue and picks up new jobs.
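The enqueue/pick-up cycle described above can be sketched with core Perl modules only. This is a hand-rolled spool directory, not the File::Queue API; the function names `enqueue` and `dequeue`, the one-JSON-file-per-job layout, and the job ID format are all assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use JSON::PP;

my $json = JSON::PP->new->canonical;
my $seq  = 0;    # per-process counter, keeps IDs unique and ordered

# Producer side (e.g. the web application): store the job, return its ID.
# Hypothetical layout: one file "<job_id>.json" per job in the spool dir.
sub enqueue {
    my ($dir, $payload) = @_;
    my $id  = sprintf '%010d-%d-%06d', time, $$, $seq++;
    my $tmp = File::Spec->catfile($dir, "$id.tmp");
    open my $fh, '>', $tmp or die "Cannot write $tmp: $!";
    print {$fh} $json->encode($payload);
    close $fh or die "Cannot close $tmp: $!";
    # rename is atomic within one filesystem, so a consumer
    # never sees a half-written job file
    rename $tmp, File::Spec->catfile($dir, "$id.json")
        or die "Cannot publish job $id: $!";
    return $id;
}

# Consumer side (a separate process): claim one job, or return an
# empty list when the queue is empty.
sub dequeue {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @files = sort grep { /\.json\z/ } readdir $dh;
    closedir $dh;
    for my $file (@files) {
        my $path    = File::Spec->catfile($dir, $file);
        my $claimed = "$path.claimed";
        # if the rename fails, another worker claimed this job first
        next unless rename $path, $claimed;
        open my $fh, '<', $claimed or die "Cannot read $claimed: $!";
        my $text = do { local $/; <$fh> };
        close $fh;
        unlink $claimed;
        (my $id = $file) =~ s/\.json\z//;
        return ($id, $json->decode($text));
    }
    return;
}

# Usage: the "upload" payload key is made up for this demo.
my $spool = tempdir(CLEANUP => 1);
my $id    = enqueue($spool, { upload => 'data.bin' });
my ($claimed_id, $job) = dequeue($spool);
print "picked up job $claimed_id for $job->{upload}\n";
```

A real deployment would more likely use an existing queue module or a database table, which also gives you retries and locking across machines; the sketch only shows why the web process can return immediately after enqueuing.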

You can save the job status in a database, or in anything else accessible to the web application. On the front end, you can poll the job status every X seconds or minutes to give feedback to the user.
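As a rough illustration of the status side, the sketch below lets a worker record a job's state and lets the web application read it back when the browser polls. A small JSON file per job stands in for the database here; the function names `set_status` and `get_status` and the state names (`running`, `done`, `unknown`) are assumptions, not part of any library.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use JSON::PP;

my $json = JSON::PP->new->canonical;

# Worker side: record the current state of a job plus an optional
# progress percentage. Hypothetical file name: "<job_id>.status".
sub set_status {
    my ($dir, $job_id, $state, $progress) = @_;
    my $file = File::Spec->catfile($dir, "$job_id.status");
    my $tmp  = "$file.tmp";
    open my $fh, '>', $tmp or die "Cannot write $tmp: $!";
    print {$fh} $json->encode({ state => $state, progress => $progress // 0 });
    close $fh or die "Cannot close $tmp: $!";
    # atomic replace, so a reader never sees a partial file
    rename $tmp, $file or die "Cannot update $file: $!";
    return;
}

# Web side: the data a status action could send to the polling page.
# A job without a status file is reported as "unknown".
sub get_status {
    my ($dir, $job_id) = @_;
    my $file = File::Spec->catfile($dir, "$job_id.status");
    open my $fh, '<', $file
        or return { state => 'unknown', progress => 0 };
    my $text = do { local $/; <$fh> };
    close $fh;
    return $json->decode($text);
}

# Usage: the job ID "job42" is made up for this demo.
my $status_dir = tempdir(CLEANUP => 1);
set_status($status_dir, 'job42', 'running', 40);
my $status = get_status($status_dir, 'job42');
print "job42 is $status->{state} at $status->{progress} percent\n";
```

The page can then keep the form disabled while the state is `running` and switch to the result page once it reads `done`.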

You have to figure out how much memory and CPU you need. A multi-core CPU or multiple CPUs may not be required, even if you have several processes running. Choosing between a dedicated server and a cloud service like EC2 is more a question of flexibility (resizing, snapshots, etc.) vs. price.

Somehow I couldn't get the idea of File::Queue. For non-blocking parallel execution, I ended up using a combination of TheSchwartz and Parallel::Prefork, as implemented in the Foorum Catalyst App. Basically, there are 5 important elements. Maybe this summary will be helpful to others.

1) TheSchwartz DB

2) A client (DB handle) for the TheSchwartz DB

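package MyApp::TheSchwartz::Client;

use strict;
use warnings;
use Exporter 'import';
use TheSchwartz;

# Exported on request, so callers can write:
#   use MyApp::TheSchwartz::Client qw/theschwartz/;
our @EXPORT_OK = qw( theschwartz );

# Returns a TheSchwartz client connected to the job database.
sub theschwartz {
    my $theschwartz = TheSchwartz->new(
        databases => [ {
            dsn  => 'dbi:mysql:theschwartz',
            user => 'user',
            pass => 'pass',
        } ],
        verbose => 1,
    );
    return $theschwartz;
}

1;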

3) A job worker (where the actual work is done)

package MyApp::TheSchwartz::Worker::Test;

use strict;
use warnings;
use base qw( TheSchwartz::Moosified::Worker );
use MyApp::Model::DB;      # Catalyst DB connect_info
use MyApp::Schema;         # Catalyst DB schema

sub work {
    my $class = shift;
    my $job = shift;

    # the array reference passed to insert() in the controller
    my ($args) = $job->arg;
    my ($arg1, $arg2) = @$args;

    # re-use Catalyst DB schema
    my $connect_info = MyApp::Model::DB->config->{connect_info};
    my $schema = MyApp::Schema->connect($connect_info);

    # do the heavy lifting

    # mark the job as done so it is removed from the job table
    $job->completed();
}

1;

4) A worker process, TheSchwartzWorker.pl, that monitors the table job non-stop

use strict;
use warnings;
use MyApp::TheSchwartz::Client qw/theschwartz/;    # db connection
use MyApp::TheSchwartz::Worker::Test;
use Parallel::Prefork;

my $client = theschwartz();

# keep up to 16 worker processes alive; TERM and HUP shut them down
my $pm = Parallel::Prefork->new({
    max_workers  => 16,
    trap_signals => {
        TERM => 'TERM',
        HUP  => 'TERM',
        USR1 => undef,
    }
});

while ($pm->signal_received ne 'TERM') {
    $pm->start and next;    # parent loops on, child continues below

    $client->can_do('MyApp::TheSchwartz::Worker::Test');
    # when no job is available, the worker process sleeps for $delay seconds
    my $delay = 10;
    $client->work( $delay );

    $pm->finish;
}
$pm->wait_all_children();

5) In the Catalyst controller: insert a new job into the table job and pass some arguments

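use MyApp::TheSchwartz::Client qw/theschwartz/;
sub start : Chained('base') PathPart('start') Args(0) {
    my ($self, $c ) = @_;

    # $arg1, $arg2 and $name are not defined in this excerpt; they are
    # assumed to be set from the submitted form data
    my $client = theschwartz();
    $client->insert('MyApp::TheSchwartz::Worker::Test', [ $arg1, $arg2 ]);

    # return to the archive page immediately, with a status message
    $c->response->redirect(
        $c->uri_for(
            $self->action_for('archive'),
            {mid => $c->set_status_msg("Run '$name' started")}
        )
    );
}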

The new run is greyed out on the "archive" page until all results are available in the database.
