
Unique jobs with kue for node.js

I would like jobs.create to fail if an identical job is already in the system. Is there any way to accomplish this?

I need to run the same job every 24 hours, but some jobs can take more than 24 hours, so before adding a job I need to be sure it isn't already in the system (active, queued or failed).

UPDATED: OK, I'm going to simplify the problem so I can explain it here. Let's say I have an analytics service and I have to send a report to my users once a day. Completing these reports sometimes takes several hours, even more than a day (only in a few cases, but it is a possibility).

I need a way to know which jobs are currently running, to avoid duplicated jobs. I couldn't find anything in the kue API that tells me which jobs are currently running. I also need some kind of event that fires when more jobs are needed, so that I can then call my getMoreJobs producer.

Maybe my approach is wrong; if so, please let me know a better way to solve my problem.

This is my simplified code:

var kue = require('kue'),   
    cluster = require('cluster'),
    numCPUs = require('os').cpus().length;

numCPUs = CONFIG.sync.workers || numCPUs; 

var jobs = kue.createQueue();

if (cluster.isMaster) {
    console.log('Starting master pid:' + process.pid);
    jobs.on('job complete', function(id){
        kue.Job.get(id, function(err, job){
            if (err || !job) return;
            job.remove(function(err){
                if (err) throw err;
                console.log('removed completed job #%d', job.id);
            });
        });
    });

    function getMoreJobs() {
        console.log('looking for more jobs...');
        getOutdateReports(function (err, reports) {
            if (err) return setTimeout(getMoreJobs, 5 * 60 * 60 * 1000);

            reports.forEach(function(report) {
                jobs.create('reports', {
                    id: report.id,
                    title: report.name,
                    params: report.params
                }).attempts(5).save();
            });

            setTimeout(getMoreJobs, 60 * 60 * 1000);
        });
    }

    //Create the jobs
    getMoreJobs();

    console.log('Starting ', numCPUs, ' workers');
    for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
    }

    // 'death' was renamed to 'exit' in Node 0.8; the pid now lives on worker.process
    cluster.on('exit', function(worker) {
        console.log('worker pid:' + worker.process.pid + ' died!');
    });

} else {
    //Process the jobs
    console.log('Starting worker pid:' + process.pid);
    jobs.process('reports', 20, function(job, done){
        //completing my work here
        veryHardWorkGeneratingReports(function(err) {
            if (err) return done(err);
            return done();
        });
    });
}

The answer to one of your questions is that Kue moves the jobs it pops off the redis queue into the "active" state, and you'll never see them unless you go looking for them.
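As a rough sketch of "looking for them": Kue exposes kue.Job.rangeByType(type, state, from, to, order, callback), which returns the jobs of one type in one state. The helper name collectReportIds below is hypothetical. It takes the Job object as a parameter (pass kue.Job in real use) and assumes each job's data.id holds the report id, as in the question's code. The 0..999 range is an arbitrary cap chosen for illustration.

```javascript
// Collect the report ids of every job of `type` found in the given states.
// `Job` is expected to behave like kue.Job; it is injected so that this
// sketch can be exercised without a Redis server.
function collectReportIds(Job, type, states, callback) {
  var ids = [];
  var pending = states.length;
  var failed = false;
  if (pending === 0) return callback(null, ids);

  states.forEach(function (state) {
    // 0..999 is an assumed upper bound on the number of jobs per state.
    Job.rangeByType(type, state, 0, 999, 'asc', function (err, jobs) {
      if (failed) return;                       // an earlier lookup already errored
      if (err) { failed = true; return callback(err); }
      jobs.forEach(function (job) { ids.push(job.data.id); });
      if (--pending === 0) callback(null, ids); // all states answered
    });
  });
}
```

It could be called as collectReportIds(kue.Job, 'reports', ['inactive', 'active', 'failed'], function (err, ids) { ... }); jobs that are waiting for a retry may sit in a further state, so the list of states is something to adjust for your setup.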

The answer to the other question is that your distributed work queue is the consumer, not the producer, of tasks. Mingling them like you have is okay, but it's a muddy paradigm. What I've done with Kue is write a wrapper for kue's JSON API, so that a job can be put into the queue from anywhere in the system. Since you seem to need to shovel jobs in, I suggest writing a separate producer application that does nothing but fetch external jobs and stick them into your Kue work queue. It can monitor the work queue and load a batch when jobs are running low, or, what I would do, shovel jobs in as fast as it can while you spool up multiple instances of your consumer application to process the load more quickly.

To reiterate: your separation of concerns isn't very good here. You should have a producer of tasks that's completely separate from your task consumer app. This gives you more flexibility, ease of scaling (just fire up another consumer on another machine and you're scaled!) and overall ease of code management. You should also, if possible, give whoever is handing you these tasks that you "go looking for" access to your Kue server's JSON API, instead of going out and finding the tasks yourself. The job producer can then schedule its own tasks with Kue.
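A minimal sketch of what the producer side might do to avoid duplicates, assuming it already has the list of report ids that currently have a job in the queue. The name enqueueMissingReports is hypothetical, and queue stands for the object returned by kue.createQueue(); the job fields mirror the question's code.

```javascript
// Create a 'reports' job for each report that has no job in the queue yet.
// `queue` is expected to behave like the result of kue.createQueue();
// `existingIds` lists the report ids that already have a job.
function enqueueMissingReports(queue, reports, existingIds) {
  var seen = Object.create(null);
  existingIds.forEach(function (id) { seen[String(id)] = true; });

  var created = [];
  reports.forEach(function (report) {
    var key = String(report.id);
    if (seen[key]) return;   // a job for this report already exists
    seen[key] = true;        // also guards against duplicates inside `reports`
    queue.create('reports', {
      id: report.id,
      title: report.name,
      params: report.params
    }).attempts(5).save();
    created.push(report.id);
  });
  return created;            // ids of the reports that were enqueued
}
```

Note that this check-then-create step is not atomic, so it only prevents duplicates when a single producer process does all the enqueueing, which is one more argument for keeping the producer separate.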

Look at https://github.com/LearnBoost/kue .

In the json.js script, check lines 64-112. There you'll find methods that return an object containing jobs, optionally filtered by type, state or id range ( jobRange() , jobStateRange() , jobTypeRange() ).

If you scroll down the main page to the JSON API section, you'll find examples of the returned objects.
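For illustration, the Kue README documents the type-and-state query as GET /jobs/:type/:state/:from..:to/:order? and shows each returned job as an object with a data field. The two helpers below are hypothetical names: one builds that path, the other pulls the report ids out of a response body, assuming the job data carries an id field as in the question's code. The host and port depend on where you mounted the Kue app and are left out.

```javascript
// Build the JSON API path for "jobs of a given type in a given state".
function jobTypeRangePath(type, state, from, to, order) {
  var path = '/jobs/' + encodeURIComponent(type) +
             '/' + encodeURIComponent(state) +
             '/' + from + '..' + to;
  return order ? path + '/' + order : path;   // the order segment is optional
}

// Extract the report ids from a JSON API response body (an array of jobs).
// Assumes every job was created with an `id` field in its data.
function reportIdsFromResponse(body) {
  return JSON.parse(body).map(function (job) { return job.data.id; });
}
```

For example, jobTypeRangePath('reports', 'active', 0, 99, 'asc') yields the path for the first hundred active report jobs, which any HTTP client can then request.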

How to call and use those methods is something you know much better than I do.

jobs.create() will fail if you pass an unknown keyword. I would create a function that checks the current job in the forEach loop and returns a keyword. Then just call this function instead of passing the literal keyword in the jobs.create() parameters.

The information obtained through those methods in json.js may also help you create that "moreJobToDo" event.
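One possible shape for such an event, as a sketch only: poll a count of waiting jobs and emit "moreJobToDo" when it drops below a threshold. The function name watchQueue, the threshold and the interval are all assumptions; countFn is any function that reports the number of waiting jobs through a Node-style callback.

```javascript
var EventEmitter = require('events').EventEmitter;

// Poll `countFn` every `intervalMs` milliseconds and emit 'moreJobToDo'
// whenever the number of waiting jobs is below `threshold`.
function watchQueue(countFn, threshold, intervalMs) {
  var emitter = new EventEmitter();

  function check() {
    countFn(function (err, total) {
      if (err) return emitter.emit('error', err);
      if (total < threshold) emitter.emit('moreJobToDo', total);
    });
  }

  var timer = setInterval(check, intervalMs);
  emitter.check = check;                                  // allow an immediate check
  emitter.stop = function () { clearInterval(timer); };   // stop polling
  return emitter;
}
```

Assuming your Kue version provides jobs.inactiveCount(callback), it could be wired up as watchQueue(function (cb) { jobs.inactiveCount(cb); }, 10, 60 * 1000).on('moreJobToDo', getMoreJobs). Remember to attach an 'error' listener as well, since an unhandled 'error' event throws.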
