
Node.js batch processing

A bit of a conceptual question.

I have 15 files (for example) that need to be processed, but I don't want to process them one at a time. Instead, I want to start processing 5 of them (any 5; the order is not important) and, as soon as one of those 5 finishes, start another. The idea is to have at most 5 files being processed at the same time until all files are done.

I'm trying to work this out in Node, but in general I'm missing the idea of how this can be implemented.

A more accurate name for this type of processing might be 'limited parallel execution'. Mario Casciaro covers this well in his book, Node.js Design Patterns, beginning on page 77. One use case for this pattern is controlling a set of parallel tasks that could otherwise cause excessive load. The example below is from his book.

Limited Parallel Execution Pattern

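// A queue that runs at most `concurrency` tasks at the same time.
function TaskQueue(concurrency) {
  this.concurrency = concurrency;
  this.running = 0;   // number of tasks currently in progress
  this.queue = [];    // tasks waiting to be started
}

// Each task is a function that takes a completion callback.
// Note: the `callback` parameter is not used by this snippet.
TaskQueue.prototype.pushTask = function(task, callback) {
  this.queue.push(task);
  this.next();
};

// Start queued tasks until the concurrency limit is reached.
TaskQueue.prototype.next = function() {
  var self = this;
  while(self.running < self.concurrency && self.queue.length) {
    var task = self.queue.shift();
    task(function(err) {
      // A task finished: free its slot and try to start the next one.
      self.running--;
      self.next();
    });
    self.running++;
  }
};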
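To show how the queue above might be applied to the 15-file case, here is a minimal usage sketch. The TaskQueue is repeated (without the unused callback parameter) so the snippet runs on its own. processFile, processAll, and the file names are hypothetical stand-ins I made up for illustration, and the completion counter is just one simple way to find out when everything has finished.

```javascript
// TaskQueue repeated from the answer above so this sketch is self-contained.
function TaskQueue(concurrency) {
  this.concurrency = concurrency;
  this.running = 0;
  this.queue = [];
}

TaskQueue.prototype.pushTask = function (task) {
  this.queue.push(task);
  this.next();
};

TaskQueue.prototype.next = function () {
  var self = this;
  while (self.running < self.concurrency && self.queue.length) {
    var task = self.queue.shift();
    task(function () {
      self.running--;
      self.next();
    });
    self.running++;
  }
};

// Hypothetical stand-in for real work: "processes" a file in about 10 ms.
function processFile(file, callback) {
  setTimeout(function () { callback(null); }, 10);
}

// Process every file with at most `limit` in flight.
// Calls done(peak) once all files have finished, where `peak` is the
// highest number of files that were being processed at the same time.
function processAll(files, limit, done) {
  var queue = new TaskQueue(limit);
  var remaining = files.length;
  var active = 0;
  var peak = 0;
  if (remaining === 0) { done(peak); return; }

  files.forEach(function (file) {
    queue.pushTask(function (taskDone) {
      active++;
      peak = Math.max(peak, active);
      processFile(file, function (err) {
        if (err) console.error('Failed to process', file, err);
        active--;
        taskDone(); // frees the slot; the queue may start the next file here
        if (--remaining === 0) done(peak);
      });
    });
  });
}

// Usage: 15 made-up file names, at most 5 processed at a time.
var files = [];
for (var i = 1; i <= 15; i++) files.push('file' + i + '.txt');

processAll(files, 5, function (peak) {
  console.log('All files processed; peak concurrency was ' + peak);
});
```

All tasks are pushed up front; the queue starts the first 5 immediately and holds the rest until a slot frees up, so the peak never goes above the limit.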

You can do what you want with the code below, though I'm not sure why you would want to do this.

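  function handle(file) {
    // Nothing left in the list: this chain of work is finished.
    if (file === undefined) return;
    new Promise(function(resolve, reject) {
      doSomething(file, function(err) {
        if(err)
          reject(err);
        else
          resolve();
      });
    })
    .catch(function(err) {
      // Report the failure, but keep going so this slot stays in use.
      console.error(err);
    })
    .then(function() {
      // This file is done: pick up the next unprocessed one.
      handle(files.shift());
    });
  }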

  var files = [1, 2, ....., 15];
  var max = 5;
  while(max--) {
    handle(files.shift());
  }

Here's a little example that simulates multiple workers reading from a central queue of work: https://jsfiddle.net/ctrlfrk/jsvyg69h/1/

// Fake "work" that is simply a task that takes as many milliseconds as its value.
const workQueue = [1000,4000,2000,4000,5000,3000,7000,1000,9000,9000,4000,2000,1000,3000,8000,2000,3000,7000,6000,30000];


const Worker = (name) => (channel) => {
  const history = [];
  const next = () => {
    const job = channel.getWork();
    if (!job) { // All done!
      console.log('Worker ' + name + ' completed');
      return;
    }
    history.push(job);
    console.log('Worker ' + name + ' grabbed new job:' + job +'. History is:', history);

    setTimeout(next, job); // job is just the milliseconds; plain setTimeout works in both Node and the browser.
  };
  next();
}

const Channel = (queue) => {
  return { getWork: () => {
    return queue.pop();
  }};
};

let channel = Channel(workQueue);
let a = Worker('a')(channel);
let b = Worker('b')(channel);
let c = Worker('c')(channel);
let d = Worker('d')(channel);
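
The same "workers pulling from a shared queue" idea can also be written with promises, which makes it easy to tell when all of the work has finished. The following is only a sketch under my own assumptions: runPool and processJob are hypothetical names, and processJob just waits for the given number of milliseconds in place of real file processing.

```javascript
// Hypothetical job: resolves after `ms` milliseconds, standing in for real work.
const processJob = (ms) =>
  new Promise((resolve) => setTimeout(() => resolve(ms), ms));

// Run `handler` over every job with at most `limit` jobs in flight.
// Resolves with the results in the same order as `jobs`.
async function runPool(jobs, limit, handler) {
  const results = new Array(jobs.length);
  let nextIndex = 0; // shared cursor: each worker claims the next unclaimed job

  const worker = async () => {
    while (nextIndex < jobs.length) {
      // No await between the check and the claim, so two workers
      // can never take the same job.
      const index = nextIndex++;
      results[index] = await handler(jobs[index]);
    }
  };

  // Start up to `limit` workers and wait until every one has run out of jobs.
  const workerCount = Math.min(limit, jobs.length);
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}

// Usage: six short jobs, at most two running at once.
runPool([30, 10, 20, 10, 40, 10], 2, processJob)
  .then((results) => console.log('All jobs done:', results))
  .catch((err) => console.error('A job failed:', err));
```

Note that if a handler rejects, the promise returned by runPool rejects right away while the other workers carry on with the remaining jobs. Catch errors inside the handler if every job should be attempted regardless.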
