
What's the cleanest way to write a non-blocking for loop in JavaScript?

So, I've been thinking about a brain teaser: what if I had a large object that, for some reason, I had to iterate through in Node.js, and I didn't want to block the event loop while doing so?

Here's an off-the-top-of-my-head example; I'm sure it can be much cleaner:

var forin = function(obj,callback){
    var keys = Object.keys(obj),
        index = 0,
        interval = setInterval(function(){
            if(index < keys.length){
                callback(keys[index],obj[keys[index]],obj);
            } else {
                clearInterval(interval);
            }
            index ++;
        },0);
}

While I'm sure there are other reasons it's messy, this executes more slowly than a regular for loop, because setInterval with a delay of 0 doesn't actually fire every 0 ms. However, I'm not sure how to write the loop with the much faster process.nextTick.

In my tests, this example takes 7 ms to run, whereas a native for loop (with hasOwnProperty() checks, logging the same info) takes 4 ms.

So, what's the cleanest/fastest way to write this same code in Node.js?

The behavior of process.nextTick has changed since the question was asked. The previous answers also did not address the question's emphasis on the cleanliness and efficiency of the function.
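To make the scheduling difference concrete, here is a minimal sketch for a current Node.js version (the `order` array is only there for illustration). Callbacks queued with process.nextTick run as soon as the current synchronous code finishes, before the event loop moves on, while setImmediate callbacks wait for a later phase of the loop:

```javascript
// Records the order in which each piece of code runs.
const order = [];

// Queued first, but runs last: setImmediate waits for the event loop's
// "check" phase.
setImmediate(() => {
  order.push('setImmediate');
  console.log(order.join(' -> '));
});

// Runs right after the current synchronous code completes.
process.nextTick(() => order.push('nextTick'));

// Synchronous code always runs first.
order.push('sync');
```

This is why a loop that reschedules itself with process.nextTick can keep pending IO waiting, whereas setImmediate gives other events a chance to run between iterations.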

// `root` is assumed here to be the global object (global in node, window in
// browsers); recent node versions no longer define it for you.
var root = typeof global !== 'undefined' ? global : this;

// In node 0.9.0, process.nextTick fired before IO events, but setImmediate did
// not yet exist. Before 0.9.0, process.nextTick fired between IO events; from
// 0.9.0 onward it fires before IO events. If setImmediate and process.nextTick
// are both missing, fall back to the tick shim.
var tick =
  (root.process && process.versions && process.versions.node === '0.9.0') ?
  tickShim :
  (root.setImmediate || (root.process && process.nextTick) || tickShim);

function tickShim(fn) {setTimeout(fn, 1);}

// executes the iter function for the first object key immediately, can be
// tweaked to instead defer immediately
function asyncForEach(object, iter) {
  var keys = Object.keys(object), offset = 0;

  // nothing to iterate over, so never call iter with an undefined key
  if (!keys.length) {
    return;
  }

  (function next() {
    // invoke the iterator function
    iter.call(object, keys[offset], object[keys[offset]], object);

    if (++offset < keys.length) {
      tick(next);
    }
  })();
}
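
If the caller needs to know when the iteration has finished, the same pattern can be wrapped in a Promise. This is only a sketch under stated assumptions, not part of the code above: the name forEachKeyAsync is hypothetical, and it assumes a Node.js version that provides both setImmediate and Promise.

```javascript
// Hypothetical helper: visits one key per event loop turn and resolves with
// the number of keys visited. The first key is handled immediately.
function forEachKeyAsync(object, iter) {
  var keys = Object.keys(object);
  var offset = 0;

  return new Promise(function (resolve, reject) {
    (function next() {
      // All keys visited (or the object was empty): report completion.
      if (offset >= keys.length) {
        resolve(offset);
        return;
      }
      try {
        iter.call(object, keys[offset], object[keys[offset]], object);
      } catch (err) {
        reject(err);
        return;
      }
      offset++;
      // Yield to the event loop before handling the next key.
      setImmediate(next);
    })();
  });
}

// Usage: log each pair, then report how many keys were visited.
forEachKeyAsync({ a: 1, b: 2 }, function (key, value) {
  console.log(key, value);
}).then(function (count) {
  console.log('visited ' + count + ' keys');
});
```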

Do take note of @alessioalex's comments regarding Kue and proper job queueing.

See also: share-time, a module I wrote to do something similar to the intent of the original question.

There are many things to be said here.

  • If you have a web application, for example, you wouldn't want to do "heavy lifting" in that application's process. Even if your algorithm is efficient, it would still most probably slow down the app.
  • Depending on what you're trying to achieve, you would probably use one of the following approaches:

    a) put your "for in" loop in a child process and get the result in your main app once it's done
    b) if you are trying to achieve something like delayed jobs (e.g. sending emails), you should try https://github.com/LearnBoost/kue
    c) make a Kue-like program of your own, using Redis to communicate between the main app and the "heavy lifting" app.

For these approaches you could also use multiple processes (for concurrency).
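
As a rough illustration of approach (a), the sketch below hands the loop to a child process and receives the result in a callback. Everything in it is an assumption made for the example: the function name countKeysInChild, the inline child script, and the choice to pass the object as JSON over stdin and read the result from stdout. A real application would more likely fork a dedicated worker script.

```javascript
const { execFile } = require('child_process');

// Source of the child program (hypothetical): reads a JSON object from
// stdin, runs the "for in" loop there, and prints the result as JSON.
const CHILD_SOURCE = `
  let input = '';
  process.stdin.setEncoding('utf8');
  process.stdin.on('data', (chunk) => { input += chunk; });
  process.stdin.on('end', () => {
    const obj = JSON.parse(input);
    let count = 0;
    for (const key in obj) {
      if (Object.prototype.hasOwnProperty.call(obj, key)) { count++; }
    }
    process.stdout.write(JSON.stringify({ count: count }));
  });
`;

// Runs the loop in a separate node process so the main event loop stays free.
function countKeysInChild(obj, callback) {
  const child = execFile(process.execPath, ['-e', CHILD_SOURCE], (err, stdout) => {
    if (err) { return callback(err); }
    callback(null, JSON.parse(stdout));
  });
  // Send the object to the child and close its stdin.
  child.stdin.end(JSON.stringify(obj));
}

// Usage
countKeysInChild({ a: 1, b: 2, c: 3 }, (err, result) => {
  if (err) { throw err; }
  console.log('keys counted in child:', result.count);
});
```

Serializing a very large object has a cost of its own, so this only pays off when the work done per key outweighs the cost of copying the data.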

Now for some sample code (it may not be perfect, so if you have a better suggestion, please correct me):

var forIn, obj;

// the "for in" loop
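forIn = function(obj, callback){
  var keys = Object.keys(obj);
  (function iterate(index) {
    // stop once every key has been visited (this also handles an empty object)
    if (index >= keys.length) { return; }
    // note: in current node versions, process.nextTick callbacks run before
    // pending IO; setImmediate can be swapped in to let IO run in between
    process.nextTick(function () {
      callback(keys[index], obj[keys[index]]);
      iterate(index + 1);
    });
  })(0);
};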

// example usage of forIn
// console.log the key-val pair in the callback
function start_processing_the_big_object(my_object) {
  forIn(my_object, function (key, val) { console.log("key: %s; val: %s;", key, val); });
}

// Let's simulate a big object here
// and call the function above once the object is created
obj = {};
(function test(obj, i) {
  obj[i--] = "blah_blah_" + i;
  if (!i) { start_processing_the_big_object(obj); }
  return (i && process.nextTick(function() { test(obj, i); }));
})(obj, 30000);

Instead of:

for (var i=0; i<len; i++) {
  doSomething(i);
}

do something like this:

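var i = 0;
(function runChunk() {
  // process at most 100 items, then yield
  var limit = Math.min(i + 100, len);
  for (; i < limit; i++) {
    doSomething(i);
  }
  if (i < len) {
    // schedule the next chunk from inside the callback; a synchronous while
    // loop here would never let the queued callbacks run. setImmediate (rather
    // than process.nextTick) lets pending IO run between chunks in current
    // node versions.
    setImmediate(runChunk);
  }
})();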

This will run 100 iterations of the loop, then return control to the system for a moment, then pick up where it left off, until it's done.

Edit: here it is adapted for your particular case (with the number of iterations performed at a time passed in as an argument):

var forin = function(obj, callback, numPerChunk){
  var keys = Object.keys(obj);
  var len = keys.length;
  var i = 0;
  // fall back to 100 items per chunk if no chunk size is given
  numPerChunk = numPerChunk || 100;
  (function runChunk() {
    var limit = Math.min(i + numPerChunk, len);
    for (; i < limit; i++) {
      callback(keys[i], obj[keys[i]], obj);
    }
    // schedule the next chunk only if keys remain
    if (i < len) {
      setImmediate(runChunk);
    }
  })();
}

The following applies to [browser] JavaScript; it may be entirely irrelevant to node.js.


Two options I know of:

  1. Use multiple timers to process the queue. They will interleave, which gives the net effect of "processing items more often" (this is also a good way to steal more CPU ;-), or,
  2. Do more work per cycle, either count-based or time-based.
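
A time-based version of option 2 might look like the sketch below. The names processQueue, handle, budgetMs and done are hypothetical, and the approach assumes that setTimeout with a delay of 0 is an acceptable way to yield (browsers may clamp that delay to a few milliseconds).

```javascript
// Hypothetical helper: handles items until the time budget for the current
// slice is used up, then yields and continues in a later timer callback.
function processQueue(items, handle, budgetMs, done) {
  var index = 0;

  function runSlice() {
    var deadline = Date.now() + budgetMs;

    // Always handle at least one item per slice so progress is guaranteed.
    while (index < items.length) {
      handle(items[index], index);
      index++;
      if (Date.now() >= deadline) {
        break;
      }
    }

    if (index < items.length) {
      setTimeout(runSlice, 0); // yield, then pick up where we left off
    } else if (done) {
      done(index);
    }
  }

  runSlice();
}

// Usage: spend roughly 10 ms per slice on the queue.
processQueue(['a', 'b', 'c'], function (item) {
  console.log('handled', item);
}, 10, function (count) {
  console.log('finished ' + count + ' items');
});
```

A larger budget finishes sooner overall but keeps the page unresponsive for longer stretches, so the value is a trade-off rather than a fixed rule.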

I am not sure if Web Workers are applicable/available.

Happy coding.
