简体   繁体   English

在Javascript(V8)中,为什么数组上的forEach比简单的for循环消耗更多的内存?

[英]In Javascript (V8) why does forEach on an array consume much more memory than a simple for loop?

I'm performing some simple data validation on a large set of data in Node.js (version v7.5.0, with a matrix of 15849x12771 entries). 我正在对Node.js中的大量数据执行一些简单的数据验证(版本v7.5.0,矩阵为15849x12771条目)。 The entire data set is in memory for now, for performance reasons. 出于性能原因,整个数据集现在都在内存中。 Therefore it is critical for me to reduce the amount of memory consumed to a theoretical minimum (each number representing 8 bytes in JS). 因此,对于我来说,将消耗的内存量减少到理论最小值(每个数字代表JS中的8个字节)至关重要。

Please compare the following ways of achieving the same thing. 请比较以下实现相同方法的方法。

with forEach forEach

  regressData.forEach((yxa, yxaIndex) => {
    yxa.forEach((yx, yxIndex) => {
      if (!_.isFinite(yx)) {
        throw new Error(`non-finite entry at [${yxaIndex}, ${yxIndex}]`);
      }
    });
  });

This consumes all of my node process' memory at 4GB+, causing it to never (until my patience runs out anyway) finish the loop (I guess it will use slower swap memory). 这消耗了我4GB +的所有节点进程内存,导致它永远不会(直到我耐心耗尽)完成循环(我猜它会使用较慢的交换内存)。

And then the identical version with a typical for : 然后同一版采用了典型for

  for (var yxai = 0, yxal = regressData.length; yxai < yxal; yxai++) {
    const yx = regressData[yxai];
    for (var yxi = 0, yxl = yx.length; yxi < yxl; yxi++) {
      if (!_.isFinite(yx[yxi])) {
        throw new Error(`non-finite entry at [${yxai}, ${yxi}]`);
      }
    }
  }

This consumes virtually no extra memory, causing the validation to be done in less than a second. 这几乎不消耗额外的内存,导致验证在不到一秒的时间内完成。

Is this behavior as expected? 这种行为是否符合预期? I had anticipated that because the forEach s have closed scopes there would be no issues of additional memory usage when compared to a traditional for loop. 我曾预料到,因为forEach已经关闭了范围,所以与传统的for循环相比,不存在额外的内存使用问题。

EDIT: standalone test 编辑:独立测试

node --expose-gc test_foreach.js node --expose-gc test_foreach.js

if (!gc) throw new Error('please run node like node --expose-gc test_foreach.js');

const _ = require('lodash');

// prepare data to work with

const x = 15849;
const y = 12771;

let regressData = new Array(x);
for (var i = 0; i < x; i++) {
  regressData[i] = new Array(y);
  for (var j = 0; j < y; j++) {
    regressData[i][j] = _.random(true);
  }
}

// for loop
gc();
const mb_pre_for = _.round(process.memoryUsage().heapUsed / 1024 / 1024, 2);
console.log(`memory consumption before for loop ${mb_pre_for} megabyte`);
validateFor(regressData);
gc();
const mb_post_for = _.round(process.memoryUsage().heapUsed / 1024 / 1024, 2);
const mb_for = _.round(mb_post_for - mb_pre_for, 2);
console.log(`memory consumption by for loop ${mb_for} megabyte`);

// for each loop
gc();
const mb_pre_foreach = _.round(process.memoryUsage().heapUsed / 1024 / 1024, 2);
console.log(`memory consumption before foreach loop ${mb_pre_foreach} megabyte`);
validateForEach(regressData);
gc();
const mb_post_foreach = _.round(process.memoryUsage().heapUsed / 1024 / 1024, 2);
const mb_foreach = _.round(mb_post_foreach - mb_pre_foreach, 2);
console.log(`memory consumption by foreach loop ${mb_foreach} megabyte`);

function validateFor(regressData) {
  for (var yxai = 0, yxal = regressData.length; yxai < yxal; yxai++) {
    const yx = regressData[yxai];
    for (var yxi = 0, yxl = yx.length; yxi < yxl; yxi++) {
      if (!_.isFinite(yx[yxi])) {
        throw new Error(`non-finite entry at [${yxai}, ${yxi}]`);
      }
    }
  }
};

function validateForEach(regressData) {
  regressData.forEach((yxa, yxaIndex) => {
    yxa.forEach((yx, yxIndex) => {
      if (!_.isFinite(yx)) {
        throw new Error(`non-finite entry at [${yxaIndex}, ${yxIndex}]`);
      }
    });
  });
};

Output: 输出:

toms-mbp-2:mem_test tommedema$ node --expose-gc test_foreach.js
memory consumption before for loop 1549.31 megabyte
memory consumption by for loop 0.31 megabyte
memory consumption before foreach loop 1549.66 megabyte
memory consumption by foreach loop 3087.9 megabyte

(V8 developer here.) This is an unfortunate consequence of how Array.forEach is implemented in V8's old execution pipeline (full codegen + Crankshaft). (这里是V8开发人员。)这是如何在V8的旧执行管道(完全代码生成器+ Crankshaft)中实现Array.forEach一个不幸结果。 In short, what happens is that under some circumstances, using forEach on an array changes the internal representation of that array to a much less memory efficient format. 简而言之,在某些情况下,在数组上使用forEach会将该数组的内部表示更改为内存效率更低的格式。 (Specifically: if the array contained only double values before, and forEach has also been used on arrays with elements of other types but not too many different kinds of objects, and the code runs hot enough to get optimized. It's fairly complicated ;-) ) (具体来说:如果数组之前只包含双值,并且forEach也用于具有其他类型元素但没有太多不同类型对象的数组,并且代码运行得足够热以进行优化。它相当复杂;-) )

With the new execution pipeline (currently behind the --future flag, will be turned on by default soon), I'm no longer seeing this additional memory consumption. 使用新的执行管道(当前位于--future标志后面,默认情况下将很快打开),我不再看到这种额外的内存消耗。

(That said, classic for loops do tend to have a small performance advantage over forEach , just because there's less going on under the hood (per ES spec). In many real workloads, the difference is too small to matter, but in microbenchmarks it's often visible. We might be able to optimize away more of forEach 's overhead in the future, but in cases where you know that every CPU cycle matters, I recommend using plain old for (var i = 0; i < array.length; i++) loops.) (也就是说,经典的for循环确实比forEach具有更小的性能优势,仅仅是因为在引擎盖下(根据ES规范)更少。在许多实际工作负载中,差异太小而无关紧要,但在微基准测试中,它是我们或许可以在将来优化掉更多forEach的开销,但是如果你知道每个CPU周期都很重要,我建议使用plain old for (var i = 0; i < array.length; i++)循环。)

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

相关问题 在V8中,为什么预分配的阵列消耗更少的内存? - In V8 why does a preallocated array consume less memory? 为什么 integer 键的 JavaScript (v8) 中的“映射”操作比“对象”慢得多? - Why “Map” manipulation is much slower than “Object” in JavaScript (v8) for integer keys? V8 在哪里实际使用原始 javascript 代码? - where does V8 actually consume the raw javascript code? 为什么 v8::JSON::Parse from v8 比 NodeJS JSON::parse 慢得多? - Why v8::JSON::Parse from v8 much slower than NodeJS JSON::parse? 为什么在v8中运行非常简单的脚本(嵌入c ++)会占用内存? - Why does running a very simple script in v8, embedded in c++, use up memory? JavaScript声明的函数在加载脚本时消耗多少内存? - How much memory does a JavaScript declared function consume on script loading? 为什么将值归零比取消定义更快(javascript v8) - Why is nulling a value faster than undefining it (javascript v8) 为什么在这种情况下v8会耗尽内存? - Why does v8 run out of memory in this situation? 命名事件处理程序是否比 JavaScript 中的匿名事件处理程序消耗更多 memory? - Are named event handlers consume more memory than anonymous ones in JavaScript? 在JavaScript中,Object.keys()。forEach()的内存效率比简单的for ... in循环要少吗? - In JavaScript is Object.keys().forEach() less memory efficient than a simple for…in loop?
 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM