
v8/chrome/node.js function inline

How can I write functions that v8 will inline?

Are there any tools to pre-compile my code to statically inline some functions? Or to statically transform functions and function calls to avoid capturing values?


Background

I noticed that the bottleneck of a JS program I wrote was a very simple function call: I was calling the function in a loop iterating millions of times, and manually inlining the function (i.e. replacing the call with the function's code) sped up the code by a few orders of magnitude.

After that I tried to study the problem for a while, but couldn't infer any rules on how function calls are optimized by v8, or on how to write efficient functions.


Sample code: iterating 1 billion times

  1. incrementing a counter:

     let counter = 0; while(counter < 1e9) ++counter;

    it takes about ~1 sec on my system, both on Google Chrome/Chromium and v8; ~14 secs iterating 1e10 times.

  2. assigning to the counter the value returned by an incrementing function:

     function incr(c) { return c+1; } let counter = 0; while(counter < 1e9) counter = incr(counter);

    it takes about ~1 sec; ~14 secs iterating 1e10 times.

  3. calling a function (declared only once) that increments the captured counter:

     let counter = 0; function incr() { ++counter; } while(counter < 1e9) incr();

    it takes about ~3 secs; ~98 secs iterating 1e10 times.

  4. calling an (arrow) function defined in the loop that increments the captured counter:

     let counter = 0; while(counter < 1e9) (()=>{ ++counter; })();

    it takes about ~24 secs (I noticed that it makes no difference whether the function is a named one or an arrow one).

  5. calling an (arrow) function defined in the loop to increment the counter without capturing:

     let counter = 0; while(counter < 1e9) { const incr = (c)=>c+1; counter = incr(counter); }

    it takes about ~22 secs.
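
The five variants above can be timed side by side with a small harness. This is only a sketch: the `time` helper, the name `incrArg` and the reduced iteration count `N` are assumptions introduced here, not part of the original measurements. Note also that each variant runs inside a callback, so the captured counter is a function-local variable rather than a top-level `let`, which may change the numbers compared to the snippets as posted.

```javascript
// Hypothetical harness; N is reduced from 1e9 so the slow variants finish quickly.
const N = 1e6;

// Runs fn once, logs the elapsed time, and returns [result, elapsedMs].
function time(label, fn) {
  const start = Date.now();
  const result = fn();
  const ms = Date.now() - start;
  console.log(`${label}: ${ms} ms`);
  return [result, ms];
}

function incrArg(c) { return c + 1; }

const results = [
  time("1. inline ++counter", () => {
    let counter = 0;
    while (counter < N) ++counter;
    return counter;
  }),
  time("2. non-capturing function", () => {
    let counter = 0;
    while (counter < N) counter = incrArg(counter);
    return counter;
  }),
  time("3. capturing function declared once", () => {
    let counter = 0;
    function incr() { ++counter; }
    while (counter < N) incr();
    return counter;
  }),
  time("4. capturing arrow created per iteration", () => {
    let counter = 0;
    while (counter < N) (() => { ++counter; })();
    return counter;
  }),
  time("5. non-capturing arrow created per iteration", () => {
    let counter = 0;
    while (counter < N) {
      const incr = (c) => c + 1;
      counter = incr(counter);
    }
    return counter;
  }),
];
```

Absolute timings will vary by engine version and machine, so only the relative ordering on a single run is meaningful.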

I'm surprised by the fact that:

  • capturing a variable slows down the code. Why? Is this a general rule? Should I always avoid capturing variables in performance-critical functions?

  • the negative effects of capturing a variable grow a lot when iterating 1e10 times. What's going on there? If I had to take a wild guess I'd say that beyond 2^31 the variable changes type, and the function wasn't optimized for this?

  • declaring a function in a loop slows down the code so much. v8 doesn't optimize the function at all? I thought it was smarter than that! I guess I should never declare functions in critical loops...

  • it makes little difference whether the function declared in a loop captures a variable or not. I guess capturing a variable is bad for optimized code, but not so bad for non-optimized code?

  • given all of this, I'm actually surprised v8 can perfectly inline long-lasting non-capturing functions. I guess these are the only reliable ones performance-wise?


Edit 1: adding some extra snippets to expose extra weirdness.

I created a new file, with the following code inside:

const start = new Date();
function incr(c) { return c+1; }
let counter = 0;
while(counter < 1e9) counter = incr(counter);
console.log( new Date().getTime() - start.getTime() );

It prints a value close to 1000, i.e. ~1 sec.

Then I declared a new variable at the end of the file. Any variable works: just append let x; to that snippet. The code now took ~12 secs to complete.

If, instead of using that incr function, you just use ++counter as in the very first snippet, the extra variable makes the performance degrade from ~1 sec to ~2.5 secs. Putting these snippets into functions, declaring other variables or changing the order of some statements sometimes improves the performance, while other times degrades it even further.
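
Since moving the snippets into functions is one of the changes that shifts the timings, here is a sketch of that variant. The wrapper name `run` and the reduced limit are made up for illustration. The loop lives inside a function, so `counter` is a function-local variable instead of a top-level `let`. Whether this makes the timing less sensitive to unrelated declarations such as the trailing `let x;` is an assumption to verify on your own engine, not a guarantee.

```javascript
function incr(c) { return c + 1; }

// Hypothetical wrapper: keeps counter local to the function.
function run(limit) {
  let counter = 0;
  while (counter < limit) counter = incr(counter);
  return counter;
}

const start = Date.now();
const total = run(1e7); // reduced from 1e9 to keep the sketch quick
console.log(Date.now() - start, "ms");

let x; // the extra trailing declaration from the experiment above
```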

  • WTF?

  • I knew about weird effects like this one, and I've read a bunch of guides on how to optimize JS for v8. Still: WTF?!

  • I played for a bit with the bottleneck of the JS program that made me start this research. I saw a difference of more than 4 orders of magnitude between implementations that I wouldn't have expected to be any different. I'm currently convinced that the performance of number-crunching algorithms in v8 is completely unpredictable, and I'm going to rewrite the bottleneck in C and expose it as a function to v8.

  1. calling a (lambda) function defined in the loop that increments the captured counter
  2. calling a (lambda) function defined in the loop to increment the counter without capturing

Why do you think that creating 1 billion!!!!! identical functions in a loop could be a good idea? Especially if you only call them once (inside this loop) and then throw them away.

Actually I'm impressed by how efficiently the v8 engine handles this insane task. I would have thought that it would take at least a few minutes to perform. Again: we're talking about creating 1 billion functions, and then calling each of them once.
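
That a function expression inside a loop body semantically yields a new function object on every iteration can be checked directly. A minimal sketch (the array `created` exists only for illustration):

```javascript
const created = [];
for (let i = 0; i < 3; i++) {
  // Evaluating the arrow expression yields a fresh function object each time.
  created.push(() => i);
}

// Distinct objects, even though the source text of each function is identical.
console.log(created[0] === created[1]); // false

// Each closure captured its own per-iteration binding of i.
console.log(created.map((f) => f()));
```

An engine may avoid some of this allocation work when it optimizes the loop, but that is an implementation detail rather than something the language promises.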

the negative effects of capturing a variable grow a lot when iterating 1e10 times. What's going on there? If I had to take a wild guess I'd say that beyond 2^31 the variable changes type, and the function wasn't optimized for this?

Right: beyond 2^31 it's no longer an int32 that you're dealing with but a 64-bit float, so all of a sudden the type has changed => the code gets deoptimized.
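
The int32 boundary mentioned here can be observed from plain JavaScript, without relying on V8 internals. The sketch below only shows that values from 2^31 upward no longer fit a signed 32-bit integer. How a particular V8 version represents the counter internally, and whether it actually deoptimizes at that point, is not something this code proves.

```javascript
const maxInt32 = 2 ** 31 - 1; // 2147483647

// `| 0` converts its operand to a signed 32-bit integer (ToInt32).
const fits = (maxInt32 | 0) === maxInt32;          // still representable
const wraps = ((maxInt32 + 1) | 0) === -(2 ** 31); // wraps to the minimum int32

// As an ordinary Number (64-bit float) the value is still exact.
const exactAsDouble = maxInt32 + 1 === 2147483648;

console.log({ fits, wraps, exactAsDouble });
```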

declaring a function in a loop slows down the code so much. v8 doesn't optimize the function at all? I thought it was smarter than that! I guess I should never use lambdas in critical loops

A function is considered for optimization after about 100-150 calls. It makes no sense to optimize every last function that's only called once or twice.

it makes little difference if the function declared in a loop captures a variable or not. I guess capturing a variable is bad for optimized code, but not so bad for not optimized one?

Yes, accessing a captured variable takes a tiny bit longer than accessing a local variable, but that's not the point here, neither for optimized nor for non-optimized code. The point is still that you create 1 billion functions in a loop.

Conclusion: create the function once before the loop, and then call it in the loop. Then it should not have any significant performance impact whether you're passing or capturing the variables.
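
The conclusion can be sketched as follows, with both the parameter-passing and the capturing style defined once before their loops. The iteration count is reduced for illustration, and the names `LIMIT`, `incrArg` and `incrCaptured` are made up here.

```javascript
const LIMIT = 1e6; // reduced from 1e9

// Style A: pass the value in, get the new value back.
const incrArg = (c) => c + 1;
let a = 0;
while (a < LIMIT) a = incrArg(a);

// Style B: capture the counter; the function is still created only once.
let b = 0;
const incrCaptured = () => { ++b; };
while (b < LIMIT) incrCaptured();

console.log(a, b);
```

In both styles only one function object is ever allocated, so the loop body consists of a call and an increment and nothing else.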


 