V8 Engine Voodoo: Why is this faster / slower?

Original post: 2013-06-27 12:56:12. Tags: javascript, performance, v8

I am currently working on an image editor and stumbled upon some weird behaviour regarding pixel manipulation and/or function calls in V8.

http://jsperf.com/canvas-pixelwise-manipulation-performance

There are two test cases. Both are meant to manipulate the image data of an in-memory canvas to increase its brightness, so they have to iterate over every pixel and manipulate the 4 color values of each pixel.

Case 1

Case 1 does "1 function call in total": it passes the context and the imageData to a function, which then iterates over the pixels and manipulates the data. Everything happens in one function.

Case 2

Case 2 does "1 function call per pixel": it iterates over the pixels and calls a method for every pixel, which then manipulates the imageData for that pixel. In this case, that results in 250000 additional function calls.
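The original jsperf code is not reproduced here, so the following is only a minimal sketch of what the two cases might look like. It assumes a plain object with a Uint8ClampedArray as a stand-in for canvas ImageData, and it brightens only the red, green and blue bytes (leaving alpha alone). createImageData, brightenAllInOne and brightenPerPixel are hypothetical names; only manipulatePixel is a name taken from the discussion.

```javascript
// Hypothetical stand-in for canvas ImageData: 4 bytes (RGBA) per pixel.
function createImageData(width, height) {
  return { width, height, data: new Uint8ClampedArray(width * height * 4) };
}

// Case 1: one call in total. The loop and the pixel math share one function.
function brightenAllInOne(imageData, amount) {
  const data = imageData.data;
  for (let i = 0; i < data.length; i += 4) {
    data[i] += amount;     // red (Uint8ClampedArray clamps at 255)
    data[i + 1] += amount; // green
    data[i + 2] += amount; // blue
    // data[i + 3] is alpha and is left unchanged in this sketch
  }
}

// Case 2: one call per pixel. The loop delegates each pixel to a helper.
function manipulatePixel(data, offset, amount) {
  data[offset] += amount;
  data[offset + 1] += amount;
  data[offset + 2] += amount;
}

function brightenPerPixel(imageData, amount) {
  const data = imageData.data;
  for (let i = 0; i < data.length; i += 4) {
    manipulatePixel(data, i, amount);
  }
}

// Example usage: 500 x 500 is an assumed size that yields 250000 pixels.
const image = createImageData(500, 500);
brightenAllInOne(image, 10);
brightenPerPixel(image, 10);
```

Both functions do the same work per pixel; the only structural difference is whether the per-pixel arithmetic sits behind a function call.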

My expectation

I would expect case 1 to be a lot faster than case 2, since case 2 makes 250000 additional function calls.

The result

In Chrome, it's exactly the other way around. If I make 250000 additional function calls, it's faster than a single function call that handles all the image manipulation.

My question: WHY?

2 answers

Neither test case manipulates any canvas, and defining a function inside the benchmark loop doesn't really make sense. What you want is static functions that are never re-created, so that once the JIT has optimized them, they stay optimized. You don't want to measure the overhead of creating a function, because a real application would define the function only once.
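To make the distinction concrete, here is a rough sketch of the two benchmark shapes. It is not the jsperf code: runWithStaticFunction, runWithRecreatedFunction and timeRuns are hypothetical names, the 500 x 500 buffer size is an assumption, and how much the re-creation actually costs depends on the engine version.

```javascript
// Defined once, outside anything that is timed.
function manipulatePixel(data, offset, amount) {
  data[offset] += amount;
  data[offset + 1] += amount;
  data[offset + 2] += amount;
}

// Preferred timed body: it only calls the already existing function.
function runWithStaticFunction(data, amount) {
  for (let i = 0; i < data.length; i += 4) {
    manipulatePixel(data, i, amount);
  }
}

// Discouraged timed body: a new function object is created on every run,
// so its creation is measured together with the pixel work.
function runWithRecreatedFunction(data, amount) {
  const manipulatePixelLocal = function (d, offset, amt) {
    d[offset] += amt;
    d[offset + 1] += amt;
    d[offset + 2] += amt;
  };
  for (let i = 0; i < data.length; i += 4) {
    manipulatePixelLocal(data, i, amount);
  }
}

// Hypothetical timing helper; jsperf used its own benchmarking machinery.
function timeRuns(body, runs) {
  const data = new Uint8ClampedArray(500 * 500 * 4);
  const start = Date.now();
  for (let r = 0; r < runs; r++) {
    body(data, 1);
  }
  return Date.now() - start; // elapsed milliseconds
}
```

Calling timeRuns(runWithStaticFunction, 100) and timeRuns(runWithRecreatedFunction, 100) gives two rough numbers to compare; a single run of either is too noisy to draw conclusions from.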

Once you fix the benchmark code, they should run at equal speed because the manipulatePixel function will get inlined.

http://jsperf.com/canvas-pixelwise-manipulation-performance/4

I have also created another jsperf in which I purposefully manipulate V8's heuristics* so that the manipulatePixel function is not inlined:

http://jsperf.com/canvas-pixelwise-manipulation-performance/5

As you can see, it's now 50% slower. The only difference between the 2 jsperfs is the huge comment in the manipulatePixel function.

*V8 looks at the raw textual size of the function (including comments) as a heuristic in its inlining decision.
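The "raw textual size" of a function can be inspected from script with Function.prototype.toString, which returns the function's source text, comments included. The sketch below only shows how a comment changes that size; it does not show V8's internal decision, and whether a current V8 still uses this heuristic is not something it can demonstrate. smallManipulatePixel and paddedManipulatePixel are hypothetical names, and the comment is far shorter than the one used in the jsperf.

```javascript
// Same behaviour, no comment in the body.
function smallManipulatePixel(data, offset, amount) {
  data[offset] += amount;
  data[offset + 1] += amount;
  data[offset + 2] += amount;
}

// Same behaviour, but padded with a comment standing in for the
// "huge comment" of the second jsperf.
function paddedManipulatePixel(data, offset, amount) {
  /* This comment does nothing at run time. It only makes the source
     text of the function longer, which is the property the answer
     says the inlining heuristic looks at. Imagine many more lines
     like this one in the real benchmark. */
  data[offset] += amount;
  data[offset + 1] += amount;
  data[offset + 2] += amount;
}

// Source length in characters, comments included.
const smallSize = smallManipulatePixel.toString().length;
const paddedSize = paddedManipulatePixel.toString().length;
console.log(smallSize, paddedSize);
```

A practical consequence, if the heuristic applies to the engine in use, is that long comments are better placed above a hot function than inside its body.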

I'm not all that familiar with V8's optimization wizardry, but I'd say that case 2 leaves more room for the V8 engine to rewrite the code.
At first glance case 1 should perform better, but it doesn't leave much room for V8 to work its magic.
Though there is only 1 function, a call object is created, and within that function object's scope a couple of variables are declared and a huge object is processed.
The second case, though, might just be transformed into a loop, or even byte-shifts, thus eliminating the need for function objects and scopes.
With the scope/function omitted, your variables (arguments) needn't be copied either, so there are no pesky object references left to cause any overhead.

In addition to variables being copied, and references, there's also scope-scanning to consider: Math.abs called from within a function is (marginally) slower than it is in the global scope. I don't know if this is true or not, but I have a sneaking suspicion that masking variables that were declared in a higher scope might impact performance, too.
You're also using width and height in the one-function approach, which look to me as though they are implied globals. This causes a scope-scan on every iteration of the loops, which will probably cause more drag than those arguments and Math.* calls...
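One way to rule out the implied-globals concern is to read width and height into local variables once, before the loops. This is a minimal sketch, not the code from the question: brightenRows is a hypothetical name, the plain object stands in for canvas ImageData, and whether this changes performance measurably is not claimed here.

```javascript
function brightenRows(imageData, amount) {
  // Locals declared in this function: no lookup outside the function is
  // needed for width, height or data inside the loops.
  const { data, width, height } = imageData;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const offset = (y * width + x) * 4; // 4 bytes (RGBA) per pixel
      data[offset] += amount;     // red
      data[offset + 1] += amount; // green
      data[offset + 2] += amount; // blue
    }
  }
  return imageData;
}

// Example usage with a stand-in for ImageData (3 x 2 pixels).
const sample = { width: 3, height: 2, data: new Uint8ClampedArray(3 * 2 * 4) };
brightenRows(sample, 5);
```

In strict mode, assigning to an undeclared variable throws a ReferenceError, which helps catch implied globals before they reach a benchmark.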