
V8 Engine Voodoo: Why is this faster / slower?

I am currently working on an image editor and stumbled over some weird behaviour regarding pixel manipulation and/or function calls in V8.

http://jsperf.com/canvas-pixelwise-manipulation-performance

There are two test cases. Both manipulate the image data of an in-memory canvas to increase the brightness, so they have to iterate over every pixel and manipulate the 4 color values of each pixel.

Case 1

Case 1 does "1 function call in total": it passes the context and the imageData to a function, which then iterates over the pixels and manipulates the data, all in one function.

Case 2

Case 2 does "1 function call per pixel": it iterates over the pixels and calls a method for every pixel, which then manipulates the imageData for the given pixel. This results in (in this case) 250000 additional function calls.
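The two cases can be sketched roughly as follows. This is not the original jsperf code: the 500x500 buffer size (which gives 250000 pixels), the names `brightenAll` and `brightenPerPixel`, and the choice to leave the alpha channel untouched are all assumptions made for illustration. A plain `Uint8ClampedArray` stands in for the canvas `imageData.data` so the sketch runs outside a browser.

```javascript
// Hypothetical stand-in for the benchmark data: 500 x 500 RGBA pixels.
const WIDTH = 500;
const HEIGHT = 500;

function makePixels() {
  // Uint8ClampedArray is the type behind ImageData.data; writes clamp to 0..255.
  return new Uint8ClampedArray(WIDTH * HEIGHT * 4).fill(100);
}

// Case 1: one function call in total; the loop body edits the channels directly.
function brightenAll(data, amount) {
  for (let i = 0; i < data.length; i += 4) {
    data[i] += amount;     // red
    data[i + 1] += amount; // green
    data[i + 2] += amount; // blue
    // data[i + 3] is alpha and is left alone in this sketch (assumption)
  }
}

// Case 2: one function call per pixel.
function manipulatePixel(data, i, amount) {
  data[i] += amount;
  data[i + 1] += amount;
  data[i + 2] += amount;
}

function brightenPerPixel(data, amount) {
  for (let i = 0; i < data.length; i += 4) {
    manipulatePixel(data, i, amount); // 250000 calls for a 500 x 500 buffer
  }
}

const caseOne = makePixels();
const caseTwo = makePixels();
brightenAll(caseOne, 50);
brightenPerPixel(caseTwo, 50);
```

Both functions produce the same pixel data; the only difference is whether the per-pixel work sits behind a call.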

My expectation

I would expect case 1 to be a lot faster than case 2, since case 2 makes 250000 additional function calls.

The result

In Chrome, it's exactly the other way around. If I do 250000 additional function calls, it's faster than one single function call handling all image manipulations.

My question: WHY?

Neither code manipulates any canvas, and defining a function inside the benchmark loop doesn't really make sense. What you want is static functions that are never re-created, so that once the JIT has optimized them, they stay optimized. You don't want to measure the overhead of creating a function, because a real application would only define the function once.

Once you fix the benchmark code, they should run at equal speed because the manipulatePixel function will get inlined.
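A minimal sketch of the difference between a function defined once and one re-created on every run is shown here. The names `runWithStaticFunction`, `runWithRecreatedFunction`, and `makeHelper` are hypothetical and do not come from the jsperf; the snippet only demonstrates that a function expression yields a new function object each time it is evaluated, and makes no timing claim.

```javascript
// Defined once: every run calls the same function object.
function manipulatePixel(data, i, amount) {
  data[i] += amount;
  data[i + 1] += amount;
  data[i + 2] += amount;
}

function runWithStaticFunction(data, amount) {
  for (let i = 0; i < data.length; i += 4) {
    manipulatePixel(data, i, amount);
  }
}

// For contrast: the helper is re-created each time the run starts,
// so the benchmark also measures function creation.
function runWithRecreatedFunction(data, amount) {
  const localManipulatePixel = function (d, i, amt) {
    d[i] += amt;
    d[i + 1] += amt;
    d[i + 2] += amt;
  };
  for (let i = 0; i < data.length; i += 4) {
    localManipulatePixel(data, i, amount);
  }
}

// Hypothetical factory: each call evaluates the function expression again
// and returns a distinct function object.
function makeHelper() {
  return function (d, i, amt) {
    d[i] += amt;
  };
}
const distinctObjects = makeHelper() !== makeHelper();

const staticResult = new Uint8ClampedArray(16).fill(10);
const recreatedResult = new Uint8ClampedArray(16).fill(10);
for (let run = 0; run < 3; run++) {
  runWithStaticFunction(staticResult, 5);
  runWithRecreatedFunction(recreatedResult, 5);
}
```

The results are identical; what changes is how many function objects the engine has to deal with over the course of the benchmark.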

http://jsperf.com/canvas-pixelwise-manipulation-performance/4


I have also created another jsperf where I purposefully manipulate V8 heuristics* so that the manipulatePixel function is not inlined:

http://jsperf.com/canvas-pixelwise-manipulation-performance/5


As you can see, it's now 50% slower. The only difference between the 2 jsperfs is the huge comment in the manipulatePixel function.


*V8 looks at the raw textual size of the function (including comments) as a heuristic in its inlining decision.
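To see what "raw textual size including comments" means in practice, the sketch below compares the source length of two functions that differ only by a comment. The names `plainPixel` and `paddedPixel` are hypothetical. `Function.prototype.toString` returns the function's source text, comments included, so it gives a rough view of the size such a heuristic would look at. The snippet does not show what any particular V8 version actually does with that number, and newer engine versions may use a different measure.

```javascript
function plainPixel(data, i, amount) {
  data[i] += amount;
}

function paddedPixel(data, i, amount) {
  /*
   * Filler comment. It changes nothing about what the function does,
   * but it makes the source text of the function longer. A heuristic
   * that counts raw characters would treat this function as larger
   * than plainPixel, even though the executable code is the same.
   * Filler filler filler filler filler filler filler filler filler.
   * Filler filler filler filler filler filler filler filler filler.
   */
  data[i] += amount;
}

// Source text length, comments included.
const plainSize = plainPixel.toString().length;
const paddedSize = paddedPixel.toString().length;
console.log("plain:", plainSize, "padded:", paddedSize);

// Same behaviour despite the size difference.
const first = new Uint8ClampedArray(4).fill(100);
const second = new Uint8ClampedArray(4).fill(100);
plainPixel(first, 0, 20);
paddedPixel(second, 0, 20);
```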

I'm not all that familiar with V8's optimization wizardry, but I'd say that case 2 leaves more room for the V8 engine to rewrite the code.
At first glance case 1 should perform better, but it doesn't leave much room for V8 to work its magic.
Though there is only 1 function, a call object is created and, within that function's scope, a couple of variables are declared and a huge object is processed.
The second case, though, might just be transformed into a loop, or even byte-shifts, thus eliminating the need for function objects and scopes.
With the scope/function omitted, your variables (arguments) needn't be copied either, so there are no pesky object references left to cause any overhead.

In addition to variables being copied, and references, there's also scope-scanning to consider: Math.abs called from within a function is (marginally) slower than it is in the global scope. I don't know if this is true or not, but I have a sneaking suspicion that masking variables that were declared in a higher scope might impact performance, too.
You're also using width and height in the one-function approach, which look to me as though they are implied globals. This causes a scope-scan on every iteration of the loops, which will probably cause more drag than those arguments and Math.* calls...
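If width and height really are implied globals, one way to rule that out as a factor is to read them into local variables before the loops. The sketch below assumes a plain object with `width`, `height`, and `data` fields standing in for a canvas ImageData object; the function name `brighten` is hypothetical, and the snippet makes no claim about how much (if anything) this changes the timing.

```javascript
// Hypothetical stand-in for an ImageData object: 4 x 3 RGBA pixels.
const image = {
  width: 4,
  height: 3,
  data: new Uint8ClampedArray(4 * 3 * 4).fill(100),
};

function brighten(imageData, amount) {
  // Locals declared inside the function: no lookup outside its own scope
  // is needed for width, height, or data inside the loops.
  const width = imageData.width;
  const height = imageData.height;
  const data = imageData.data;

  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4; // index of the pixel's red channel
      data[i] += amount;
      data[i + 1] += amount;
      data[i + 2] += amount;
    }
  }
}

brighten(image, 20);
```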
