V8 Engine Voodoo: Why is this faster / slower?
I am currently working on an image editor and stumbled over some weird behaviour regarding pixel manipulation and/or function calls in V8.
http://jsperf.com/canvas-pixelwise-manipulation-performance
There are two test cases. Both should manipulate the image data of an in-memory canvas to increase the brightness, so they have to iterate over every pixel and manipulate the 4 color values of each pixel.
Case 1 does "1 function call in total": it passes the context and the imageData to a function, which then iterates over the pixels and manipulates the data, all in one function.
Case 2 does "1 function call per pixel": it iterates over the pixels and calls a method for every pixel, which then manipulates the imageData for the given pixel. This results in (in this case) 250000 additional function calls.
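The jsperf code itself is not reproduced in this thread, so the following is only a minimal sketch of what the two cases might look like. It works on a plain Uint8ClampedArray (the type of imageData.data) instead of a real canvas. The 500x500 size, the helper names makePixels, brightenAll and brightenPerPixel, and the choice to leave alpha untouched are all assumptions; only the name manipulatePixel comes from the discussion.

```javascript
// Assumption: a 500x500 image, which gives the 250000 pixels mentioned above.
const SKETCH_WIDTH = 500;
const SKETCH_HEIGHT = 500;

// Stand-in for imageData.data: 4 bytes (R, G, B, A) per pixel, all set to 100.
function makePixels() {
  return new Uint8ClampedArray(SKETCH_WIDTH * SKETCH_HEIGHT * 4).fill(100);
}

// Case 1: one function call in total; the loop changes the data directly.
function brightenAll(data, amount) {
  for (let i = 0; i < data.length; i += 4) {
    data[i] += amount;     // red
    data[i + 1] += amount; // green
    data[i + 2] += amount; // blue
    // Alpha (data[i + 3]) is left unchanged in this sketch.
  }
}

// Case 2: one function call per pixel.
function manipulatePixel(data, offset, amount) {
  data[offset] += amount;
  data[offset + 1] += amount;
  data[offset + 2] += amount;
}

function brightenPerPixel(data, amount) {
  for (let i = 0; i < data.length; i += 4) {
    manipulatePixel(data, i, amount);
  }
}

const pixelsA = makePixels();
const pixelsB = makePixels();
brightenAll(pixelsA, 20);
brightenPerPixel(pixelsB, 20);
console.log(pixelsA[0], pixelsB[0]); // both arrays now hold the same values
```

Both functions do the same work per pixel; the only structural difference is the extra call in case 2.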
I would expect case 1 to be a lot faster than case 2, since case 2 is doing 250000 additional function calls.
In Chrome, it's exactly the other way around. If I do 250000 additional function calls, it's faster than one single function call handling all image manipulations.
Neither test case actually manipulates a canvas, and defining a function inside the benchmark loop doesn't really make sense. What you want is static functions that are never re-created, so that once the JIT has optimized them, they stay optimized. You don't want to measure the overhead of creating a function, because a real application would only define the function once.
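As a rough illustration of "define once, measure many times" outside jsperf, here is a sketch that times a statically defined function against one that is re-created on every run. The helper timeRuns, the functions runStatic and runRecreating, and the buffer benchData are hypothetical; timing uses Date.now(), which is coarse, and the numbers will vary by engine and version, so treat the output as indicative only.

```javascript
// Defined once, outside anything that is timed, so the JIT can optimize it
// and keep it optimized across runs.
function manipulatePixel(data, offset, amount) {
  data[offset] += amount;
  data[offset + 1] += amount;
  data[offset + 2] += amount;
}

const benchData = new Uint8ClampedArray(500 * 500 * 4);

// Uses the static function above.
function runStatic() {
  for (let i = 0; i < benchData.length; i += 4) {
    manipulatePixel(benchData, i, 1);
  }
}

// What the answer advises against: a fresh function object is created
// on every run, so function creation is part of what gets measured.
function runRecreating() {
  const localManipulate = function (data, offset, amount) {
    data[offset] += amount;
    data[offset + 1] += amount;
    data[offset + 2] += amount;
  };
  for (let i = 0; i < benchData.length; i += 4) {
    localManipulate(benchData, i, 1);
  }
}

// Hypothetical helper: calls fn `runs` times and returns elapsed milliseconds.
function timeRuns(fn, runs) {
  const start = Date.now();
  for (let r = 0; r < runs; r++) {
    fn();
  }
  return Date.now() - start;
}

const staticMs = timeRuns(runStatic, 20);
const recreatingMs = timeRuns(runRecreating, 20);
console.log({ staticMs, recreatingMs });
```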
Once you fix the benchmark code, they should run at equal speed because the manipulatePixel function will get inlined.
http://jsperf.com/canvas-pixelwise-manipulation-performance/4
I have also created another jsperf where I purposefully manipulate V8's heuristics* so that the manipulatePixel function is not inlined:
http://jsperf.com/canvas-pixelwise-manipulation-performance/5
As you can see, it's now 50% slower. The only difference between the 2 jsperfs is the huge comment in the manipulatePixel function.
*V8 looks at the raw textual size of the function (including comments) as a heuristic in its inlining decision.
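To make the footnote concrete: Function.prototype.toString returns a function's source text, comments included, so its length gives a rough idea of the "raw textual size" being described. This sketch only illustrates that measure. It does not reproduce V8's actual inlining decision, and the heuristic may well differ between V8 versions. The two function names are hypothetical.

```javascript
// Same body twice; the second copy only adds a comment.
function manipulatePixelShort(data, offset, amount) {
  data[offset] += amount;
  data[offset + 1] += amount;
  data[offset + 2] += amount;
}

function manipulatePixelCommented(data, offset, amount) {
  /* Imagine a very long comment here: paragraphs of explanation, a licence
     header, commented-out code, and so on. None of it changes what the
     function does, but all of it counts towards the source text size. */
  data[offset] += amount;
  data[offset + 1] += amount;
  data[offset + 2] += amount;
}

// toString() includes the comment, so the second function is "bigger".
const shortSize = manipulatePixelShort.toString().length;
const commentedSize = manipulatePixelCommented.toString().length;
console.log(shortSize, commentedSize);
```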
I'm not all too familiar with V8's optimization wizardry, but I'd say that case 2 leaves more room for the V8 engine to rewrite the code.
At first glance, case 1 should perform better, but it doesn't leave much room for V8 to work its magic.
Though there is only 1 function, a call object is created and, within that function object's scope, a couple of variables are declared and a huge object is processed.
The second case, though, might just be transformed into a loop, or even byte-shifts, thus eliminating the need for function objects and scopes.
In addition to the scope/function being omitted, your variables (arguments) needn't be copied, so there are no pesky object references left to cause any overhead.
In addition to variables being copied, and references, there's also scope-scanning to consider: Math.abs called from within a function is (marginally) slower than it is in the global scope. I don't know if this is true or not, but I have a sneaking suspicion that masking variables that were declared in a higher scope might impact performance, too.
You're also using width and height in the one-function approach, which look to me as though they are implied globals. This causes a scope-scan on every iteration of the loops, which will probably cause more drag than those arguments and Math.* calls...
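One way to rule the implied globals out is to read width and height into local variables once, before the loops. The sketch below assumes an object shaped like ImageData (width, height, data); brightenImage and fakeImageData are hypothetical names, and whether this makes a measurable difference depends on the engine.

```javascript
function brightenImage(imageData, amount) {
  // Locals declared with const, so the loops never have to resolve
  // width or height through the global scope.
  const width = imageData.width;
  const height = imageData.height;
  const data = imageData.data;

  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const offset = (y * width + x) * 4;
      data[offset] += amount;     // red
      data[offset + 1] += amount; // green
      data[offset + 2] += amount; // blue
    }
  }
  return imageData;
}

// Stand-in for the result of ctx.getImageData(...), so the sketch runs
// without a canvas.
const fakeImageData = {
  width: 4,
  height: 2,
  data: new Uint8ClampedArray(4 * 2 * 4).fill(250),
};
brightenImage(fakeImageData, 10);
console.log(fakeImageData.data[0], fakeImageData.data[3]);
```

Running the code in strict mode also helps catch this kind of mistake, since assigning to an undeclared variable then throws a ReferenceError instead of silently creating a global.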