GHC Performance: Why does more work take *much* less time?
Unfortunately, the entire example involves a lot of code. You can see the full module here (which still won't compile); the pseudocode function f below corresponds to the 'FIXME' tag in the hpaste.
Here is a pseudocode outline:
module Test (run) where
import Data.Vector.Unboxed as U
run m i iters = let {get q} in do print $ testWrapper iters m q
testWrapper :: forall i . Int -> Int -> i -> U.Vector i
testWrapper iters m q =
let {get test params: xs, dim, ru}
in U.map fromIntegral (iterate (f dim ru) xs !! iters)
{-# INLINE f #-}
f :: (Int, Int) -> Vector r -> Vector r -> Vector r
f dim ru = (g dim ru) . zipWith (*) ru
{-# INLINE g #-}
g :: (Int, Int) -> Vector r -> Vector r -> Vector r
g dim ru = ...
For certain parameters, this code runs in about 0.5 seconds.
I also tested changing f to f':
f' dim ru = (g dim ru)
(I simply removed the final zipWith, reducing the overall work needed.)
On the same input parameters, the modified code takes 4.5 seconds.
This occurs when compiling with optimization (using GHC 7.4.2, ghc -O2, and also with even more optimizations). The Core for the fast version is about 3000 lines, while the Core for the slow version is about 1900 lines.
This may not be much to go on, but what kind of GHC craziness could make my program slow down by an order of magnitude when I reduce the work it does? And how might I track down something like this when even my smallest test case generates over 2000 lines of Core?
Thanks
Check out the heap profile. Could it be that the "less work" version leaves some thunks unevaluated? That can lead to a large memory footprint and hurt speed through extra garbage collection.
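To make the thunk hypothesis concrete, here is a minimal sketch of how `iterate ... !! n` can leave a chain of unevaluated applications behind, next to a variant that forces each step. It is not the code from the question: `step`, `lazyIterate`, and `strictIterate` are hypothetical names, and a plain `Int` stands in for the unboxed vector so that the example needs only base. Whether this is what actually happens in the question's program is an assumption that only a profile can confirm. One way to check is to build with `ghc -O2 -prof -fprof-auto -rtsopts`, run with `+RTS -hc -s`, and render the resulting `.hp` file with `hp2ps`.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Hypothetical stand-in for one iteration of the real computation
-- (this is NOT the f or g from the question).
step :: Int -> Int
step x = (x * 3 + 1) `mod` 1000003

-- Lazy version: indexing into `iterate` does not evaluate the elements it
-- skips over, so the n-th element can be a chain of n pending `step` calls
-- that is only evaluated once the final result is demanded.
lazyIterate :: Int -> Int -> Int
lazyIterate n x0 = iterate step x0 !! n

-- Strict version: the bang pattern forces every intermediate value before
-- the next step, so no chain of thunks builds up. Assumes n >= 0.
strictIterate :: Int -> Int -> Int
strictIterate n x0 = go n x0
  where
    go :: Int -> Int -> Int
    go 0 !x = x
    go k !x = go (k - 1) (step x)

main :: IO ()
main = do
  let n = 100000
  -- Both versions compute the same value; they differ only in when the
  -- intermediate results are evaluated.
  print (lazyIterate n 7 == strictIterate n 7)
```

If the heap profile of the slow build shows a large amount of live thunk data that the fast build lacks, forcing each intermediate result in the iteration loop (as `strictIterate` does here) would be a natural thing to try.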