
GHC Performance: Why does more work take *much* less time?

Unfortunately, the entire example involves a lot of code. You can see the full module here (which still won't compile); the pseudocode function f below corresponds to the 'FIXME' tag in the hpaste.

Here is a pseudocode outline:

module Test (run) where
    -- Qualified import, so zipWith and map below do not clash with the Prelude.
    import qualified Data.Vector.Unboxed as U

    -- {get q} stands for setup code elided from this outline.
    run m i iters = let {get q} in do print $ testWrapper iters m q

    testWrapper :: forall i . Int -> Int -> i -> U.Vector i
    testWrapper iters m q =
        let {get test params: xs, dim, ru}
        in U.map fromIntegral (iterate (f dim ru) xs !! iters)

    {-# INLINE f #-}
    f :: (Int, Int) -> U.Vector r -> U.Vector r -> U.Vector r
    f dim ru = (g dim ru) . U.zipWith (*) ru

    {-# INLINE g #-}
    g :: (Int, Int) -> U.Vector r -> U.Vector r -> U.Vector r
    g dim ru = ...

For certain parameters, this code runs in about 0.5 seconds.

I also tested changing f to f':

f' dim ru = (g dim ru)

(I simply removed the final zipWith, reducing the overall work needed.)

On the same input parameters, the modified code takes 4.5 seconds.

This occurs when compiling with optimization (using GHC 7.4.2 with ghc -O2, and also with even more optimizations). The Core for the fast version is about 3000 lines, while the Core for the slow version is about 1900 lines.

This may not be much to go on, but what kind of GHC craziness could be causing my program to slow down by an order of magnitude when I reduce the work it does? How might I discover something like this when essentially my smallest test case generates over 2000 lines of Core?

Thanks

Check out the heap profile. Could it be that the "less work" version leaves some thunks unevaluated? That can lead to a large memory footprint and slow the program down through garbage collection.
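To make the thunk hypothesis concrete, here is a minimal sketch of how unevaluated work can pile up in an `iterate f x !! n` loop like the one in the question, and one way to rule it out. The names iterateStrict, iterateLazy and step are hypothetical and do not come from the question's module, and whether this is what actually slows down f' is an assumption that only a profile can confirm. The lazy version never forces the intermediate list elements, so the result is a chain of n nested applications that is only evaluated at the very end; the strict version forces each intermediate value to weak head normal form before taking the next step. For an unboxed vector, weak head normal form means all elements have been computed.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Hypothetical helper (not from the question): apply f to x exactly n times,
-- forcing every intermediate result to weak head normal form, so that no
-- chain of suspended applications can build up between steps.
iterateStrict :: Int -> (a -> a) -> a -> a
iterateStrict n f = go n
  where
    go k !x
      | k <= 0    = x
      | otherwise = go (k - 1) (f x)

-- The pattern used in the question: (!!) walks the list without forcing its
-- elements, so the selected element is a thunk nested n applications deep.
iterateLazy :: Int -> (a -> a) -> a -> a
iterateLazy n f x = iterate f x !! n

-- Hypothetical stand-in for the real per-iteration work (f dim ru).
step :: Int -> Int
step v = (3 * v + 1) `mod` 1000003

main :: IO ()
main = do
  -- Both calls compute the same value; they differ only in when the
  -- intermediate results are evaluated.
  print (iterateLazy 100000 step 1)
  print (iterateStrict 100000 step 1)
```

To compare the two variants, one option is to compile with `ghc -O2 -rtsopts` and run the program with `+RTS -s`, which prints garbage collection statistics. For an actual heap profile, build with `-prof -fprof-auto -rtsopts`, run with `+RTS -hc`, and render the resulting .hp file with hp2ps.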
