How does enumFromTo work in Haskell, and what optimizations speed up the GHC implementation vs. naive implementations?

I'm learning Haskell, and one of the exercises required that I write a function equivalent to enumFromTo.

I came up with the following two implementations:

eft' :: Enum a => a -> a -> [a]
eft' x y = go x y []
  where go a b sequence
          | fromEnum b < fromEnum a = sequence
          | otherwise = go (succ a) b (sequence ++ [a])

eft :: Enum a => a -> a -> [a]
eft x y = go x y []
  where go a b sequence
          | fromEnum b < fromEnum a = sequence
          | otherwise = go a (pred b) (b : sequence)

I had a hunch that the first version does more work, as it puts each element into a list and concatenates that to the existing sequence, while the second version prepends a single element to a list. Is this the main reason for the performance difference, are there other significant factors, or is my hunch slightly off the mark?

Testing in GHCi with :set +s gives the following on my machine (Windows 10, GHC 8.2.2, Intel i7-4770HQ):

*Lists> take 10 (eft 1 10000000)
[1,2,3,4,5,6,7,8,9,10]
(9.77 secs, 3,761,292,096 bytes)
*Lists> take 10 (eft' 1 10000000)
[1,2,3,4,5,6,7,8,9,10]
(27.97 secs, 12,928,385,280 bytes)
*Lists> take 10 (enumFromTo 1 10000000)
[1,2,3,4,5,6,7,8,9,10]
(0.00 secs, 1,287,664 bytes)

My second hunch was that take 10 (eft' 1 10000000) should perform better than take 10 (eft 1 10000000), because the latter has to build the list all the way down from 10000000 before it can return any of the values that we take. Clearly this hunch was wrong, and I'm hoping someone can explain why.

Finally, the GHC implementation is vastly more efficient than my naive implementations. I am curious to understand what other optimizations have been applied to speed it up. The answer to this similarly titled SO question shares some code that seems to be from the GHC implementation, but doesn't explain how the "nastiness" gains efficiency.

The problem with eft is that it still requires the entire list to be built, regardless of your attempt to cut it down with take 10. Tail recursion is not your friend when you want to build things lazily. What you want is guarded recursion (i.e. recursive calls right behind the relevant constructor, as in foldr, so that they can be left unevaluated when you don't need them):

eft'' :: Enum a => a -> a -> [a]
eft'' x y
    | fromEnum y < fromEnum x = []
    | otherwise = x : eft'' (succ x) y

GHCi> take 10 (eft 1 10000000)
[1,2,3,4,5,6,7,8,9,10]
(7.48 secs, 2,160,291,096 bytes)
GHCi> take 10 (eft'' 1 10000000)
[1,2,3,4,5,6,7,8,9,10]
(0.00 secs, 295,752 bytes)
GHCi> take 10 (enumFromTo 1 10000000)
[1,2,3,4,5,6,7,8,9,10]
(0.00 secs, 293,680 bytes)
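
One caveat about the guarded version above: its guard forces succ x before comparing, so on a bounded type it raises an error once the list reaches the type's last value (for instance with eft'' False True). A minimal sketch of one way around that, assuming a well-behaved Enum instance where succ increases fromEnum by one; the name eftSafe is hypothetical and this is not how GHC does it:

```haskell
-- Hypothetical bound-safe variant of eft'': the upper bound is compared
-- *before* succ is called, so succ is never applied to the last element.
eftSafe :: Enum a => a -> a -> [a]
eftSafe x y
  | fromEnum x > fromEnum y = []   -- empty range
  | otherwise               = go x
  where
    end = fromEnum y
    go a
      | fromEnum a == end = [a]              -- last element: stop here
      | otherwise         = a : go (succ a)  -- guarded recursion, still lazy
```

The recursive call still sits right behind the (:) constructor, so take 10 only forces the first ten cells, just as with eft''.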

As for eft' being worse than eft, that indeed has to do with (++). For reference, here are the definitions of take and (++) (I'm using the Report definitions rather than the GHC ones, but the slight differences don't actually matter here):

take                   :: Int -> [a] -> [a]  
take n _      | n <= 0 =  []  
take _ []              =  []  
take n (x:xs)          =  x : take (n-1) xs 

(++) :: [a] -> [a] -> [a]  
[]     ++ ys = ys  
(x:xs) ++ ys = x : (xs ++ ys)

If you hand-evaluate eft, you can see how it has to build the entire list before giving you any element:

take 3 (eft 1 5)
take 3 (go 1 5 [])
take 3 (go 1 4 (5 : []))
take 3 (go 1 3 (4 : 5 : []))
-- etc.
take 3 (1 : 2 : 3 : 4 : 5 : [])
1 : take 2 (2 : 3 : 4 : 5 : [])
1 : 2 : take 1 (3 : 4 : 5 : [])
-- etc.

At least, though, once you get past the go calls the list is ready for consumption. That is not the case with eft': the nested (++) applications still have to be dealt with, and getting through them to reach the first element is linear with respect to the length of the list:

take 3 (eft' 1 5)
take 3 (go 1 5 [])
take 3 (go 2 5 ([] ++ [1]))
take 3 (go 3 5 (([] ++ [1]) ++ [2]))
-- etc.
take 3 ((((([] ++ [1]) ++ [2]) ++ [3]) ++ [4]) ++ [5])
take 3 (((([1] ++ [2]) ++ [3]) ++ [4]) ++ [5])
take 3 ((((1 : ([] ++ [2])) ++ [3]) ++ [4]) ++ [5])
take 3 ((((1 : [2]) ++ [3]) ++ [4]) ++ [5])
take 3 (((1 : ([2] ++ [3])) ++ [4]) ++ [5])
-- etc.
take 3 (1 : ((([2] ++ [3]) ++ [4]) ++ [5]))
1 : take 2 ((([2] ++ [3]) ++ [4]) ++ [5])

It gets worse: you have to do it again with the remaining tail of the list, for every single element!

1 : take 2 ((([2] ++ [3]) ++ [4]) ++ [5])
1 : take 2 (((2 : ([] ++ [3])) ++ [4]) ++ [5])
1 : take 2 (((2 : [3]) ++ [4]) ++ [5])
1 : take 2 ((2 : ([3] ++ [4])) ++ [5])
-- etc.
1 : take 2 (2 : (([3] ++ [4]) ++ [5]))
1 : 2 : take 1 (([3] ++ [4]) ++ [5])
-- etc.

In fact, the take 10 disguises the fact that eft', unlike eft, is quadratic:

GHCi> last $ eft' 1 10000
10000
(1.83 secs, 4,297,217,200 bytes)
GHCi> last $ eft' 1 20000
20000
(7.59 secs, 17,516,804,952 bytes)
GHCi> last $ eft 1 5000000
5000000
(3.81 secs, 1,080,282,784 bytes)
GHCi> last $ eft 1 10000000
10000000
(7.51 secs, 2,160,279,232 bytes)
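
To illustrate that the quadratic cost comes from the left-nested (++) rather than from accumulating in ascending order as such, here is a rough sketch that keeps the shape of eft' but uses a function as the accumulator (a "difference list"). The name eftD is hypothetical. It should be linear like eft, but it is still tail recursive, so take 10 still has to run the whole loop first:

```haskell
-- Hypothetical variant of eft': same tail-recursive loop, but the
-- accumulator is a function from "rest of the list" to the full list.
eftD :: Enum a => a -> a -> [a]
eftD x y = go x id
  where
    end = fromEnum y
    go a acc
      | end < fromEnum a = acc []                     -- done: close with the empty tail
      | otherwise        = go (succ a) (acc . (a :))  -- "append" a in constant time
```

Each step only composes one more function, so no existing list cells are walked again; the list is assembled once, when acc is finally applied to [].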

For the sake of completeness, here is the corresponding hand-evaluation for eft'':

take 3 (eft'' 1 5)
take 3 (1 : eft'' 2 5)
1 : take 2 (eft'' 2 5) -- No need to evaluate `eft'' 2 5` to get the first element.
1 : take 2 (2 : eft'' 3 5)
1 : 2 : take 1 (eft'' 3 5)
-- etc.
1 : 2 : 3 : take 0 (eft'' 4 5) -- No need to go further.
1 : 2 : 3 : []
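
A consequence of the guarded shape, sketched here under the assumption of an unbounded type such as Integer: the same pattern works with no upper bound at all, which a tail-recursive accumulator could never do because it would have to finish the loop before returning anything. The name eftFrom is hypothetical:

```haskell
-- Hypothetical enumFrom-like function: an infinite list built by guarded
-- recursion. Each cell is produced only when a consumer asks for it.
eftFrom :: Enum a => a -> [a]
eftFrom x = x : eftFrom (succ x)

-- Only three cells of the infinite list are ever forced.
firstThree :: [Integer]
firstThree = take 3 (eftFrom 1)
```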

