Haskell foldl' poor performance with (++)

I have this code:

import Data.List

newList_bad  lst = foldl' (\acc x -> acc ++ [x*2]) [] lst
newList_good lst = foldl' (\acc x -> x*2 : acc) [] lst

These functions return lists with each element multiplied by 2:

*Main> newList_bad [1..10]
[2,4,6,8,10,12,14,16,18,20]
*Main> newList_good [1..10]
[20,18,16,14,12,10,8,6,4,2]

In ghci:

*Main> sum $ newList_bad [1..15000]
225015000
(5.24 secs, 4767099960 bytes)
*Main> sum $ newList_good [1..15000]
225015000
(0.03 secs, 3190716 bytes)

Why does newList_bad run about 200 times slower than newList_good? I understand that it's not a good solution for this task, but why is this innocent-looking code so slow?

What is this "4767099960 bytes"? Did Haskell really use 4 GiB for such a simple operation?

After compilation:

C:\1>ghc -O --make test.hs
C:\1>test.exe
225015000
Time for sum (newList_bad [1..15000]) is 4.445889s
225015000
Time for sum (newList_good [1..15000]) is 0.0025005s

There is much confusion about this issue. The usual reason given is that "repeatedly appending at the end of a list requires repeated traversals of the list and is thus O(n^2)". But it would only be that simple under strict evaluation. Under lazy evaluation everything is supposed to be delayed, which raises the question of whether these repeated traversals and appends happen at all. The adding at the end is triggered by consuming at the front, and since consuming at the front makes the list shorter, the exact timing of these actions is unclear. So the real answer is more subtle, and deals with the specific reduction steps under lazy evaluation.

The immediate culprit is that foldl' only forces its accumulator argument to weak head normal form, i.e. only until a non-strict constructor is exposed. The functions involved here are

(a:b)++c = a:(b++c)    -- does nothing with 'b', only pulls 'a' up
[]++c = c              -- so '++' only forces 1st elt from its left arg

foldl' f z [] = z
foldl' f z (x:xs) = let w=f z x in w `seq` foldl' f w xs

sum xs = sum_ xs 0     -- forces elts from its arg one by one
sum_ [] a = a
sum_ (x:xs) a = sum_ xs (a+x)

and so the actual reduction sequence is (with g = foldl' f)

sum $ foldl' (\acc x-> acc++[x^2]) []          [a,b,c,d,e]
sum $ g  []                                    [a,b,c,d,e]
      g  [a^2]                                   [b,c,d,e]
      g  (a^2:([]++[b^2]))                         [c,d,e]
      g  (a^2:(([]++[b^2])++[c^2]))                  [d,e]
      g  (a^2:((([]++[b^2])++[c^2])++[d^2]))           [e]
      g  (a^2:(((([]++[b^2])++[c^2])++[d^2])++[e^2]))   []
sum $ (a^2:(((([]++[b^2])++[c^2])++[d^2])++[e^2]))

Notice we've only performed O(n) steps so far. a^2 is immediately available here for sum's consumption, but b^2 is not. We're left with a left-nested structure of ++ expressions. The rest is best explained in this answer by Daniel Fischer. The gist of it is that to get b^2 out, O(n-1) steps will have to be performed, and the structure left in the wake of this access will still be left-nested, so the next access will take O(n-2) steps, and so on: the classic O(n^2) behavior. So the real reason is that ++ isn't forcing or rearranging its arguments enough to be efficient.
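The step counts can be made concrete with a toy model. The sketch below is only an illustration, not how GHC represents thunks: the types L and View and the functions whnf, consume, leftNested and rightNested are hypothetical names introduced here. It represents suspended appends as data, applies the two (++) equations one at a time, and counts how many applications it takes to consume the whole list from the front.

```haskell
-- Toy model (an assumption for illustration, not GHC's runtime):
-- a list expression is a literal cell or a suspended append.
data L = Nil | Cons Int L | App L L

-- What we see after evaluating to weak head normal form.
data View = VNil | VCons Int L

-- Evaluate to WHNF, counting how many (++) equations were applied.
whnf :: L -> (View, Int)
whnf Nil        = (VNil, 0)
whnf (Cons x t) = (VCons x t, 0)
whnf (App l r)  = case whnf l of
  (VNil, k)      -> let (v, k') = whnf r in (v, k + k' + 1) -- [] ++ c = c
  (VCons x t, k) -> (VCons x (App t r), k + 1)              -- (a:b) ++ c = a : (b ++ c)

-- Consume the list front to back, as sum would; return elements and total steps.
consume :: L -> ([Int], Int)
consume e = case whnf e of
  (VNil, k)      -> ([], k)
  (VCons x t, k) -> let (xs, k') = consume t in (x : xs, k + k')

-- ((([] ++ [1]) ++ [2]) ++ ...) versus ([1] ++ ([2] ++ (... ++ [])))
leftNested, rightNested :: Int -> L
leftNested  n = foldl (\acc x -> App acc (Cons x Nil)) Nil [1 .. n]
rightNested n = foldr (\x acc -> App (Cons x Nil) acc) Nil [1 .. n]

main :: IO ()
main = do
  print (snd (consume (leftNested 100)))   -- 5050
  print (snd (consume (rightNested 100)))  -- 200
```

In this model the left-nested chain needs n(n+1)/2 steps, because exposing each element walks down the whole remaining chain, while the right-nested chain needs 2n. The exact constants differ from a real run (foldl' forces each accumulator along the way), but the quadratic versus linear shape is the point.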

This is actually counter-intuitive. We could expect lazy evaluation to magically "do it" for us here. After all, we're only expressing our intent to add [x^2] to the end of the list in the future; we don't actually do it right away. So the timing here is off, but it could be made right: as we access the list, new elements would be added to it and consumed right away. If c^2 were added into the list after b^2 (space-wise) just before (time-wise) b^2 is consumed, each traversal/access would always be O(1).

This is achieved with the so-called "difference list" technique:

newlist_dl lst = foldl' (\z x-> (z . (x^2 :)) ) id lst

which, if you think about it for a moment, looks exactly the same as your ++[x^2] version. It expresses the same intent, and leaves a left-nested structure too.

The difference, as explained in that same answer by Daniel Fischer, is that a (.) chain, when first forced, rearranges itself into a right-nested ($) structure [1] in O(n) steps, after which each access is O(1) and the timing of the appends is optimal, exactly as described in the paragraph above. So we're left with overall O(n) behaviour.

[1] which is kind of magical, but it does happen. :)
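To try the difference-list version on the question's input, the accumulated function still has to be applied to an empty list to get an actual list back. A minimal sketch, assuming the question's x*2 rather than x^2 and using the hypothetical name newList_dl:

```haskell
import Data.List (foldl')

-- Difference-list variant: the accumulator is a function [Int] -> [Int]
-- that prepends everything collected so far to whatever tail it is given.
newList_dl :: [Int] -> [Int]
newList_dl lst = foldl' (\z x -> z . (x * 2 :)) id lst []
--                                               the final [] turns the
--                                               function into a list

main :: IO ()
main = do
  print (newList_dl [1 .. 10])           -- [2,4,6,8,10,12,14,16,18,20]
  print (sum (newList_dl [1 .. 15000]))  -- 225015000
```

The elements come out in the original order, as with newList_bad, but without the repeated walks through a left-nested (++) chain.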

Classic list behavior.

Recall:

(:)  -- O(1) complexity
(++) -- O(n) complexity

So you are creating an O(n^2) algorithm instead of an O(n) one.

For this common case of appending to lists incrementally, try using a dlist, or just reverse at the end.
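The "reverse at the end" option could look like the following sketch. The name newList_rev is made up for this example; it is simply the question's newList_good followed by one O(n) reverse.

```haskell
import Data.List (foldl')

-- Build the list back to front with O(1) conses, then reverse it once.
newList_rev :: [Int] -> [Int]
newList_rev = reverse . foldl' (\acc x -> x * 2 : acc) []

main :: IO ()
main = print (newList_rev [1 .. 10])  -- [2,4,6,8,10,12,14,16,18,20]
```

This keeps the whole computation linear, at the price of building the full list before any element can be consumed.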

To complement the other answers with a bit of a larger perspective: with lazy lists, using foldl' in a function that returns a list is usually a bad idea. foldl' is often useful when you are reducing a list to a strict (non-lazy) scalar value (e.g., summing a list). But when you're building a list as the result, foldr is usually better because of laziness: the : constructor is lazy, so the list's tail isn't computed until it's actually needed.

In your case:

newList_foldr lst = foldr (\x acc -> x*2 : acc) [] lst

This is actually the same as map (*2):

newList_foldr lst = map (*2) lst
map f lst = foldr (\x acc -> f x : acc) [] lst

Evaluation (using the first, map-less definition):

newList_foldr [1..10]
  = foldr (\x acc -> x*2 : acc) [] [1..10]
  = foldr (\x acc -> x*2 : acc) [] (1:[2..10])
  = 1*2 : foldr (\x acc -> x*2 : acc) [] [2..10]

This is about as far as Haskell will evaluate when newList_foldr [1..10] is forced. It only evaluates any further if the consumer of this result demands it, and then only as little as is needed to satisfy the consumer. So for example:

firstElem [] = Nothing
firstElem (x:_) = Just x

firstElem (newList_foldr [1..10])
  -- firstElem only needs to evaluate newList_foldr [1..10] enough to determine
  -- which of its subcases applies: empty list or pair.
  = firstElem (foldr (\x acc -> x*2 : acc) [] [1..10])
  = firstElem (foldr (\x acc -> x*2 : acc) [] (1:[2..10]))
  = firstElem (1*2 : foldr (\x acc -> x*2 : acc) [] [2..10])
  -- firstElem doesn't need the tail, so it's never computed!
  = Just (1*2)

This also means that the foldr-based newList_foldr can also work with infinite lists:

newList_foldr [1..] = [2,4..]
firstElem (newList_foldr [1..]) = Just 2
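The two equations above are written as identities rather than runnable code. A small self-contained program along the following lines, reusing the definitions of newList_foldr and firstElem from this answer with type signatures added for the example, can be used to check the behaviour on an infinite input:

```haskell
-- foldr-based doubling: each cons cell is produced lazily, on demand.
newList_foldr :: [Int] -> [Int]
newList_foldr lst = foldr (\x acc -> x * 2 : acc) [] lst

firstElem :: [a] -> Maybe a
firstElem []      = Nothing
firstElem (x : _) = Just x

main :: IO ()
main = do
  -- Only the demanded prefix of the infinite list is ever computed.
  print (take 5 (newList_foldr [1 ..]))    -- [2,4,6,8,10]
  print (firstElem (newList_foldr [1 ..])) -- Just 2
```

Both lines terminate because the consumer (take 5 or firstElem) stops demanding cells long before the input runs out.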

If you use foldl', on the other hand, you must always compute the whole list, which also means that you can't work on infinite lists:

firstElem (newList_good [1..])    -- doesn't terminate

firstElem (newList_good [1..10])
  = firstElem (foldl' (\acc x -> x*2 : acc) [] [1..10])
  = firstElem (foldl' (\acc x -> x*2 : acc) [] (1:[2..10]))
  = firstElem (foldl' (\acc x -> x*2 : acc) [2] [2..10])
  -- we can't short circuit here because the [2] is "inside" the foldl', so 
  -- firstElem can't see it
  = firstElem (foldl' (\acc x -> x*2 : acc) [2] (2:[3..10]))
  = firstElem (foldl' (\acc x -> x*2 : acc) [4,2] [3..10])
    ...
  = firstElem (foldl' (\acc x -> x*2 : acc) [18,16,14,12,10,8,6,4,2] (10:[]))
  = firstElem (foldl' (\acc x -> x*2 : acc) [20,18,16,14,12,10,8,6,4,2] [])
  = firstElem [20,18,16,14,12,10,8,6,4,2]
  = firstElem (20:[18,16,14,12,10,8,6,4,2])
  = Just 20

The foldr-based algorithm took 4 steps to compute firstElem (newList_foldr [1..10]), whereas the foldl'-based one took on the order of 21 steps. What's worse is that the 4 steps are a constant cost, whereas the 21 is proportional to the length of the input list: firstElem (newList_good [1..150000]) takes 300,001 steps, while firstElem (newList_foldr [1..150000]) takes the same 4 steps, as does firstElem (newList_foldr [1..]) for that matter.

Note also that firstElem (newList_foldr [1..10]) runs in constant space as well as constant time (it has to; you need more than constant time to allocate more than constant space). The pro-foldl truism from strict languages, "foldl is tail recursive and runs in constant space, foldr is not tail recursive and runs in linear space or worse", is not true in Haskell.
