foldr & foldl Haskell explanation

We have been asked to answer whether foldr or foldl is more efficient.

I am not sure, but doesn't it depend on what I am doing, especially what I want to achieve with my functions?

Is there a difference from case to case, or can one say that foldr or foldl is better, because...

Is there a general answer?

Thanks in advance!

A fairly canonical source on this question is Foldr Foldl Foldl' on the Haskell Wiki. In summary, depending on how strictly you can combine elements of the list and what the result of your fold is, you may decide to choose either foldr or foldl'. It's rarely the right choice to choose foldl.

Generally, this is a good example of how you have to keep in mind the laziness and strictness of your functions in order to compute efficiently in Haskell. In strict languages, tail-recursive definitions and TCO are the name of the game, but those kinds of definitions may be too "unproductive" (not lazy enough) for Haskell, leading to the production of useless thunks and fewer opportunities for optimization.

When to choose foldr

If the operation that consumes the result of your fold can operate lazily and your combining function is non-strict in its right argument, then foldr is usually the right choice. The quintessential example of this is the nonfold. First we see that (:) is non-strict on the right:

head (1 : undefined)
1

Then here's nonfold written using foldr:

nonfoldr :: [a] -> [a]
nonfoldr = foldr (:) []

Since (:) creates lists lazily, an expression like head . nonfoldr can be very efficient, requiring just one folding step and forcing just the head of the input list.

head (nonfoldr [1,2,3])
head (foldr (:) [] [1,2,3])
head (1 : foldr (:) [] [2,3])
1
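One way to check this laziness for yourself is to run the same fold over an infinite list. The sketch below is only an illustration: nonfoldr is the definition from above, while main and the sample inputs are assumptions made for the demo.

```haskell
module Main where

-- Identity on lists, written as a right fold (same definition as above).
nonfoldr :: [a] -> [a]
nonfoldr = foldr (:) []

main :: IO ()
main = do
  -- Only the first cons cell is ever demanded, so this terminates
  -- even though the input list is infinite.
  print (head (nonfoldr [1 :: Int ..]))
  -- take demands just three folding steps.
  print (take 3 (nonfoldr [10 :: Int, 20 ..]))
```

If foldr had to traverse its whole input before producing anything, neither line would return.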

Short-circuiting

A very common place where laziness wins out is in short-circuiting computations. For instance, lookup :: Eq a => a -> [a] -> Bool (the Prelude calls this function elem) can be more productive by returning the moment it sees a match.

lookupr :: Eq a => a -> [a] -> Bool
lookupr x = foldr (\y inRest -> if x == y then True else inRest) False

The short-circuiting occurs because we discard inRest in the first branch of the if. The same thing implemented in foldl' can't do that.

lookupl :: Eq a => a -> [a] -> Bool
lookupl x = foldl' (\wasHere y -> if wasHere then wasHere else x == y) False

lookupr 1 [1,2,3,4]
foldr fn False [1,2,3,4]
if 1 == 1 then True else (foldr fn False [2,3,4])
True

lookupl 1 [1,2,3,4]
foldl' fn False [1,2,3,4]
foldl' fn True [2,3,4]
foldl' fn True [3,4]
foldl' fn True [4]
foldl' fn True []
True
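The difference between the two traces becomes observable on an infinite list. The following is a minimal sketch, assuming the two definitions above plus the Data.List import that foldl' needs; main and the inputs are made up for the demo.

```haskell
module Main where

import Data.List (foldl')

-- Right-fold membership test: stops at the first match.
lookupr :: Eq a => a -> [a] -> Bool
lookupr x = foldr (\y inRest -> if x == y then True else inRest) False

-- Left-fold membership test: always walks the whole list.
lookupl :: Eq a => a -> [a] -> Bool
lookupl x = foldl' (\wasHere y -> if wasHere then wasHere else x == y) False

main :: IO ()
main = do
  -- Terminates: the match on 3 discards the rest of the fold.
  print (lookupr 3 [1 :: Int ..])
  -- Correct too, but only because the list is finite;
  -- lookupl 3 [1 ..] would never return.
  print (lookupl 3 [1 :: Int .. 1000000])
```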

When to choose foldl'

If the consuming operation or the combining function requires that the entire list is processed before it can proceed, then foldl' is usually the right choice. Often the best check for this situation is to ask yourself whether your combining function is strict: if it's strict in the first argument then your whole list must be forced anyway. The quintessential example of this is sum

sum :: Num a => [a] -> a
sum = foldl' (+) 0

Since (1 + 2) cannot be reasonably consumed prior to actually doing the addition (Haskell isn't smart enough to know that 1 + 2 >= 1 without first evaluating 1 + 2), we don't get any benefit from using foldr. Instead, we'll use the strict combining property of foldl' to make sure that we evaluate things as eagerly as needed

sum [1,2,3]
foldl' (+) 0 [1,2,3]
foldl' (+) 1 [2,3]
foldl' (+) 3 [3]
foldl' (+) 6 []
6

Note that if we pick foldl here we don't get quite the right behavior: the final value is the same, but the accumulator builds up as a chain of unevaluated thunks. While foldl has the same associativity as foldl', it doesn't force the combining operation with seq like foldl' does.

sumWrong :: Num a => [a] -> a
sumWrong = foldl (+) 0

sumWrong [1,2,3]
foldl (+) 0 [1,2,3]
foldl (+) (0 + 1) [2,3]
foldl (+) ((0 + 1) + 2) [3]
foldl (+) (((0 + 1) + 2) + 3) []
(((0 + 1) + 2) + 3)
((1       + 2) + 3)
(3             + 3)
6
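To make the role of seq concrete, here is a hand-written strict left fold. This is a sketch for illustration only, not GHC's actual source for foldl'; the name foldlStrict is made up for this example.

```haskell
module Main where

-- A hand-written strict left fold. The name foldlStrict is hypothetical;
-- the library function to use in real code is Data.List.foldl'.
foldlStrict :: (b -> a -> b) -> b -> [a] -> b
foldlStrict _ acc []     = acc
foldlStrict f acc (x:xs) =
  let acc' = f acc x
  in acc' `seq` foldlStrict f acc' xs  -- force acc' before recursing

main :: IO ()
main = print (foldlStrict (+) 0 [1 .. 1000000 :: Integer])
```

Because each new accumulator is forced to weak head normal form before the recursive call, the chain ((0 + 1) + 2) + ... from the sumWrong trace never gets a chance to build up.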

What happens when we choose wrong?

We get extra, useless thunks (space leak) if we choose foldr or foldl when in the sweet spot of foldl', and we get extra, useless evaluation (time leak) if we choose foldl' when foldr would have been a better choice.

nonfoldl :: [a] -> [a]
nonfoldl = foldl snoc []
  where snoc acc x = acc ++ [x]  -- foldl (:) [] would not typecheck

head (nonfoldl [1,2,3])
head (foldl snoc []  [1,2,3])
head (foldl snoc [1]   [2,3])
head (foldl snoc [1,2]   [3])  -- nonfoldr finished here, O(1)
head (foldl snoc [1,2,3]  [])
head [1,2,3]
1                              -- this is O(n)

sumR :: Num a => [a] -> a
sumR = foldr (+) 0

sumR [1,2,3]
foldr (+) 0 [1,2,3]
1 + foldr (+) 0 [2, 3]      -- thunks begin
1 + (2 + foldr (+) 0 [3])
1 + (2 + (3 + foldr (+) 0 [])) -- O(n) thunks hanging about
1 + (2 + (3 + 0))
1 + (2 + 3)
1 + 5
6                           -- forced O(n) thunks

In languages with strict/eager evaluation, folding from the left can be done in constant space, while folding from the right requires linear space (over the number of elements of the list). Because of this, many people who first approach Haskell come over with this preconception.

But that rule of thumb doesn't work in Haskell, because of lazy evaluation. It's possible in Haskell to write constant space functions with foldr. Here is one example:

find :: (a -> Bool) -> [a] -> Maybe a
find p = foldr (\x next -> if p x then Just x else next) Nothing

Let's try hand-evaluating find even [1, 3, 4]:

-- The definition of foldr, for reference:
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)

find even (1:3:4:[])
    = foldr (\x next -> if even x then Just x else next) Nothing (1:3:4:[])
    = if even 1 then Just 1 else foldr (\x next -> if even x then Just x else next) Nothing (3:4:[])
    = foldr (\x next -> if even x then Just x else next) Nothing (3:4:[])
    = if even 3 then Just 3 else foldr (\x next -> if even x then Just x else next) Nothing (4:[])
    = foldr (\x next -> if even x then Just x else next) Nothing (4:[])
    = if even 4 then Just 4 else foldr (\x next -> if even x then Just x else next) Nothing []
    = Just 4

The size of the expressions in the intermediate steps has a constant upper bound, which means that this evaluation can be carried out in constant space.

Another reason why foldr in Haskell can run in constant space is because of the list fusion optimizations in GHC. GHC in many cases can optimize a foldr into a constant-space loop over a constant-space producer. It cannot generally do that for a left fold.

Nonetheless, left folds in Haskell can be written to use tail recursion, which can lead to performance benefits. The thing is that for this to actually succeed you need to be very careful about laziness: naive attempts at writing a tail recursive algorithm normally lead to linear-space execution, because of an accumulation of unevaluated expressions.
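As a rough illustration of that last point, compare a naive tail-recursive mean with one whose accumulators are forced at every step. This is a sketch under assumptions: the names meanLazy and meanStrict are made up here, and whether the lazy version actually leaks depends on the optimization level, since GHC's strictness analysis can sometimes rescue it.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Naive tail recursion: the accumulators s and n are never forced
-- inside the loop, so they can pile up as chains of thunks.
meanLazy :: [Double] -> Double
meanLazy = go 0 0
  where
    go :: Double -> Int -> [Double] -> Double
    go s n []     = s / fromIntegral n
    go s n (x:xs) = go (s + x) (n + 1) xs

-- Same loop with bang patterns: each accumulator is evaluated
-- before the next step, so the loop can run in constant space.
meanStrict :: [Double] -> Double
meanStrict = go 0 0
  where
    go :: Double -> Int -> [Double] -> Double
    go !s !n []     = s / fromIntegral n
    go !s !n (x:xs) = go (s + x) (n + 1) xs

main :: IO ()
main = do
  print (meanLazy   [1, 2, 3, 4])
  print (meanStrict [1, 2, 3, 4])
```

Both functions return the same value; the difference is only in how much memory the loop may hold on to while it runs.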

Takeaway lessons:

  1. When you're starting out in Haskell, try to use library functions from Prelude and Data.List as much as possible, because they've been carefully written to exploit performance features like list fusion.
  2. If you need to fold a list, try foldr first.
  3. Never use foldl; use foldl' (the version that avoids unevaluated expressions).
  4. If you want to use tail recursion in Haskell, first you need to understand how evaluation works, otherwise you may make things worse.

(Please read the comments on this post. Some interesting points were made and what I wrote here isn't completely true!)

It depends. foldl is usually faster since it's tail recursive, meaning (sort of) that all computation is done in-place and there's no call stack. For reference:

foldl f a [] = a
foldl f a (x:xs) = foldl f (f a x) xs

To run foldr we do need a call stack, since there is a "pending" computation for f.

foldr f a [] = a
foldr f a (x:xs) = f x (foldr f a xs)

On the other hand, foldr can short-circuit if f is not strict in its second argument (the folded rest of the list). It's lazier in a way. For example, if we define a new product

prod 0 x = 0
prod x 0 = 0
prod x y = x*y

Then

foldr prod 1 [0..n]

takes constant time in n, but

foldl prod 1 [0..n]

takes linear time. (This will not work using (*), since it does not check if any argument is 0. So we create a non-strict version. Thanks to Ingo and Daniel Lyons for pointing it out in the comments.)
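A quick way to see the difference is to give the right fold an infinite list. The sketch below reuses prod from above with a type signature added; main and the concrete bounds are assumptions made for the demo.

```haskell
module Main where

-- Product that returns as soon as it sees a zero on the left,
-- without looking at its second argument.
prod :: (Eq a, Num a) => a -> a -> a
prod 0 _ = 0
prod _ 0 = 0
prod x y = x * y

main :: IO ()
main = do
  -- Terminates on an infinite list: the leading 0 matches the first
  -- clause, so the rest of the fold is never demanded.
  print (foldr prod 1 [0 :: Integer ..])
  -- The left fold has to walk the whole (finite) list.
  print (foldl prod 1 [0 :: Integer .. 100000])
```

With foldl the same infinite list would never finish, because the fold must reach the end of the list before it can return anything.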
