
Left and Right Folding over an Infinite list

I have issues with the following passage from Learn You A Haskell (great book imo, not dissing it):

One big difference is that right folds work on infinite lists, whereas left ones don't! To put it plainly, if you take an infinite list at some point and you fold it up from the right, you'll eventually reach the beginning of the list. However, if you take an infinite list at a point and you try to fold it up from the left, you'll never reach an end!

I just don't get this. If you take an infinite list and try to fold it up from the right, then you'll have to start at the point at infinity, which just isn't happening (if anyone knows of a language where you can do this, do tell :p). At least, you'd have to start there according to Haskell's implementation, because in Haskell foldr and foldl don't take an argument that determines where in the list they should start folding.

I would agree with the quote iff foldr and foldl took arguments that determined where in the list they should start folding, because it makes sense that if you take an infinite list and start folding right from a defined index, it will eventually terminate, whereas it doesn't matter where you start with a left fold; you'll be folding towards infinity. However, foldr and foldl do not take this argument, and hence the quote makes no sense. In Haskell, both a left fold and a right fold over an infinite list will not terminate.

Is my understanding correct or am I missing something?

The key here is laziness. If the function you're using for folding the list is strict, then neither a left fold nor a right fold will terminate, given an infinite list.

Prelude> foldr (+) 0 [1..]
^CInterrupted.

However, if you try folding a less strict function, you can get a terminating result.

Prelude> foldr (\x y -> x) 0 [1..]
1

You can even get a result that is an infinite data structure, so while it does in a sense not terminate, it's still able to produce a result that can be consumed lazily.

Prelude> take 10 $ foldr (:) [] [1..]
[1,2,3,4,5,6,7,8,9,10]

However, this will not work with foldl, as you will never be able to evaluate the outermost function call, lazy or not.

Prelude> foldl (flip (:)) [] [1..]
^CInterrupted.
Prelude> foldl (\x y -> y) 0 [1..]
^CInterrupted.

Note that the key difference between a left and a right fold is not the order in which the list is traversed, which is always from left to right, but rather how the resulting function applications are nested.

  • With foldr, they are nested on "the inside"

     foldr f y (x:xs) = f x (foldr f y xs)

    Here, the first iteration will result in the outermost application of f. Thus, f has the opportunity to be lazy, so that either the second argument is not always evaluated, or f can produce some part of a data structure without forcing its second argument.

  • With foldl, they are nested on "the outside"

     foldl f y (x:xs) = foldl f (f y x) xs

    Here, we can't evaluate anything until we have reached the outermost application of f, which we will never reach in the case of an infinite list, regardless of whether f is strict or not.
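To make the two nesting shapes visible, here is a minimal sketch that renders each application as text instead of computing a value. The names showFoldr, showFoldl and foldrPrefix are made up for this illustration, and "f" and "y" are just placeholder labels for the folding function and the initial value.

```haskell
-- Render each application of a placeholder function "f" as text,
-- so the nesting produced by each fold becomes visible.
showFoldr :: [Int] -> String
showFoldr = foldr (\x acc -> "(f " ++ show x ++ " " ++ acc ++ ")") "y"

showFoldl :: [Int] -> String
showFoldl = foldl (\acc x -> "(f " ++ acc ++ " " ++ show x ++ ")") "y"

-- The right fold produces its outermost application first, so a prefix
-- of the rendered expression is available even for an infinite list.
foldrPrefix :: String
foldrPrefix = take 11 (showFoldr [1 ..])
```

On [1,2,3], showFoldr gives "(f 1 (f 2 (f 3 y)))" and showFoldl gives "(f (f (f y 1) 2) 3)". Asking for a prefix of showFoldl [1 ..] in the same way would not return, since its outermost application only exists after the whole list has been consumed.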

The key phrase is "at some point".

if you take an infinite list at some point and you fold it up from the right, you'll eventually reach the beginning of the list.

So you're right, you can't possibly start at the "last" element of an infinite list. But the author's point is this: suppose you could. Just pick a point waaay far out there (for engineers, this is "close enough" to infinity) and start folding leftwards. Eventually you end up at the start of the list. The same is not true of the left fold: if you pick a point waaaay out there (and call it "close enough" to the start of the list) and start folding rightwards, you still have an infinite way to go.

So the trick is, sometimes you don't need to go to infinity. You may not even need to go waaaay out there. But you may not know beforehand how far out you need to go, in which case infinite lists are quite handy.

The simple illustration is foldr (:) [] [1..]. Let's perform the fold.

Recall that foldr f z (x:xs) = f x (foldr f z xs). On an infinite list, it actually doesn't matter what z is, so I'm just keeping it as z instead of [], which clutters the illustration.

foldr (:) z (1:[2..])         ==> (:) 1 (foldr (:) z [2..])
1 : foldr (:) z (2:[3..])     ==> 1 : (:) 2 (foldr (:) z [3..])
1 : 2 : foldr (:) z (3:[4..]) ==> 1 : 2 : (:) 3 (foldr (:) z [4..])
1 : 2 : 3 : ( lazily evaluated thunk - foldr (:) z [4..] )

See how foldr, despite theoretically being a fold from the right, in this case actually cranks out individual elements of the resultant list starting at the left? So if you take 3 from this list, you can clearly see that it will be able to produce [1,2,3] and need not evaluate the fold any farther.
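The claim that z doesn't matter on an infinite list can be checked with a small sketch. The name firstThree is made up here, and undefined is used as the seed on purpose, since it would crash the program if it were ever forced.

```haskell
-- The seed is deliberately `undefined`: if the fold ever reached the end of
-- the list, forcing it would crash. On an infinite list it is never forced.
firstThree :: [Integer]
firstThree = take 3 (foldr (:) undefined [1 ..])
```

Evaluating firstThree yields [1,2,3]; only three cons cells are built, and the rest of the fold stays an unevaluated thunk.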

Remember that in Haskell you can use infinite lists because of lazy evaluation. So, head [1..] is just 1, and head $ map (+1) [1..] is 2, even though [1..] is infinitely long. If you don't get that, stop and play with it for a while. If you do get that, read on...

I think part of your confusion is that foldl and foldr always start at one side or the other, hence you don't need to give a length.

foldr has a very simple definition

 foldr _ z [] = z
 foldr f z (x:xs) = f x $ foldr f z xs

Why might this terminate on infinite lists? Well, try

 dumbFunc :: a -> b -> String
 dumbFunc _ _ = "always returns the same string"
 testFold = foldr dumbFunc "" [1..]

Here we pass into foldr a "" (since the value doesn't matter) and the infinite list of natural numbers. Does this terminate? Yes.

The reason it terminates is that Haskell's evaluation is equivalent to lazy term rewriting.

So

 testFold = foldr dumbFunc "" [1..]

becomes (to allow pattern matching)

 testFold = foldr dumbFunc "" (1:[2..])

which is the same as (from our definition of foldr)

 testFold = dumbFunc 1 $ foldr dumbFunc "" [2..]

Now, by the definition of dumbFunc, we can conclude

 testFold = "always returns the same string"

This is more interesting when we have functions that do something, but are sometimes lazy. For example

foldr (||) False 

is used to find if a list contains any True elements. We can use this to define the higher order function any, which returns True if and only if the passed in function is true for some element of the list

any :: (a -> Bool) -> [a] -> Bool
any f = (foldr (||) False) . (map f)

The nice thing about lazy evaluation is that this will stop when it encounters the first element e such that f e == True
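As a quick check of that short-circuiting, the same definition can be run on an infinite list. In this sketch the function is renamed anyViaFoldr (a made-up name) so that it doesn't clash with the Prelude's any, and foundEven and foundLarge are illustrative test values.

```haskell
-- Same definition as in the text, renamed to avoid clashing with Prelude's any.
anyViaFoldr :: (a -> Bool) -> [a] -> Bool
anyViaFoldr f = foldr (||) False . map f

-- (||) never looks at its second argument once the first is True,
-- so the fold stops at the first matching element.
foundEven :: Bool
foundEven = anyViaFoldr even [1 ..]

foundLarge :: Bool
foundLarge = anyViaFoldr (> 1000) [1 ..]
```

Both values evaluate to True. If no element of an infinite list satisfies the predicate, the fold has nothing to stop it and will run forever.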

On the other hand, this isn't true of foldl. Why? Well, a really simple foldl looks like

foldl f z []     = z                  
foldl f z (x:xs) = foldl f (f z x) xs

Now, what would have happened if we tried our example above

testFold' = foldl dumbFunc "" [1..]
testFold' = foldl dumbFunc "" (1:[2..])

this now becomes:

testFold' = foldl dumbFunc (dumbFunc "" 1) [2..]

so

testFold' = foldl dumbFunc (dumbFunc (dumbFunc "" 1) 2) [3..]
testFold' = foldl dumbFunc (dumbFunc (dumbFunc (dumbFunc "" 1) 2) 3) [4..]
testFold' = foldl dumbFunc (dumbFunc (dumbFunc (dumbFunc (dumbFunc "" 1) 2) 3) 4) [5..]

and so on and so on. We can never get anywhere, because Haskell always evaluates the outermost function first (that is lazy evaluation in a nutshell), and here the outermost function is always foldl itself.

One cool consequence of this is that you can implement foldl out of foldr but not vice versa (at least not in a way that still works on infinite lists). This means that in some profound way foldr is the most fundamental of all the higher order list functions, since it is the one we use to implement almost all the others. You still might want to use a foldl sometimes, because you can implement foldl tail recursively, and get some performance gain from that.
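One common way to write foldl in terms of foldr is to fold the list into a function that is still waiting for the accumulator. This is only a sketch: the names foldlViaFoldr, step and continue are made up here, and this is not how the Prelude necessarily defines foldl.

```haskell
-- foldl expressed through foldr: the right fold builds a function of the
-- accumulator, which is then applied to the initial value z.
foldlViaFoldr :: (b -> a -> b) -> b -> [a] -> b
foldlViaFoldr f z xs = foldr step id xs z
  where
    -- Each element updates the accumulator first, then hands it to the
    -- function built from the rest of the list.
    step x continue acc = continue (f acc x)
```

For example, foldlViaFoldr (-) 10 [1,2,3] gives 4, the same as foldl (-) 10 [1,2,3], and foldlViaFoldr (flip (:)) [] [1,2,3] gives [3,2,1]. Like foldl itself, it does not return a result on an infinite list.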

There is a good plain explanation on the Haskell wiki. It shows step-by-step reduction with different types of fold and accumulator functions.

Your understanding is correct. I wonder if the author is trying to talk about Haskell's lazy evaluation system (in which you can pass an infinite list to various functions not including fold, and it will only evaluate however much is needed to return the answer), but I agree with you that the author isn't doing a good job describing anything in that paragraph, and what it says is wrong.
