Haskell, foldr on infinite list

I have a question that came up while reading LYAH.

Here is how I think about foldr on an infinite list:

foldr (:) [] [1..] = 1:2:...:∞:[]

I thought GHCi would not know it is a list before evaluating ∞:[].

But GHCi does know.

So I thought it could recognize that foldr (:) [] [infinite list] = [infinite list itself].

Prelude> [1..10] == (take 10 $ foldr (:) [] [1..])
True
Prelude> [1..] == (foldr (:) [] [1..])
Interrupted.

However, it could not.

I want to know what actually happens when GHCi recognizes it is [1..] before evaluating ∞:[].

Is it just type inference prior to that evaluation?

"I want to know what actually happens when GHCi recognizes it is [1..] before evaluating ∞:[]."

GHCi does not recognize that it is [1..]; this is only a consequence of lazy evaluation.

foldr is implemented as:

foldr _ z [] = z
foldr f z (x:xs) = f x (foldr f z xs)

If you write something like foldr (:) [] [1..], Haskell does not evaluate it (directly); it only stores that you want to calculate it.

Now say you want to print that list, for instance with print (take 3 (foldr (:) [] [1..])). Then Haskell is forced to evaluate it, and it does so by calculating:

take 3 (foldr (:) [] [1..])
-> take 3 ((:) 1 (foldr (:) [] [2..]))
-> (:) 1 (take 2 (foldr (:) [] [2..]))
-> (:) 1 (take 2 ((:) 2 (foldr (:) [] [3..])))
-> (:) 1 ((:) 2 (take 1 (foldr (:) [] [3..])))
-> (:) 1 ((:) 2 (take 1 ((:) 3 (foldr (:) [] [4..]))))
-> (:) 1 ((:) 2 ((:) 3 (take 0 (foldr (:) [] [4..]))))
-> (:) 1 ((:) 2 ((:) 3 []))

So it derives [1, 2, 3], and due to Haskell's laziness, it is not interested in what foldr (:) [] [4..] is. Even if that list eventually stopped, it would simply not be evaluated.
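The reduction above can be reproduced with a small sketch. Here myFoldr is a hypothetical local copy of the definition quoted earlier (renamed only to avoid clashing with the Prelude); it is not meant to be GHC's actual source.

```haskell
-- Local copy of the textbook foldr definition (hypothetical name, chosen
-- to avoid shadowing Prelude.foldr).
myFoldr :: (a -> b -> b) -> b -> [a] -> b
myFoldr _ z []     = z
myFoldr f z (x:xs) = f x (myFoldr f z xs)

-- take demands only three cons cells of the infinite result,
-- so the rest of the fold is never evaluated.
firstThree :: [Integer]
firstThree = take 3 (myFoldr (:) [] [1..])
```

Loading this in GHCi and evaluating firstThree yields [1,2,3].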

If you calculate something like [1..] == foldr (:) [] [1..], then Haskell checks for list equality. List equality is defined as:

[] == [] = True
(x:xs) == (y:ys) = x == y && xs == ys
[] == (_:_) = False
(_:_) == [] = False

So Haskell is forced to unwind the list produced by the foldr on the right, and it will keep doing so until it finds items that are not equal or one of the lists reaches its end. But since the elements are equal every time and neither list ever ends, it will never finish. It evaluates the expression like this:

   (==) [1..] (foldr (:) [] [1..])
-> (==) ((:) 1 [2..])  ((:) 1 (foldr (:) [] [2..]))

It sees that both heads are equal, so it recursively calls:

-> (==) ((:) 1 [2..])  ((:) 1 (foldr (:) [] [2..]))
-> (==) [2..] (foldr (:) [] [2..])
-> (==) ((:) 2 [3..])  ((:) 2 (foldr (:) [] [3..]))
-> (==) [3..] (foldr (:) [] [3..])
-> ...

But as you can see, the evaluation will never stop. Haskell does not know that foldr (:) [] [1..] is equal to [1..]; it simply tries to evaluate it, and since equality forces it to evaluate the entire list, it gets stuck in an infinite loop.
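A minimal sketch of the two situations in which a comparison involving infinite lists does terminate: when only a bounded prefix is compared, and when the lists differ at some position. The helper prefixEq is a hypothetical name introduced for illustration, not a library function.

```haskell
-- Compare only the first n elements of two (possibly infinite) lists.
-- prefixEq is a hypothetical helper, not part of any library.
prefixEq :: Eq a => Int -> [a] -> [a] -> Bool
prefixEq n xs ys = take n xs == take n ys

-- Terminates: only the first 1000 elements of each list are demanded.
samePrefix :: Bool
samePrefix = prefixEq 1000 ([1 ..] :: [Integer]) (foldr (:) [] [1 ..])

-- Also terminates: (==) stops at the first mismatch (1 /= 2).
differs :: Bool
differs = ([1 ..] :: [Integer]) == [2 ..]
```

Here samePrefix is True and differs is False. Only the case where both lists are infinite and equal at every position runs forever.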

Yes, it would be possible to add a pattern to the compiler such that foldr (:) [] x is replaced with x, so in the future a Haskell compiler could perhaps return True for these. But this would not solve the problem fundamentally: if Haskell could derive such things for any function (here (:)), it would solve an undecidable problem, hence it is not possible.
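As a rough illustration of what "adding a pattern" can look like, GHC already lets a program declare its own rewrite rules with the RULES pragma. The sketch below uses copyList, a hypothetical stand-in for foldr (:) [], and the rule name is made up. Whether the rule fires depends on optimisation settings, so treat this as an assumption-laden example rather than a description of what GHC does to foldr itself.

```haskell
-- copyList is a hypothetical stand-in for `foldr (:) []`.
copyList :: [a] -> [a]
copyList = foldr (:) []
{-# NOINLINE copyList #-}  -- keep the call visible so the rule can match

-- A user-defined rewrite rule: with optimisation enabled, GHC may replace
-- `copyList xs` with `xs` at compile time. Without optimisation (for
-- example in GHCi) the rule is ignored and copyList runs normally.
{-# RULES "copyList/identity" forall xs. copyList xs = xs #-}
```

Note that even when such a rule fires, [1..] == copyList [1..] merely becomes [1..] == [1..], which still compares element by element and does not terminate.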

GHC (at least in theory) doesn't know the difference between a finite list and an infinite list. It can tell that a list is finite by calculating its length. If you try to find the length of an infinite list, you're going to have a bad time, as your program will never terminate.

This question is really about lazy evaluation. In a strict language like C or Python, you need to know the whole value of something at every step. If you want to add up the elements of a list, you already need to know what is in it and how many elements there are before you start.

All data in Haskell has one of the following forms:
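When the real question is "does this list have at least n elements?", the full length is not needed. A minimal sketch, where hasAtLeast is a hypothetical helper name:

```haskell
-- hasAtLeast is a hypothetical helper: it inspects at most n cons cells,
-- so it terminates even when the list is infinite.
hasAtLeast :: Int -> [a] -> Bool
hasAtLeast n xs = length (take n xs) == n

-- Safe on an infinite list, unlike `length [1..] >= 5`.
infiniteIsLongEnough :: Bool
infiniteIsLongEnough = hasAtLeast 5 ([1 ..] :: [Integer])
```

This works because take n stops after n elements, so length only ever walks a finite list.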

  1. A primitive, fully known thing like an int (not necessarily an integer)
  2. A data type constructor and its arguments, e.g. True, Left 7, (,) 5 'f' (which is the same as (5,'f')), or (:) 3 []

But in Haskell values come in two “shapes”:

  • known and fully evaluated (like in C or Python)
  • not yet evaluated: a “thunk”, a suspended computation that, when forced, returns the value in a more evaluated form

In Haskell there is a concept called weak head normal form (WHNF), in which:

  • anything primitive is fully evaluated
  • for anything with a constructor, the constructor is known and the arguments may not have been evaluated yet.
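The difference between "constructor known" and "fully evaluated" can be observed with seq, which evaluates its first argument to WHNF only. This is a small sketch; the names consCell, whnfIsEnough and spineOnly are made up for illustration.

```haskell
-- The outermost constructor (:) is known, so this value is in WHNF as soon
-- as it is forced, even though neither argument can ever be evaluated.
consCell :: [Int]
consCell = undefined : undefined

-- seq forces consCell to WHNF only, so the undefined parts are never
-- touched and the result is "ok".
whnfIsEnough :: String
whnfIsEnough = consCell `seq` "ok"

-- length walks the spine of the list but never forces the elements.
spineOnly :: Int
spineOnly = length [undefined, undefined, undefined :: Int]
```

Here whnfIsEnough evaluates to "ok" and spineOnly to 3, whereas printing consCell would raise an exception because that forces the elements.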

Let's look at the evaluation process for foldr (:) [] [1..]. First, the definition of foldr:

foldr f a [] = a
foldr f a (x:xs) = f x (foldr f a xs)

Now what is foldr (:) [] [1..]?

foldr (:) [] [1..]

It seems it's just a thunk. We don't know anything about it yet. So let's evaluate it to WHNF. First we need to convert the argument [1..] (which is actually enumFrom 1) to WHNF so we can pattern match on it:

foldr (:) [] (1:[2..])

And now we can evaluate foldr:

(:) 1 (foldr (:) [] [2..])
1 : (foldr (:) [] [2..])

Thus we have calculated the first element of the list without having to look at its whole infinite length. Similarly, we can work out the second element, and so on.

So what happens if we do [1..] == [1..]? Well, the definition of == for lists is (omitting three cases):
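The same mechanism works for any combining function that can produce a result without looking at its second argument, not just (:). A minimal sketch, with anyBigger as a hypothetical helper name:

```haskell
-- anyBigger is a hypothetical helper. (||) returns True as soon as its
-- first argument is True, so once a match is found the fold over the rest
-- of the list stays an unevaluated thunk.
anyBigger :: Integer -> [Integer] -> Bool
anyBigger limit = foldr (\x rest -> x > limit || rest) False

-- Terminates on the infinite list [1..] because 11 > 10.
found :: Bool
found = anyBigger 10 [1 ..]
```

If no element satisfies the test, the fold on an infinite list keeps going forever, for the same reason the equality check does.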

(x:xs) == (y:ys) = x == y && xs == ys

So trying to reduce to WHNF, we get:

[1..] == [1..]
(1 == 1) && ([2..] == [2..])
True && ([2..] == [2..])
[2..] == [2..]
... and so on

Thus we keep going forever and never reach a constructor that we can pattern match on (i.e. inspect).

Note that we can cancel out True && ... because the definition of && doesn't look at its second argument:

True && x = x
False && _ = False

If we defined && with a full four-way truth table, the program could run out of memory much faster (provided the compiler didn't do anything clever) than with the definition above, where instead you will just run out of patience (or a cosmic ray hits your RAM and makes your program return False).
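To make the contrast concrete, here is a sketch of both definitions side by side. The names andLazy and andStrict are hypothetical; andLazy has the same shape as the two-clause definition shown above.

```haskell
-- Same shape as the two-clause (&&): the second argument is returned
-- untouched, so in `x == y && rest` the recursive comparison is a tail call.
andLazy :: Bool -> Bool -> Bool
andLazy True  x = x
andLazy False _ = False

-- Full four-way truth table: both arguments are pattern matched, so the
-- second one must be evaluated before any result can be returned.
andStrict :: Bool -> Bool -> Bool
andStrict True  True  = True
andStrict True  False = False
andStrict False True  = False
andStrict False False = False

-- Returns False without ever touching the undefined argument.
lazyResult :: Bool
lazyResult = andLazy False undefined
```

By contrast, andStrict False undefined raises an exception, because the third clause has to inspect the second argument. In the list comparison, that extra inspection means each pending call has to wait for the rest of the comparison, which is where the additional memory use would come from.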
