where clauses in list comprehensions

What is the difference between the following two definitions?

cp [] = [[]]
cp (xs:xss) = [x:ys | x <- xs, ys <- cp xss]
----------------------------------------------
cp [] = [[]]
cp (xs:xss) = [x:ys | x <- xs, ys <- yss]
              where yss = cp xss

Sample output: cp [[1,2,3],[4,5]] => [[1,4],[1,5],[2,4],[2,5],[3,4],[3,5]]

According to Thinking Functionally with Haskell (p. 92), the second version is "a more efficient definition...[which] guarantees that cp xss is computed just once," though the author never explains why. I would have thought they were equivalent.

The two definitions are equivalent in the sense that they denote the same value, of course.

Operationally, they differ in their sharing behavior under call-by-need evaluation. jcast already explained why, but I want to add a shortcut that does not require explicitly desugaring the list comprehension. The rule is: any expression that is syntactically in a position where it could depend on a variable x will be recomputed each time x is bound to a value, even if the expression does not actually depend on x.

In your case, in the first definition, x is in scope at the position where cp xss appears, so cp xss will be re-evaluated for each element x of xs. In the second definition, cp xss appears outside the scope of x, so it will be computed just once.
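One way to see the rule in action is to wrap the recursive call in trace from Debug.Trace, which prints a message to stderr each time the wrapped expression is actually evaluated. The following is only a sketch: the names cpInner and cpOuter and the message text are made up for illustration, and the counts assume an unoptimized run, for example in GHCi.

```haskell
import Debug.Trace (trace)

-- First definition: the recursive call is in the scope of x,
-- so the rule above predicts one evaluation per element of xs.
cpInner :: [[a]] -> [[a]]
cpInner [] = [[]]
cpInner (xs:xss) =
  [x:ys | x <- xs, ys <- trace "evaluating cp xss" (cpInner xss)]

-- Second definition: the recursive call is bound outside the scope of x,
-- so the resulting list yss is shared across all elements of xs.
cpOuter :: [[a]] -> [[a]]
cpOuter [] = [[]]
cpOuter (xs:xss) = [x:ys | x <- xs, ys <- yss]
  where yss = trace "evaluating cp xss" (cpOuter xss)
```

For the input [[1,2,3],[4,5]], fully printing the result, the rule predicts the message 9 times for cpInner (3 at the outer level, plus 2 for each of those 3 recursive calls) and 2 times for cpOuter (once per level). Both return the same list. A build with optimization enabled may print a different number of messages.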

Then the usual disclaimers apply, namely:

  • The compiler is not required to adhere to the operational semantics of call-by-need evaluation, only the denotational semantics. So it might compute things fewer times (floating out) or more times (floating in) than you would expect based on the above rule.

  • It's not true in general that more sharing is better. In this case, for example, it's probably not better, because the size of cp xss grows as quickly as the amount of work it took to compute it in the first place. In this situation, the cost of reading the value back from memory can exceed that of recomputing it (due to the cache hierarchy and the GC).

Well, a naive desugaring would be:

cp [] = [[]]
cp (xs:xss) = concatMap (\x -> concatMap (\ ys -> [ x:ys ]) (cp xss)) xs
----------------------------------------------
cp [] = [[]]
cp (xs:xss) = let yss = cp xss in concatMap (\x -> concatMap (\ ys -> [ x:ys ]) yss) xs

As you can see, in the first version the call cp xss is inside a lambda. Unless the optimizer moves it, that means it will be re-evaluated each time the function \x -> concatMap (\ ys -> [ x:ys ]) (cp xss) is called. By floating it out, we avoid the recomputation.
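As a quick sanity check, the desugared form with the let binding can be compared against the list-comprehension form. This is a minimal sketch: the names cpComp and cpDesugared are chosen here for illustration, and the inner concatMap (\ys -> [x:ys]) yss is written as the equivalent map (x:) yss.

```haskell
-- List-comprehension form, with the recursive call bound in a where clause.
cpComp :: [[a]] -> [[a]]
cpComp [] = [[]]
cpComp (xs:xss) = [x:ys | x <- xs, ys <- yss]
  where yss = cpComp xss

-- Desugared form: yss is bound outside the lambda over x, so the
-- recursive result is computed once and reused for every x.
cpDesugared :: [[a]] -> [[a]]
cpDesugared [] = [[]]
cpDesugared (xs:xss) =
  let yss = cpDesugared xss
  in concatMap (\x -> map (x:) yss) xs
```

Loaded into GHCi, cpDesugared [[1,2,3],[4,5]] gives [[1,4],[1,5],[2,4],[2,5],[3,4],[3,5]], matching the sample output in the question.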

At the same time, GHC does have an optimization pass that floats expensive computations out of loops like this, so it may convert the first version to the second automatically. Your book says the second version "guarantees" that cp xss is computed only once because, if the expression is expensive to compute, compilers will generally be very hesitant to inline it (which would convert the second version back into the first).
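In GHC this float-out transformation is controlled by the -ffull-laziness flag, which is on when optimizing and can be switched off with -fno-full-laziness. The sketch below is one way to experiment with it, assuming a compiled (not interpreted) program: compile it once as written and once with -fno-full-laziness removed from the pragma, then compare how often the trace message appears on stderr. Whether the counts actually differ depends on the GHC version and on what the optimizer decides, so no particular outcome is claimed here.

```haskell
{-# OPTIONS_GHC -O1 -fno-full-laziness #-}

import Debug.Trace (trace)

-- First definition from the question, with a trace on the recursive call
-- so that each evaluation of cp xss is visible on stderr.
cp :: [[a]] -> [[a]]
cp [] = [[]]
cp (xs:xss) = [x:ys | x <- xs, ys <- trace "evaluating cp xss" (cp xss)]

main :: IO ()
main = print (length (cp [[1,2,3],[4,5],[6,7 :: Int]]))
```

The program prints 12 on stdout (3 * 2 * 2 combinations) regardless of the flag; only the number of trace messages is of interest.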
