Haskell groupBy function: How exactly does it work?
I experienced the following behavior:
ghci> :m +Data.List
ghci> groupBy (\x y -> succ x == y) [1..6]
[[1,2], [3,4], [5,6]]
ghci> groupBy (\x y -> succ x /= y) [1..6]
[[1], [2], [3], [4], [5], [6]]
ghci> :m +Data.Char -- just a test to verify that nothing is broken with my ghc
ghci> groupBy (const isAlphaNum) "split this"
["split"," this"]
which surprised me. Based on the example below, I thought that groupBy splits a list whenever the predicate evaluates to True for two successive elements supplied to it. But in my second example above it splits the list on every element, even though the predicate should evaluate to False. I framed my assumption of how it works in Haskell as well, just so everybody understands how I believed it to work:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy p l@(x:y:ys)
  | p x y     = (x:y:(head l)) ++ (tail l)        -- assuming l has a tail, unwilling to
  | otherwise = [x] ++ (y:(head l)) ++ (tail l)   -- debug this right now, I guess you
groupBy _ [x] = [[x]]                             -- already got the hang of it ;)
Which brought me to the conclusion that it works somewhat differently. So my question is, how does that function actually work?
"But in my second example above it splits the list on every element, but the predicate should evaluate to False."
In the second example it also calls the predicate on two elements at a time. The predicate it works with is const isAlphaNum, which means that its type is:
const isAlphaNum :: b -> Char -> Bool
It thus calls the function with the beginning of the group and the element, but it takes only the second element into account.
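As a small sketch of that point, the snippet below applies const isAlphaNum to a few pairs of characters. The names ignoresFirst and examples are only illustrative and do not come from any library.

```haskell
import Data.Char (isAlphaNum)

-- const isAlphaNum takes two arguments, discards the first one,
-- and applies isAlphaNum to the second.
ignoresFirst :: b -> Char -> Bool
ignoresFirst = const isAlphaNum

-- The first argument has no influence on the result.
examples :: [Bool]
examples =
  [ ignoresFirst 's' 'p'   -- True:  'p' is alphanumeric
  , ignoresFirst ' ' 't'   -- True:  the space in first position is ignored
  , ignoresFirst 's' ' '   -- False: ' ' is not alphanumeric
  ]
```

So whatever groupBy passes as the first argument, only the second character decides whether the current group continues.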
So if we call it with groupBy (const isAlphaNum) "split this", it will evaluate:
pair    2nd    const isAlphaNum
-------------------------------
"sp"    'p'    True
"sl"    'l'    True
"si"    'i'    True
"st"    't'    True
"s "    ' '    False
" t"    't'    True
" h"    'h'    True
" i"    'i'    True
" s"    's'    True
Every time const isAlphaNum is True, it will append the character to the current sequence. So when we evaluate const isAlphaNum on the pair "s ", it yields False, and groupBy will start a new group.
So here we construct two groups, since there is only one False.
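The table can be reproduced with a short helper. This is only a sketch: tracePairs is a hypothetical function (it is not part of Data.List) that assumes the span-based traversal used by Data.List.groupBy and records each pair the predicate is applied to, together with the result.

```haskell
import Data.Char (isAlphaNum)

-- Hypothetical helper: lists every (group start, element, result) triple
-- that a groupBy-style traversal evaluates before it stops a group.
tracePairs :: (a -> a -> Bool) -> [a] -> [(a, a, Bool)]
tracePairs _  []     = []
tracePairs eq (x:xs) = accepted ++ rejected ++ tracePairs eq zs
  where
    (ys, zs) = span (eq x) xs
    -- every element that joined the group started by x
    accepted = [ (x, y, True) | y <- ys ]
    -- the first element that failed the test, if any; it starts the next group
    rejected = [ (x, z, False) | z <- take 1 zs ]

-- The nine rows of the table, in order.
table :: [(Char, Char, Bool)]
table = tracePairs (const isAlphaNum) "split this"
```

Evaluating table in ghci should list four True rows whose first component is 's', the single False row ('s',' ',False), and then four True rows whose first component is ' '.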
We can also obtain this result if we analyze the function source code:
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _  []     = []
groupBy eq (x:xs) = (x:ys) : groupBy eq zs
    where (ys,zs) = span (eq x) xs
So here groupBy will return the empty list if the given list is empty. If the list is not empty, it has the form (x:xs), and we construct a new sequence. The sequence starts with x and furthermore contains all the elements of ys. ys is the first element of the 2-tuple constructed by span.
span :: (a -> Bool) -> [a] -> ([a],[a]) constructs a 2-tuple where the first element is the longest prefix of the list whose elements all satisfy the predicate, and the second element is the rest of the list. The predicate here is eq x, so we keep adding elements to the group as long as eq x y (with y an element of the list) holds.
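A few small calls illustrate how span behaves. This is just a sketch with made-up inputs; the names spanExamples and firstGroupTail are illustrative only.

```haskell
-- span splits a list at the first element that fails the predicate.
spanExamples :: [([Int], [Int])]
spanExamples =
  [ span (< 3) [1, 2, 3, 1, 2]   -- ([1,2],[3,1,2])
  , span (< 0) [1, 2, 3]         -- ([],[1,2,3])
  , span (< 9) [1, 2, 3]         -- ([1,2,3],[])
  ]

-- With eq = \x y -> succ x == y and x = 1, the predicate eq x only accepts 2,
-- so the first group of [1..6] becomes [1,2].
firstGroupTail :: ([Int], [Int])
firstGroupTail = span (\y -> succ 1 == y) [2 .. 6]   -- ([2],[3,4,5,6])
```

Note that in the first call the later 1 and 2 stay in the second component: span stops at the first failing element and does not look further.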
With the remaining part of the list, zs, we construct the next group in the same way, until the input is completely exhausted.
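To see why this matters for the examples in the question, here is a sketch that puts the definition above next to a variant comparing each element with its immediate predecessor. groupByFirst is the same definition under a different name (to avoid clashing with Data.List.groupBy); groupByAdjacent is a hypothetical function written for this comparison and is not part of Data.List.

```haskell
-- Same definition as the source shown above: every element is compared
-- with the FIRST element of the current group.
groupByFirst :: (a -> a -> Bool) -> [a] -> [[a]]
groupByFirst _  []     = []
groupByFirst eq (x:xs) = (x : ys) : groupByFirst eq zs
  where (ys, zs) = span (eq x) xs

-- Hypothetical alternative: every element is compared with its
-- immediate predecessor instead.
groupByAdjacent :: (a -> a -> Bool) -> [a] -> [[a]]
groupByAdjacent _  []     = []
groupByAdjacent eq (x:xs) = go [x] x xs
  where
    -- acc holds the current group in reverse, prev is its last element
    go acc _    []     = [reverse acc]
    go acc prev (y:ys)
      | eq prev y = go (y : acc) y ys
      | otherwise = reverse acc : go [y] y ys

firstResult, adjacentResult :: [[Int]]
firstResult    = groupByFirst    (\x y -> succ x == y) [1 .. 6]  -- [[1,2],[3,4],[5,6]]
adjacentResult = groupByAdjacent (\x y -> succ x == y) [1 .. 6]  -- [[1,2,3,4,5,6]]
```

With succ x == y, the group starting at 1 accepts 2 but rejects 3, because 3 is compared with 1 and not with 2. With succ x /= y, the predicate is False for each element tested against its group start, so every element ends up in its own group, which matches the ghci output in the question.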