简体   繁体   中英

Haskell groupBy function: How exactly does it work?

I experienced the following behavior:

ghci :m +Data.List

ghci> groupBy (\x y -> succ x == y) [1..6]
[[1,2], [3,4], [5,6]]
ghci> groupBy (\x y -> succ x /= y) [1..6]
[[1], [2], [3], [4], [5], [6]]

ghci :m +Data.Char   -- just a test to verify that nothing is broken with my ghc

ghci> groupBy (const isAlphaNum) "split this"
["split"," this"]

which surprised me, I thought, based on the example down below, that groupBy splits a list whenever the predicate evaluates to True for two successive elements supplied to a predicate. But in my second example above it splits the list on every element, but the predicate should evaluate to False . I framed my assumption of how it works in Haskell as well, just so everybody understands how I believed it to work:

groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy p l@(x:y:ys)
  | p x y     = (x:y:(head l)) ++ (tail l)       -- assuming l has a tail, unwilling to
  | otherwise = [x] ++ (y:(head l)) ++ (tail l)  -- to debug this right now, I guess you
groupBy _ [x] = [[x]]                            -- already got the hang of it ;)

Which brought me to the conclusion, that it works somewhat different. So my question is, how does that function actually work?

But in my second example above it splits the list on every element, but the predicate should evaluate to False .

In the second example it also evaluates every two consecutive elements. The function on which it works is const isAlphaNum . So that means that the type is:

const isAlphaNum :: b -> Char -> Bool

It thus calls the function with the beginning of the group and the element, but it takes only the second element into account .

So if we call it with: groupBy (const isAlphaNum) "split this" , it will evaluate:

succs   2nd    const isAlphaNum
-------------------------------
"sp"    'p'    True
"sl"    'l'    True
"si"    'i'    True
"st"    't'    True
"s "    ' '    False
" t"    't'    True
" h"    'h'    True
" i"    'i'    True
" s"    's'    True

Every time const isAlphaNum is True , it will append the character to the current sequence. So in case we evaluate "t " , const isAlphaNum , it will evaluate to False , groupBy will start a new group.

So here we thus construct two groups since there is only one False .

We can also obtain this result if we analyze the function source code :

 groupBy :: (a -> a -> Bool) -> [a] -> [[a]] groupBy _ [] = [] groupBy eq (x:xs) = (x:ys) : groupBy eq zs where (ys,zs) = span (eq x) xs 

So here groupBy will return the empty list if the given list is empty. In case it is not an empty list (x:xs) then we will construct a new sequence. The sequence starts with x , and contains furthermore all the elements of ys . ys is the first element of the 2-tuple constructed by span .

span :: (a -> Bool) -> [a] -> ([a],[a]) constructs a 2-tuple where the first element is the longest possible prefix of the list that satisfies the predicate, the predicate here is eq x so we keep adding elements to the group, as long as eq xy (with y an element) holds.

With the remaining part of the list ys , we construct a new group until the input is completely exhausted.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM