Grouping data types by constructor in Haskell

Question

Given this data type

data Val = X Int | Y Bool | Z Double deriving (Eq, Show)

and a list such as

let vals = [X 1, Z 2.7, Y True, X 2, Z 3.14, Y True]

how to group elements in vals into this list,

[[X 1,X 2],[Y True,Y True],[Z 2.7, Z 3.14]]

Answer 1

To add to @RamonSnir's answer, the function for grouping a data type by constructors can be also constructed automatically using the "Scrap your boilerplate" framework:

{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data
import Data.Function (on)
import Data.List (groupBy, sort)

data Val = X Int | Y Bool | Z Double
  deriving (Eq, Ord, Show, Typeable, Data)

vals :: [Val]
vals = [X 1, Z 2.7, Y True, X 2, Z 3.14, Y True]

main :: IO ()
main = print $ groupBy (on (==) toConstr) $ sort vals

The two important parts are:

derive Typeable and Data , and
use toConstr to get the representation of the constructor used in a particular value.

Answer 2

I've the following:

data Val = X Int | Y Bool | Z Double deriving (Eq, Ord, Show)

vals :: [Val]
vals = [X 1, Z 2.7, Y True, X 2, Z 3.14, Y True]

valCtorEq :: Val -> Val -> Bool
valCtorEq (X _) (X _) = True
valCtorEq (Y _) (Y _) = True
valCtorEq (Z _) (Z _) = True
valCtorEq _ _ = False

And then:

*Main Data.List> groupBy valCtorEq $ sort vals
[[X 1,X 2],[Y True,Y True],[Z 2.7,Z 3.14]]

Answer 3

(This is probably extreme overkill, but it was a fun question to tinker around with!)

The provided answers suffer from 3 slight problems:

What if the type under consideration isn't in Ord (because for example, there's a function in there somewhere)?
Also, should this operation be O(n log n) in the length of the list?
Finally, the example provided isn't sufficient to determine whether the grouping should be stable, that is: should the result of grouping [X 2, X 1] be [X 1, X 2] (that's what you get if you use the sort -based solutions) or should the elements be kept in their original order?

So here is the most general solution I could come up with. It's stable, it doesn't need Ord (in fact you don't even need to touch the original datatype) and it runs in about O(n * min(n,W)) time where W is the word size of your machine (on mine, it's 64). That is, it's linear once the list gets longer than 64-ish elements (I say 'about', because the grouped elements still need to be reconstituted from the difference lists).

{-# LANGUAGE DeriveDataTypeable, StandaloneDeriving #-} 

import Data.Data
import qualified Data.IntMap as IM

groupByConstructor :: Data a => [a] -> [[a]]
groupByConstructor = map ($ []) . IM.elems . IM.fromListWith (flip (.)) 
    . map (\a -> (constrIndexOf a, (a:))) where constrIndexOf = constrIndex . toConstr

-- definition of Val as originally posed, without Ord:
data Val = X Int | Y Bool | Z Double deriving (Eq, Show)

deriving instance Typeable Val
deriving instance Data Val

-- new example:
vals = [X 2, Z 2.7, Y True, X 1, Z 3.14, Y False, Z 0.2]

and now groupByConstructor vals gives [[X 2, X 1],[Y True, Y False],[Z 2.7, Z 3.14, Z 0.2]] as I think it should.

It doesn't work for sorting lists of Int s, Char s, Float s, or non-representable types such as Ptr and Array . It could probably be made a bit more efficient by using an algorithm which uses the number of possible constructors to push the linear constant down further but IntMap will have to do for now :-)

Grouping data types by constructor in Haskell

Question

3 answers

solution1
13 2014-10-22 06:47:39

solution2
7 ACCPTED 2014-10-22 06:04:02

solution3
1 2014-10-22 19:36:22

Grouping data types by constructor in Haskell

Question

3 answers

solution1 13 2014-10-22 06:47:39

solution2 7 ACCPTED 2014-10-22 06:04:02

solution3 1 2014-10-22 19:36:22

solution1
13 2014-10-22 06:47:39

solution2
7 ACCPTED 2014-10-22 06:04:02

solution3
1 2014-10-22 19:36:22