
Group list by equivalence relation

I have an equivalence relation R on a set A. How can I build the equivalence classes on A? It's something like what groupBy does, but comparing all the elements, not only neighbors.

For example, equal is an equivalence relation (a reflexive, symmetric and transitive binary relation):

type Sometuple = (Int, Int, Int)

equal :: Sometuple -> Sometuple -> Bool
equal (_, x, _) (_, y, _) = x == y

It is a predicate that relates two Sometuple elements.

λ> equal (1,2,3) (1,2,2)
True

So, how can I build all equivalence classes on [Sometuple] based on the equal predicate? Something like this:

equivalenceClasses :: (Sometuple -> Sometuple -> Bool) -> [Sometuple] -> [[Sometuple]]
λ> equivalenceClasses equal [(1,2,3), (2,1,4), (0,3,2), (9,2,1), (5,3,1), (1,3,1)]
[[(1,2,3),(9,2,1)],[(2,1,4)],[(0,3,2),(5,3,1),(1,3,1)]]

If you can define a compatible ordering relation, you can use

equivalenceClasses equal comp = groupBy equal . sortBy comp

which would give you O(n*log n) complexity. Without that, I don't see any way to get better complexity than O(n^2), basically

import Data.List (partition)

splitOffFirstGroup :: (a -> a -> Bool) -> [a] -> ([a],[a])
splitOffFirstGroup equal xs@(x:_) = partition (equal x) xs
splitOffFirstGroup _     []       = ([],[])

equivalenceClasses _     [] = []
equivalenceClasses equal xs = let (fg,rst) = splitOffFirstGroup equal xs
                              in fg : equivalenceClasses equal rst
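For the question's equal, one compatible ordering is comparison of the middle components. The following is a minimal sketch of the sort-then-group variant under that assumption; byMiddle, classesSorted and sample are names chosen here for illustration and do not come from the answer.

```haskell
import Data.List (groupBy, sortBy)
import Data.Ord (comparing)

type Sometuple = (Int, Int, Int)

equal :: Sometuple -> Sometuple -> Bool
equal (_, x, _) (_, y, _) = x == y

-- Assumed ordering compatible with `equal`: two tuples compare EQ
-- exactly when `equal` holds, because only the middle component is used.
byMiddle :: Sometuple -> Sometuple -> Ordering
byMiddle = comparing (\(_, x, _) -> x)

-- Sort so that related elements become neighbours, then group neighbours.
classesSorted :: (a -> a -> Bool) -> (a -> a -> Ordering) -> [a] -> [[a]]
classesSorted eq cmp = groupBy eq . sortBy cmp

-- The list from the question.
sample :: [Sometuple]
sample = [(1,2,3), (2,1,4), (0,3,2), (9,2,1), (5,3,1), (1,3,1)]
```

In GHCi, classesSorted equal byMiddle sample evaluates to [[(2,1,4)],[(1,2,3),(9,2,1)],[(0,3,2),(5,3,1),(1,3,1)]]. The classes come out ordered by the middle component, whereas the partition-based version lists them in order of first appearance.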

The correct data structure to use here is a disjoint set (Tarjan). A purely functional, persistent implementation of such a structure was described by Conchon and Filliâtre. There's an implementation on Hackage.
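To illustrate the disjoint-set idea, here is a deliberately naive sketch built on Data.Map parent pointers. It is not the Conchon and Filliâtre structure and not the Hackage package: it has no path compression and no union by rank, so a lookup can take linear time in the worst case. It also assumes the relation is given as a list of related pairs rather than as a predicate. All names (UF, findRoot, link, classesFromPairs) are hypothetical.

```haskell
import qualified Data.Map.Strict as M

-- Parent pointers; an element missing from the map is its own root.
type UF a = M.Map a a

-- Follow parent pointers until a root is reached.
findRoot :: Ord a => UF a -> a -> a
findRoot uf x = case M.lookup x uf of
  Just p | p /= x -> findRoot uf p
  _               -> x

-- Merge the classes of x and y by pointing one root at the other.
link :: Ord a => UF a -> (a, a) -> UF a
link uf (x, y)
  | rx == ry  = uf
  | otherwise = M.insert rx ry uf
  where
    rx = findRoot uf x
    ry = findRoot uf y

-- Classes of xs under the smallest equivalence relation containing `pairs`.
classesFromPairs :: Ord a => [a] -> [(a, a)] -> [[a]]
classesFromPairs xs pairs =
  M.elems (M.fromListWith (++) [(findRoot uf x, [x]) | x <- reverse xs])
  where
    uf = foldl link M.empty pairs
```

For example, classesFromPairs [1..6] [(1,2),(3,4),(2,5)] gives [[3,4],[1,2,5],[6]]: the pairs (1,2) and (2,5) put 1, 2 and 5 in one class by transitivity, and 6 stays alone.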

Here's a slight variation of Daniel's suggestion:

Since equivalence classes partition a set of values (meaning that every value belongs to exactly one class), you can use a value to represent its class. In many cases it is quite natural to choose one canonical representative per class. In your example, you might go for (0,x,0) as the representative of all tuples whose middle component is x; for instance, (0,0,0) represents the class { (0,0,0), (0,0,1), (1,0,0), (1,0,1), (2,0,0), ... }. You can therefore define a representative function as follows:

representative :: Sometuple -> Sometuple
representative (_,x,_) = (0,x,0)

Now, by definition, equal a b is the same as (representative a) == (representative b). So if you sort a list of values by representative (assuming we're dealing with members of Ord), members of the same equivalence class end up next to each other and can be grouped by ordinary groupBy.

The function you were looking for thus becomes:

equivalenceClasses :: Ord a => (a -> a) -> [a] -> [[a]]
equivalenceClasses rep = groupBy ((==) `on` rep) . sortBy (compare `on` rep)

Daniel's suggestion is a generalisation of this approach. I'm essentially proposing a specific ordering relation (namely comparison by representative) that can be derived easily in many use cases.

Caveat 1: You need to make sure that representatives of the same/different equivalence classes are actually equal/different according to (==) and compare . If those two functions test for structural equality, this is always the case.

Caveat 2: Technically, you can relax the type of equivalenceClasses to

equivalenceClasses :: Ord b => (a -> b) -> [a] -> [[a]]
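With the relaxed type, the representative no longer has to be a value of the original type; any ordered key will do. A small sketch under that reading follows, where middle is a name introduced here for the question's relation.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortBy)

type Sometuple = (Int, Int, Int)

-- Relaxed signature: only the key type b needs an Ord instance.
equivalenceClasses :: Ord b => (a -> b) -> [a] -> [[a]]
equivalenceClasses rep = groupBy ((==) `on` rep) . sortBy (compare `on` rep)

-- Hypothetical key for the question's relation: the middle component.
middle :: Sometuple -> Int
middle (_, x, _) = x

-- A second illustration: integers grouped by their remainder modulo 3.
residues :: [[Int]]
residues = equivalenceClasses (`mod` 3) [1 .. 10]
```

Here residues is [[3,6,9],[1,4,7,10],[2,5,8]]. Because sortBy is stable, elements keep their input order within each class.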

Others have noted that the problem is hard to do efficiently without some extra structure on the equivalence relation. If one recalls definitions from mathematics, an equivalence relation is equivalent to a quotient map (i.e. a function from your set to the equivalence classes). We can write a Haskell function which, given the quotient map (or rather something isomorphic to it) and some nice properties of its codomain, groups by the equivalence relation. We can also define equivalence based on quotient maps.

import Data.Map (elems, fromListWith)

group :: Ord b => (a -> b) -> [a] -> [[a]]
group q xs = elems $ fromListWith (++) [(q x, [x]) | x <- xs]

sameClass :: Eq b => (a -> b) -> (a -> a -> Bool)
sameClass q a b = q a == q b

-- for your question
equal = sameClass (\(_,x,_) -> x)
group (\(_,x,_) -> x) [...]
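One detail of the Map-based grouping: fromListWith (++) puts later elements in front of earlier ones, so each class comes out in reverse input order. If input order matters, one possible adjustment is to feed the list in reversed. This is a sketch only, and groupByKey is a hypothetical name.

```haskell
import qualified Data.Map.Strict as M

-- Group by a quotient map q, keeping input order inside each class.
-- Classes themselves are returned in ascending order of their key.
groupByKey :: Ord b => (a -> b) -> [a] -> [[a]]
groupByKey q xs =
  M.elems (M.fromListWith (++) [(q x, [x]) | x <- reverse xs])
```

Applied to the list from the question with the key \(_,x,_) -> x, this returns [[(2,1,4)],[(1,2,3),(9,2,1)],[(0,3,2),(5,3,1),(1,3,1)]]. Each (++) has a singleton on its left, so the whole pass stays at O(n log n).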

The following solution performs a little faster than Daniel Fischer's on small data (lists shorter than about 2¹⁴ = 16384 elements). It works by adding elements to equivalence classes one by one, creating a new class if an element does not belong to any of the existing ones.

module Classify where

import qualified Data.List as List

classify :: Eq a => [a] -> [[a]]
classify = classifyBy (==)

classifyBy :: (a -> a -> Bool) -> [a] -> [[a]]
classifyBy eq = List.foldl' f [ ]
  where
    f [ ] y = [[y]]
    f (xs@(x:_) : xss) y | x `eq` y  = (y : xs) : xss
                         | otherwise = xs : f xss y

Turns out there is a similar function in GHC.Exts .

λ import GHC.Exts
λ groupWith snd [('a', 1), ('b', 2), ('c', 1)]
[[('a',1),('c',1)],[('b',2)]]

It requires you to define a function from your type to a type with an Ord instance whose Eq coincides with your notion of equivalence (snd here). Categorically, you may see this function as an arrow onto the set of equivalence classes, also called a quotient map.

Thanks to Olaf for pointing this out.
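Applied to the tuples from the question, groupWith only needs the projection onto the middle component. A short sketch, where classes is a name chosen here:

```haskell
import GHC.Exts (groupWith)

type Sometuple = (Int, Int, Int)

-- Group tuples whose middle components are equal.
classes :: [Sometuple] -> [[Sometuple]]
classes = groupWith (\(_, x, _) -> x)
```

For the question's list this yields [[(2,1,4)],[(1,2,3),(9,2,1)],[(0,3,2),(5,3,1),(1,3,1)]], with classes ordered by the projected key.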
