Group list by equivalence relation

I have an equivalence relation R on a set A. How can I build the equivalence classes on A? It's something like what groupBy does, but comparing all elements with each other, not only neighbors.

For example, equal is an equivalence relation (a reflexive, symmetric and transitive binary relation):

type Sometuple = (Int, Int, Int)

equal :: Sometuple -> Sometuple -> Bool
equal (_, x, _) (_, y, _) = x == y

It is actually a predicate that connects two Sometuple elements.

λ> equal (1,2,3) (1,2,2)
True

So, how can I build all equivalence classes on [Sometuple] based on the equal predicate? Something like this:

equivalenceClasses :: (Sometuple -> Sometuple -> Bool) -> [Sometuple] -> [[Sometuple]]
λ> equivalenceClasses equal [(1,2,3), (2,1,4), (0,3,2), (9,2,1), (5,3,1), (1,3,1)]
[[(1,2,3),(9,2,1)],[(2,1,4)],[(0,3,2),(5,3,1),(1,3,1)]]

If you can define a compatible ordering relation, you can use

equivalenceClasses equal comp = groupBy equal . sortBy comp

which would give you O(n*log n) complexity. Without that, I don't see any way to get better complexity than O(n^2); basically:

splitOffFirstGroup :: (a -> a -> Bool) -> [a] -> ([a],[a])
splitOffFirstGroup equal xs@(x:_) = partition (equal x) xs
splitOffFirstGroup _     []       = ([],[])

equivalenceClasses _     [] = []
equivalenceClasses equal xs = let (fg,rst) = splitOffFirstGroup equal xs
                              in fg : equivalenceClasses equal rst
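
As a rough, self-contained sketch of the two approaches above applied to the data from the question: the names compareMiddle, classesSorted, classesQuadratic and sample are made up for this illustration, and the ordering on the middle component is an assumption about what a "compatible ordering" would be for equal.

```haskell
import Data.List (groupBy, partition, sortBy)
import Data.Ord (comparing)

type Sometuple = (Int, Int, Int)

equal :: Sometuple -> Sometuple -> Bool
equal (_, x, _) (_, y, _) = x == y

-- Hypothetical ordering that is compatible with `equal`:
-- two tuples compare as EQ exactly when `equal` holds.
compareMiddle :: Sometuple -> Sometuple -> Ordering
compareMiddle = comparing (\(_, x, _) -> x)

-- O(n log n): sorting makes equivalent elements adjacent, then groupBy works.
classesSorted :: [Sometuple] -> [[Sometuple]]
classesSorted = groupBy equal . sortBy compareMiddle

-- O(n^2): repeatedly split off the class of the first remaining element.
classesQuadratic :: (a -> a -> Bool) -> [a] -> [[a]]
classesQuadratic _ [] = []
classesQuadratic eq xs@(x : _) = cls : classesQuadratic eq rest
  where
    (cls, rest) = partition (eq x) xs

sample :: [Sometuple]
sample = [(1,2,3), (2,1,4), (0,3,2), (9,2,1), (5,3,1), (1,3,1)]
```

Here classesQuadratic equal sample keeps the classes in order of first appearance, while classesSorted sample returns them ordered by the middle component.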

The correct data structure to use here is a disjoint set (Tarjan). A purely functional, persistent implementation of such a structure was described by Conchon and Filliâtre. There's an implementation on Hackage.
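
To give a feel for the idea, here is a deliberately naive persistent disjoint-set sketch built on Data.Map. It is not the Conchon and Filliâtre structure and not the Hackage package: it has no path compression or union by rank, it assumes the elements have an Ord instance, and it assumes the relation is given as a list of related pairs. All names (UF, find, union, classesFromPairs) are made up for this illustration.

```haskell
import qualified Data.Map.Strict as M

-- Parent pointers; an element missing from the map is its own root.
type UF a = M.Map a a

-- Follow parent pointers until a root is reached.
find :: Ord a => UF a -> a -> a
find uf x = case M.lookup x uf of
  Just p | p /= x -> find uf p
  _               -> x

-- Merge the classes of x and y by pointing one root at the other.
union :: Ord a => a -> a -> UF a -> UF a
union x y uf
  | rx == ry  = uf
  | otherwise = M.insert rx ry uf
  where
    rx = find uf x
    ry = find uf y

-- Build the classes of xs under the equivalence generated by the pairs.
classesFromPairs :: Ord a => [a] -> [(a, a)] -> [[a]]
classesFromPairs xs pairs =
  M.elems (M.fromListWith (flip (++)) [(find uf x, [x]) | x <- xs])
  where
    uf = foldr (uncurry union) M.empty pairs
```

For instance, classesFromPairs [1..6] [(1,2),(3,4),(2,5)] puts 1, 2 and 5 into one class, 3 and 4 into another, and leaves 6 on its own.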

Here's a slight variation of Daniel's suggestion:

Since equivalence classes partition a set of values (meaning that every value belongs to exactly one class), you can use a value to represent its class. In many cases, however, it is quite natural to choose one canonical representative per class. In your example, you might go for (0,x,0) representing the class { (0,x,0), (0,x,1), (1,x,0), (1,x,1), (2,x,0), ... }. You can therefore define a representative function as follows:

representative :: Sometuple -> Sometuple
representative (_,x,_) = (0,x,0)

Now, by definition, equal a b is the same as (representative a) == (representative b). So if you sort a list of values by representative (assuming we're dealing with members of Ord), members of the same equivalence class end up next to each other and can be grouped by ordinary groupBy.

The function you were looking for thus becomes:

equivalenceClasses :: Ord a => (a -> a) -> [a] -> [[a]]
equivalenceClasses rep = groupBy ((==) `on` rep) . sortBy (compare `on` rep)

Daniel's suggestion is a generalisation of this approach. I'm essentially proposing a specific ordering relation (namely comparison by representative) that can be derived easily in many use cases.

Caveat 1: You need to make sure that representatives of the same/different equivalence classes are actually equal/different according to (==) and compare. If those two functions test for structural equality, this is always the case.

Caveat 2: Technically, you can relax the type of equivalenceClasses to

equivalenceClasses :: Ord b => (a -> b) -> [a] -> [[a]]
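
A self-contained sketch of the relaxed version, with the imports it needs (on comes from Data.Function). The helper middle and the list sample are made up for this illustration; with the relaxed type the key no longer has to be a Sometuple, so the middle component itself can serve as the representative.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortBy)

equivalenceClasses :: Ord b => (a -> b) -> [a] -> [[a]]
equivalenceClasses rep = groupBy ((==) `on` rep) . sortBy (compare `on` rep)

-- Hypothetical helper: the component that decides the class.
middle :: (Int, Int, Int) -> Int
middle (_, x, _) = x

sample :: [(Int, Int, Int)]
sample = [(1,2,3), (2,1,4), (0,3,2), (9,2,1), (5,3,1), (1,3,1)]
```

Since sortBy is stable, equivalenceClasses middle sample keeps the original relative order inside each class and returns the classes ordered by their key.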

Others have noted that the problem is hard to solve efficiently without some extra structure on the equivalence relation. If one recalls definitions from mathematics, an equivalence relation is equivalent to a quotient map (i.e. a function from your set to the equivalence classes). We can write a Haskell function which, given the quotient map (or rather something isomorphic to it) and some nice properties of its codomain, groups by the equivalence relation. We can also define equivalence based on quotient maps.

import Data.Map (elems, fromListWith)

group :: Ord b => (a -> b) -> [a] -> [[a]]
group q xs = elems $ fromListWith (++) [(q x, [x]) | x <- xs]

sameClass :: Eq b => (a -> b) -> (a -> a -> Bool)
sameClass q a b = q a == q b

-- for your question
equal = sameClass (\(_,x,_) -> x)
group (\(_,x,_) -> x) [...]
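
One detail worth knowing about the Data.Map version: fromListWith f combines values as f new old, so with (++) each class comes out in reverse input order. A small variant that restores input order might look like the sketch below; groupOn, middle and sample are names made up for this illustration.

```haskell
import qualified Data.Map as Map

-- Group by a key function. Each class is accumulated back to front
-- (cheap prepends), then reversed once so input order is preserved.
groupOn :: Ord b => (a -> b) -> [a] -> [[a]]
groupOn q xs =
  map reverse (Map.elems (Map.fromListWith (++) [(q x, [x]) | x <- xs]))

-- Hypothetical helper: the component that decides the class.
middle :: (Int, Int, Int) -> Int
middle (_, x, _) = x

sample :: [(Int, Int, Int)]
sample = [(1,2,3), (2,1,4), (0,3,2), (9,2,1), (5,3,1), (1,3,1)]
```

The classes come back in ascending key order, because elems walks the map in key order.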

The following solution performs a little faster than Daniel Fischer's on small data (lists shorter than about 2¹⁴ = 16384 elements). It works by adding elements to equivalence classes one by one, creating a new class if an element does not belong to any of the existing ones.

module Classify where

import qualified Data.List as List

classify :: Eq a => [a] -> [[a]]
classify = classifyBy (==)

classifyBy :: (a -> a -> Bool) -> [a] -> [[a]]
classifyBy eq = List.foldl' f [ ]
  where
    f [ ] y = [[y]]
    f (xs@(x: _): xss) y | x `eq` y  = (y: xs): xss
                         | otherwise = xs: f xss y

Turns out there is a similar function in GHC.Exts.

λ import GHC.Exts
λ groupWith snd [('a', 1), ('b', 2), ('c', 1)]
[[('a',1),('c',1)],[('b',2)]]

It requires you to define a function from your type to a type with a compatible Ord whose Eq coincides with your notion of equivalence (snd here). Categorically, you may see this function as an arrow onto the set of equivalence classes, also called a quotient map.

Thanks to Olaf for pointing this out.
