
How to compare contents in a list in Haskell, to see if they are partially the same?

This is very tricky for me.

Given a very long list,

[[100,11,1,0,40,1],[100,12,3,1,22,23],[101,11,1,0,45,1],[101,12,3,1,28,30],[102,11,1,0,50,1],[102,12,3,1,50,50],[103,11,1,0,50,1],[103,12,3,1,50,50],[104,11,1,0,50,25],[104,12,3,1,50,50],[105,11,1,0,50,49],[105,12,3,0,30,30],[106,11,1,0,50,49],[106,12,3,0,25,26],[107,11,1,1,33,20],[107,12,3,0,25,26],[108,11,1,1,2,1],[108,12,3,1,20,24],[109,11,1,1,2,1],[109,12,3,1,28,31],[110,11,1,0,40,1],[110,12,3,1,22,23]..]

Now ignore the first number of each list. If two lists are the same apart from the first number, for example

[101,11,1,0,50,1] alike [102,11,1,0,50,1]

we keep the latter list, and continue until the whole list has been checked. Ideally the result should look like:

[[102,11,1,0,50,1],[103,12,3,1,50,50]..]

My idea is to use map to take the first number away, then use nub and \\ to get rid of all repeated results, producing something like

[[11,1,0,50,1],[12,3,1,50,50],[12,3,1,50,50],[11,1,0,50,49],[12,3,0,25,26],[11,1,1,2,1],[11,1,0,40,1],[12,3,1,22,23],[11,1,0,45,1],[12,3,1,28,30]..]

and then use Set.isSubsetOf to filter the original list.

However, this idea is too complex and difficult to implement. Since I am a beginner in Haskell, is there a better way to solve this? Or could I use a recursive function instead (though that would still take some effort)?

From your description, I take it you want a list of the last lists to possess each of the unique tails. You can do it like this:

import Data.Function (on)
import Data.List (nubBy)

lastUniqueByTail :: Eq a => [[a]] -> [[a]]
lastUniqueByTail = reverse . nubBy ((==) `on` tail) . reverse

Note this only works for finite lists of non-empty lists: reverse never finishes on an infinite list, and tail throws an exception on an empty one.

You can find on in the Data.Function module, and nubBy in Data.List.

Here's an explanation of each part, working from the inside out:

  • ((==) `on` tail) This function compares two lists and treats them as equal if their tails are equal (i.e. it performs an equality check that ignores the first element).

    The on function is doing most of the magic here. It has the following type signature:

     on :: (b -> b -> c) -> (a -> b) -> a -> a -> c

    It is essentially defined as (f `on` g) x y = f (g x) (g y), so substituting the functions we provided, we get ((==) `on` tail) x y = (tail x) == (tail y).

  • Then, nubBy is like nub except that you can provide the comparator it uses to test equality. We give it the function defined above so that it discards elements with the same tail.

  • But, like nub, nubBy keeps the first element in each equivalence class it finds (i.e. if it finds two equal elements in the list, it always picks the first). We want the last such element, which is why we must reverse first (so that the last element of each class becomes the first, and so is kept).
  • Finally, we reverse at the end to restore the original order.
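To see the pipeline described above on concrete data, here is a minimal, self-contained sketch. The name lastUniqueByTail', the sample value, and the use of drop 1 in place of tail (so that an empty inner list does not raise an exception) are assumptions made for illustration and are not part of the original answer.

```haskell
import Data.Function (on)
import Data.List (nubBy)

-- Same reverse / nubBy / reverse pipeline, but comparing with `drop 1`
-- instead of `tail` (an assumption for this sketch) so that empty inner
-- lists are tolerated.
lastUniqueByTail' :: Eq a => [[a]] -> [[a]]
lastUniqueByTail' = reverse . nubBy ((==) `on` drop 1) . reverse

-- A shortened, hypothetical sample in the shape of the question's data.
sample :: [[Int]]
sample =
  [ [100,11,1,0,40,1]
  , [101,11,1,0,50,1]
  , [102,11,1,0,50,1]
  , [102,12,3,1,50,50]
  , [103,12,3,1,50,50]
  ]
```

Loaded into GHCi, lastUniqueByTail' sample should evaluate to [[100,11,1,0,40,1],[102,11,1,0,50,1],[103,12,3,1,50,50]]: the 101 and first 12-row are dropped because a later list shares their tail.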

If you need to compact a list by merging all items with the same tail, try this code.

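-- Keep the first list seen for each distinct tail, preserving order.
compactList :: [[Int]] -> [[Int]]
compactList list = reverse $ compactList' list []

-- Worker: res accumulates the kept lists in reverse order.
compactList' :: [[Int]] -> [[Int]] -> [[Int]]
compactList' [] res = res
compactList' (l:ls) res
    | inList l res = compactList' ls res
    | otherwise    = compactList' ls (l:res)

-- True if some list in the second argument has the same tail as the first.
inList :: [Int] -> [[Int]] -> Bool
inList [] _ = False
inList _ [] = False
inList val ([]:ls) = inList val ls   -- skip empty lists instead of failing to match
inList val@(_:xs) ((_:xs'):ls)
    | xs == xs' = True
    | otherwise = inList val ls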

If we need to "keep the latter list" (I missed that previously), then just move the reverse so that it is applied to the input rather than to the result:

compactList list = compactList' (reverse list) []
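The difference between keeping the first and keeping the last list of each group can also be sketched with folds. This is an alternative formulation, not the code from the answer; the names sameTail, keepFirst and keepLast are hypothetical, and drop 1 is used instead of pattern matching so empty inner lists are tolerated.

```haskell
-- True when both lists agree on everything after the first element.
sameTail :: Eq a => [a] -> [a] -> Bool
sameTail xs ys = drop 1 xs == drop 1 ys

-- Keep the first list of each group, preserving the input order.
-- The left fold builds the result reversed, hence the final reverse.
keepFirst :: Eq a => [[a]] -> [[a]]
keepFirst = reverse . foldl step []
  where
    step acc l
      | any (sameTail l) acc = acc
      | otherwise            = l : acc

-- Keep the last list of each group, preserving the input order.
-- The right fold visits later elements first, so no reverse is needed.
keepLast :: Eq a => [[a]] -> [[a]]
keepLast = foldr step []
  where
    step l acc
      | any (sameTail l) acc = acc
      | otherwise            = l : acc
```

For the input [[1,5,5],[2,5,5],[3,6,6]], keepFirst should give [[1,5,5],[3,6,6]] and keepLast should give [[2,5,5],[3,6,6]]. Both are quadratic in the length of the list, like the recursive version.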

You can just find the Euclidean distance between the two vectors. For example, if you have two lists [a0, a1, ..., an] and [b0, b1, ..., bn], then the square of the Euclidean distance between them would be

sum [ (a - b) ^ 2 | (a, b) <- zip as bs ]

This can give you an idea of how close one list is to another. Ideally you should take the square root, but for what you are doing, I don't know if that is necessary.
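As a rough sketch of that idea, the expression can be wrapped in two small functions. The names squaredDistance and distance are hypothetical; note that zip silently stops at the end of the shorter list, so lists of different lengths are compared only on their common prefix.

```haskell
-- Squared Euclidean distance between two lists of numbers.
-- `zip` truncates to the shorter list, so extra elements are ignored.
squaredDistance :: Num a => [a] -> [a] -> a
squaredDistance xs ys = sum [ (x - y) ^ 2 | (x, y) <- zip xs ys ]

-- Euclidean distance proper, taking the square root of the sum above.
distance :: [Double] -> [Double] -> Double
distance xs ys = sqrt (squaredDistance xs ys)
```

For example, squaredDistance [0,0] [3,4] is 25 and distance [0,0] [3,4] is 5.0.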

Edit:

Sorry, I misunderstood your question. According to your description, alike is defined as follows:

alike as bs = (tail as) == (tail bs)

If that is the case, then something like the following should do the trick:

xAll [] = []
xAll xa@(x:xaa) = a : xAll b
    where
        a = last $ filter (alike x) xa
        b = filter (\m -> not (alike x m)) xa

But I'm not sure if this is what you are looking for...
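For anyone who wants to try this, here is a self-contained sketch of the same idea with type signatures added. It assumes drop 1 instead of tail in alike so that empty lists do not raise an exception, and the groups value is a hypothetical sample modelled on the question's data.

```haskell
-- Two lists are alike when they agree on everything after the first element.
alike :: Eq a => [a] -> [a] -> Bool
alike xs ys = drop 1 xs == drop 1 ys

-- For each group of alike lists, keep the last member of the group.
-- Groups appear in the order in which their first member appears.
xAll :: Eq a => [[a]] -> [[a]]
xAll [] = []
xAll xa@(x:_) = a : xAll b
  where
    a = last (filter (alike x) xa)   -- never empty: x is alike to itself
    b = filter (not . alike x) xa    -- everything outside x's group

-- Hypothetical sample data for illustration.
groups :: [[Int]]
groups =
  [ [100,11,1,0,40,1]
  , [101,11,1,0,50,1]
  , [102,11,1,0,50,1]
  , [110,11,1,0,40,1]
  ]
```

Here xAll groups should evaluate to [[110,11,1,0,40,1],[102,11,1,0,50,1]]. The kept lists come out in the order their group first appears, not in the order of the kept lists themselves, so the output order can differ from a reverse-and-deduplicate approach.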
