Finding the most common element in a list of lists

I am given a list of lists like this:

pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)], 
         [(2, 2), (2, 1)], 
         [(1, 1), (1, 2), (2, 2), (2, 1)]]

and the desired output is: {(2, 2)}.

I need to find the most frequent element(s). It has to return more than one value if several elements are tied for the highest count.

I tried solving it with an intersection of the three lists, but it prints {(2, 1), (2, 2)} instead of {(2, 2)}, because the intersection ignores that the element (2, 2) occurs twice in the first list.
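A minimal sketch of why the intersection falls short, assuming the pairs list from above: converting each sublist to a set throws away how often a tuple occurs, so the result only tells you which tuples appear in every sublist, not which one is most frequent.

```python
pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1),
          (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
         [(2, 2), (2, 1)],
         [(1, 1), (1, 2), (2, 2), (2, 1)]]

# Intersecting the sublists as sets keeps the tuples present in every sublist.
common = set(pairs[0]).intersection(*pairs[1:])
print(sorted(common))           # [(2, 1), (2, 2)]

# The duplicate inside the first sublist is invisible to a set.
print(pairs[0].count((2, 2)))   # 2
```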

I saw a few examples with import collections, but I don't understand them, so I don't know how to adapt the code to my problem.

I also tried the following:

seen = set()
repeated = set()
for l in pairs:
    for i in set(l):
        if i in seen:
        if i in repeated:

but it still doesn't return the correct answer.

Alternative solution without using the Counter class from collections:

def get_freq_tuple(data):
    counts = {}
    max_count = 0
    for pairs in data:
        for pair in pairs:
            current_count = counts.get(pair, 0) + 1
            counts[pair] = current_count
            max_count = max(max_count, current_count)

    return [pair for pair in counts if counts[pair] == max_count]

if __name__ == "__main__":
    pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1),
              (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
             [(2, 2), (2, 1)],
             [(1, 1), (1, 2), (2, 2), (2, 1)]]
    # Print every tuple that reaches the highest count.
    print(get_freq_tuple(pairs))

Output:

[(2, 2)]


  • Count the occurrences of each tuple and store them in a dictionary. The key of the dictionary is the tuple and the value is its number of occurrences.
  • Filter the tuples in the dictionary, keeping those whose count equals the maximum.


  • Using the Counter class from collections is more concise and usually faster, since it handles the counting for you.


collections.Counter() will work. You just need to pass it all the pairs in the nested list, which you can do with a generator expression. For example:

from collections import Counter

pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)], 
         [(2, 2), (2, 1)], 
         [(1, 1), (1, 2), (2, 2), (2, 1)]]

counts = Counter(pair for l in pairs for pair in l)
counts.most_common(1)
# [((2, 2), 4)]

If you have a tie, you will need to look through the top choices and pick out the ones that share the highest count. You can get the list sorted by count, highest first, from counts.most_common().
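One way to sketch this without extra tools is to read the highest count from the first entry of counts.most_common() and keep every entry that matches it. The helper name top_entries is made up for illustration, and the third sublist below has an extra (3, 2) so that a tie actually occurs.

```python
from collections import Counter

def top_entries(nested):
    """Return every element that shares the highest count (hypothetical helper)."""
    counts = Counter(item for sub in nested for item in sub)
    if not counts:
        return []
    ranked = counts.most_common()   # (element, count) pairs, highest count first
    top_count = ranked[0][1]        # count of the leading entry
    return [item for item, n in ranked if n == top_count]

tied_pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1),
               (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
              [(2, 2), (2, 1)],
              [(1, 1), (1, 2), (2, 2), (2, 1), (3, 2)]]

print(top_entries(tied_pairs))   # contains both (2, 2) and (3, 2)
```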

itertools.groupby is a common way to deal with this. For example, if you had a tie, you could get all the top entries like:

from collections import Counter
from itertools import groupby

pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)], 
         [(2, 2), (2, 1)], 
         [(1, 1), (1, 2), (2, 2), (2, 1), (3, 2)]]

counts = Counter(pair for l in pairs for pair in l)

count, groups = next(groupby(counts.most_common(), key=lambda t: t[1]))
[g[0] for g in groups]
# [(2, 2), (3, 2)]

You can use collections.Counter:

import collections

pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
         [(2, 2), (2, 1)],
         [(1, 1), (1, 2), (2, 2), (2, 1)]]

most_common = collections.Counter(tup for sublst in pairs for tup in sublst).most_common(1)
print(most_common) # [((2, 2), 4)]
print(*(tup[0] for tup in most_common)) # only the tuples: (2, 2)
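Note that most_common(1) returns a single entry, so with a tie it would report only one of the tied tuples. A possible variant that returns a set, like the {(2, 2)} asked for in the question, is sketched below. The function name most_frequent is an assumption for illustration, and itertools.chain.from_iterable is used here only to flatten the sublists.

```python
from collections import Counter
from itertools import chain

def most_frequent(nested):
    """Return the set of elements with the highest count (hypothetical helper)."""
    counts = Counter(chain.from_iterable(nested))   # flatten, then count
    top = max(counts.values(), default=0)           # 0 when there is nothing to count
    return {item for item, n in counts.items() if n == top}

pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1),
          (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
         [(2, 2), (2, 1)],
         [(1, 1), (1, 2), (2, 2), (2, 1)]]

print(most_frequent(pairs))   # {(2, 2)}
```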
