Finding the most common element in a list of lists

I am given a list of lists like this:

pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)], 
         [(2, 2), (2, 1)], 
         [(1, 1), (1, 2), (2, 2), (2, 1)]]

and the desired output is {(2, 2)}.

I need to find the most frequent element(s). If several elements are tied for the highest count, all of them should be returned.

I tried solving it with an intersection of the three lists, but that prints {(2, 1), (2, 2)} instead of {(2, 2)}, because an intersection ignores the fact that (2, 2) occurs twice in the first list.
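The intersection code itself is not shown in the question. A minimal sketch of what such an attempt might look like (an assumption, not the original code) makes the problem visible: converting each list to a set discards how often a pair occurs, so only membership in all three lists is tested.

```python
pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1),
          (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
         [(2, 2), (2, 1)],
         [(1, 1), (1, 2), (2, 2), (2, 1)]]

# Assumed reconstruction of the attempt: keep the pairs that appear in every list.
# Turning a list into a set drops duplicates, so counts are lost here.
common = set(pairs[0]).intersection(*pairs[1:])

print(sorted(common))  # [(2, 1), (2, 2)]
```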

I saw a few examples that use import collections, but I don't understand them, so I don't know how to adapt them to my problem.

I also tried the following:

seen = set()
repeated = set()
for l in pairs:
    for i in set(l):
        if i in seen:
            repeated.add(i)
        if i in repeated:
            repeated.add(i)
        else:
            seen.add(i)

but it still doesn't return the correct answer.

Alternative solution without using the Counter class from collections:

def get_freq_tuple(data):
    counts = {}
    max_count = 0
    for pairs in data:
        for pair in pairs:
            current_count = counts.get(pair, 0) + 1
            counts[pair] = current_count
            max_count = max(max_count, current_count)

    return [pair for pair in counts if counts[pair] == max_count]

if __name__ == "__main__":
    pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1),
              (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
             [(2, 2), (2, 1)],
             [(1, 1), (1, 2), (2, 2), (2, 1)]]
    print(get_freq_tuple(pairs))

Output:

[(2, 2)]

Explanation:

  • Count the occurrences of each tuple and store them in a dictionary. Each key is a tuple and its value is the number of times that tuple occurs. The highest count seen so far is tracked in max_count.
  • Keep only the tuples in the dictionary whose count equals max_count.

Disclaimer:

  • Using the Counter class from collections is more concise and, in CPython, usually faster, since its counting loop is implemented in C.
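One rough way to check the efficiency claim on your own machine is to time both approaches with timeit. This is only a sketch: the helper names and the synthetic data are assumptions made for illustration, and the timings will vary with the Python version and the input.

```python
from collections import Counter
from timeit import timeit

def count_with_dict(data):
    """Count items of a nested list with a plain dictionary."""
    counts = {}
    for sub in data:
        for item in sub:
            counts[item] = counts.get(item, 0) + 1
    return counts

def count_with_counter(data):
    """Count items of a nested list with collections.Counter."""
    return Counter(item for sub in data for item in sub)

# Synthetic nested list of tuples (hypothetical data, only for timing).
data = [[(i % 7, j % 5) for j in range(200)] for i in range(50)]

# Both approaches must produce the same counts before timing means anything.
assert count_with_dict(data) == dict(count_with_counter(data))

t_dict = timeit(lambda: count_with_dict(data), number=20)
t_counter = timeit(lambda: count_with_counter(data), number=20)
print(f"dict:    {t_dict:.4f} s")
print(f"Counter: {t_counter:.4f} s")
```

The difference is usually small for inputs of this size, so readability is often the stronger reason to prefer Counter.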

collections.Counter will work. You just need to pass it all the pairs in the nested list, which you can do with a generator expression. For example:

from collections import Counter

pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)], 
         [(2, 2), (2, 1)], 
         [(1, 1), (1, 2), (2, 2), (2, 1)]]

counts = Counter(pair for l in pairs for pair in l)
counts.most_common(1)
# [((2, 2), 4)]

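Note that most_common(1) returns a single entry even when several pairs are tied. If you have a tie, you will need to go through the top entries and pick the ones that share the highest count. Calling counts.most_common() with no argument returns all entries sorted from most to least common.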

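itertools.groupby is a common way to deal with this. Because most_common() is already sorted by count, the first group it yields contains exactly the entries with the highest count. For example, if you had a tie, you could get all the top entries like this: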

from collections import Counter
from itertools import groupby

pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)], 
         [(2, 2), (2, 1)], 
         [(1, 1), (1, 2), (2, 2), (2, 1), (3, 2)]]

counts = Counter(pair for l in pairs for pair in l)

count, groups = next(groupby(counts.most_common(), key=lambda t: t[1]))
[g[0] for g in groups]
# [(2, 2), (3, 2)]
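If a set is wanted, as in the question, a variant without groupby is to take the highest count and keep every pair that reaches it. The following is a minimal sketch; the function name most_frequent is made up for illustration, and returning an empty set for empty input is an assumption about the desired behaviour.

```python
from collections import Counter

def most_frequent(nested):
    """Return the set of items tied for the highest total count."""
    counts = Counter(item for sub in nested for item in sub)
    if not counts:  # empty input: nothing to report (assumed behaviour)
        return set()
    top = max(counts.values())
    return {item for item, n in counts.items() if n == top}

pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1),
          (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
         [(2, 2), (2, 1)],
         [(1, 1), (1, 2), (2, 2), (2, 1)]]

print(most_frequent(pairs))  # {(2, 2)}
```

With the tied input from the groupby example (the third list extended by (3, 2)), the same function returns both (2, 2) and (3, 2).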

You can use collections.Counter:

import collections

pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
         [(2, 2), (2, 1)],
         [(1, 1), (1, 2), (2, 2), (2, 1)]]

most_common = collections.Counter(tup for sublst in pairs for tup in sublst).most_common(1)
print(most_common) # [((2, 2), 4)]
print(*(tup[0] for tup in most_common)) # only the tuples: (2, 2)
