
How to find the most common element in a list, and if there's a tie, the one whose last occurrence comes first?

Basically, given a list

events = [123,123,456,456,456,123]

I expect it to return 456, because the last occurrence of 456 (index 4) comes before the last occurrence of 123 (index 5).

I made lists of the counts and indices of the numbers in the initial list. I also made a dictionary in which each key is an element of events (the original list) and the value is the .count() of that key.

I don't really know where to go from here and could use some help.

Approach

Find the most frequently occurring items (Counter.most_common). Then, among those tied candidates, find the item whose last index is smallest (enumerate into a dictionary of last indexes, then take the min over the {index: key} pairs).

Code

Stealing liberally from @gnibbler and @Jeff:

from collections import Counter

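def most_frequent_first(events):
    frequencies = Counter(events)
    # Later occurrences overwrite earlier ones, so each event maps to its last index.
    indexes = {event: i for i, event in enumerate(events)}
    # Keep only the events tied for the highest count, keyed by their last index.
    top = max(frequencies.values())
    most_frequent_with_indexes = {indexes[key]: key for key, count in frequencies.items() if count == top}
    # The smallest last index wins; [1] picks the event out of the (index, event) pair.
    return min(most_frequent_with_indexes.items())[1]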

events = [123,123,456,456,456,123, 1, 2, 3, 2, 3]
print(most_frequent_first(events))

Result

>>> print(most_frequent_first(events))
456

Code

A more informative version also returns the frequency and the index, so you can see that the code is working correctly. Here is an implementation that uses a namedtuple (note that the field named first holds the index of the item's last occurrence):

from collections import Counter, namedtuple

frequent_first = namedtuple("frequent_first", ["frequent", "first"])

def most_frequent_first(events):
    frequencies = Counter(events)
    # Later occurrences overwrite earlier ones, so each event maps to its last index.
    indexes = {event: i for i, event in enumerate(events)}
    combined = {key: frequent_first(value, indexes[key]) for key, value in frequencies.items()}
    # Highest frequency first (negated), then the smallest last index.
    return min(combined.items(), key=lambda t: (-t[1].frequent, t[1].first))

events = [123,123,456,456,456,123, 1, 2, 3, 2, 3]
print(most_frequent_first(events))

Result

>>> print(most_frequent_first(events))
(456, frequent_first(frequent=3, first=4))

Use collections.Counter

>>> import collections

>>> events = [123,123,456,456,456,123]
>>> counts = collections.Counter(events)
>>> print(counts)
Counter({123: 3, 456: 3})
>>> mostCommon = counts.most_common()
>>> print(mostCommon)
[(123, 3), (456, 3)]

That's the hard part.
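
One way to finish from here, sketched under the assumption that a tie should go to the element whose last occurrence comes earliest (as the question asks): keep only the elements that share the top count, then compare their last positions. The helper name pick_earliest_last is made up for illustration and is not part of the answer above.

```python
from collections import Counter

def pick_earliest_last(events):
    # Hypothetical helper: resolves ties by the earliest last occurrence.
    counts = Counter(events)
    top = max(counts.values())  # highest frequency in the list
    tied = [item for item, n in counts.items() if n == top]

    # Index of an item's last occurrence, found by searching the reversed list.
    reversed_events = events[::-1]
    def last_index(item):
        return len(events) - 1 - reversed_events.index(item)

    return min(tied, key=last_index)

print(pick_earliest_last([123, 123, 456, 456, 456, 123]))  # 456
```

Filtering to the tied elements first means that a rare element near the start of the list can never win on position alone.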

>>> from collections import Counter
>>> events = [123,123,456,456,456,123]
>>> c = Counter(events)
>>> idxs = {k: v for v,k in enumerate(events)}
>>> sorted(c.items(), key=lambda kv: (-kv[1], idxs[kv[0]]))
[(456, 3), (123, 3)]
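
If only the winning element is needed, a full sort is not required. As a sketch for Python 3, min with the same (-count, last index) key returns it directly. The function name most_common_earliest_last is an assumption for illustration.

```python
from collections import Counter

def most_common_earliest_last(events):
    # Hypothetical wrapper around the same (-count, last index) sort key.
    c = Counter(events)
    # Later positions overwrite earlier ones, so this holds each element's last index.
    idxs = {k: i for i, k in enumerate(events)}
    return min(c, key=lambda k: (-c[k], idxs[k]))

events = [123, 123, 456, 456, 456, 123]
print(most_common_earliest_last(events))  # 456
```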
