How to find top 3 frequency elements from a list of tuples?

d=[[(u'BAKING', 51)], [(u'ACCESS', 4)],[(u'CUTE', 2)], [(u'RED', 3)],[(u'FINE', 59)], [(u'ACCESS', 49)],[(u'YOU', 97)], [(u'THANK', 41)]]

I have a list of tuples containing words and their corresponding frequencies. How can I find the three words with the highest frequency?

from collections import Counter

t=[]
for items in d:
    k=items[0]
    print len(k)
    for j in k:
        t.append(j)
print t
m=[t[i:i+2] for i in range(0, len(t), 2)]
print m
j=Counter(m)

This gives me an error: m is a list, but it should be a dictionary. How do I convert it into a dictionary?

You can use itemgetter and itertools.chain to get this task done:

from operator import itemgetter
from itertools import chain

sorted(chain.from_iterable(d), key=itemgetter(1), reverse=True)[:3]

This will give you:

[(u'YOU', 97), (u'FINE', 59), (u'BAKING', 51)]

Some explanation: chain.from_iterable flattens your list of lists, so that you end up with a plain list of tuples (which is easier to handle than the list of lists). This list is then sorted in descending order by the second element of each tuple, using itemgetter(1) as the key, and the slice selects the first three elements.
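The three steps described above can also be written out one at a time. This is only a sketch for illustration: the intermediate names flat, by_frequency and top3 are not part of the original answer, and print is called with parentheses so that the snippet runs on both Python 2 and Python 3.

```python
from itertools import chain
from operator import itemgetter

d = [[(u'BAKING', 51)], [(u'ACCESS', 4)], [(u'CUTE', 2)], [(u'RED', 3)],
     [(u'FINE', 59)], [(u'ACCESS', 49)], [(u'YOU', 97)], [(u'THANK', 41)]]

# Step 1: flatten the list of one-element lists into a list of tuples.
flat = list(chain.from_iterable(d))

# Step 2: sort by the frequency (index 1 of each tuple), largest first.
by_frequency = sorted(flat, key=itemgetter(1), reverse=True)

# Step 3: keep the first three entries.
top3 = by_frequency[:3]
print(top3)
```

The result holds the same three tuples as the one-liner: YOU, FINE and BAKING.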

EDIT:

Just read your comment about multiple entries. One way to do it is the following:

import collections
from operator import itemgetter
from itertools import chain

result_dict = collections.defaultdict(list)
newL = list(chain.from_iterable(d))
for tu in newL:
    result_dict[tu[0]].append(tu[1])

This will give you

defaultdict(<type 'list'>, {u'CUTE': [2], u'BAKING': [51], u'THANK': [41], u'ACCESS': [4, 49], u'YOU': [97], u'FINE': [59], u'RED': [3]})

Now you can sum the entries in each list like this:

res = {k: sum(v) for k,v in result_dict.iteritems()}

and get the top three items like this:

sorted(res.iteritems(), key=itemgetter(1), reverse=True)[0:3]

In this case it is:

[(u'YOU', 97), (u'FINE', 59), (u'ACCESS', 53)]
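Since the question started out with Counter, the same summing can probably be done with it directly. The following is a sketch of an alternative, not the method used in this answer: it adds each frequency to a running total per word and then asks Counter.most_common for the three largest totals. The names totals and top3 are chosen here for illustration.

```python
from collections import Counter
from itertools import chain

d = [[(u'BAKING', 51)], [(u'ACCESS', 4)], [(u'CUTE', 2)], [(u'RED', 3)],
     [(u'FINE', 59)], [(u'ACCESS', 49)], [(u'YOU', 97)], [(u'THANK', 41)]]

# Accumulate the frequency of every word; repeated words are summed.
totals = Counter()
for word, freq in chain.from_iterable(d):
    totals[word] += freq

# most_common(3) returns the three (word, total) pairs with the largest totals.
top3 = totals.most_common(3)
print(top3)
```

Here ACCESS ends up with 4 + 49 = 53, so the top three are YOU, FINE and ACCESS, as in the result above.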

I prefer:

sorted(d, key=lambda x: x[0][1], reverse=True)[:3]
