
Sort Python dictionaries with 'value' as primary key and 'key' as secondary

What I am trying to do here is display the characters of a string ordered by number of occurrences, in descending order. If two characters have the same number of occurrences, they should be displayed in alphabetical order.

So given a string, 'abaddbccdd', what I want to display as output is: ['d', 'a', 'b', 'c']

Here is what I have done so far:

>>> from collections import Counter
>>> s = 'abaddbccdd'
>>> b = Counter(s)
>>> b
Counter({'d': 4, 'a': 2, 'c': 2, 'b': 2})
>>> b.keys()
['a', 'c', 'b', 'd']
>>> c = sorted(b, key=b.get, reverse=True)
>>> c
['d', 'a', 'c', 'b']
>>>

But how do I handle the second part? 'a', 'b' and 'c' each appear in the text exactly twice, and they come out in the wrong order. What is the best way (hopefully the shortest too) to do this?

The shortest way is:

>>> sorted(sorted(b), key=b.get, reverse=True)
['d', 'a', 'b', 'c']

This sorts the keys once in their natural (alphabetical) order, then sorts them again by value in descending order.

Note that this is not the fastest approach if the dictionary is large, since it performs two full sorts, but in practice it is probably the simplest because you want the values descending and the keys ascending.

The reason it works is that Python guarantees the sort to be stable. That means that when two elements have equal sort keys, their original relative order is preserved, so if you sort repeatedly from the last key back to the first you will get the desired result. Also, reverse=True is different from simply reversing the output: it respects stability too, so only elements with different keys change places, and elements with equal keys keep their original relative order.
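To make the stability argument concrete, here is a small sketch that runs the two passes separately and compares reverse=True with reversing the output afterwards. The variable names (counts, by_char, result, flipped) are only for illustration.

```python
from collections import Counter

counts = Counter('abaddbccdd')

# Pass 1: sort by the secondary key (the character itself), ascending.
by_char = sorted(counts)            # ['a', 'b', 'c', 'd']

# Pass 2: stable sort by the primary key (the count), descending.
# Ties among 'a', 'b' and 'c' keep their order from pass 1.
result = sorted(by_char, key=counts.get, reverse=True)
print(result)                       # ['d', 'a', 'b', 'c']

# Reversing an ascending sort afterwards is not equivalent:
# it also reverses the order of the tied characters.
flipped = sorted(by_char, key=counts.get)[::-1]
print(flipped)                      # ['d', 'c', 'b', 'a']
```

The second result shows why reverse=True is needed here rather than slicing with [::-1].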

This can be done in a single sorting pass. The trick is to do an ascending sort with the count numbers negated as the primary sorting key and the dictionary's key strings as the secondary sorting key.

b = {'d': 4, 'a': 2, 'c': 2, 'b': 2}
c = sorted(b, key=lambda k: (-b[k], k))
print(c)

Output:

['d', 'a', 'b', 'c']
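If you need this for arbitrary strings, the single-pass sort can be wrapped in a small helper. This is only a sketch: the function name chars_by_frequency is made up for this example, and negating the key works here only because the counts are numbers.

```python
from collections import Counter

def chars_by_frequency(text):
    """Return the distinct characters of text, most frequent first.

    Ties are broken alphabetically. The function name is illustrative only.
    """
    counts = Counter(text)
    # Negating the count turns "descending by count" into an ascending sort,
    # so the character itself can act as an ascending tie-breaker.
    return sorted(counts, key=lambda ch: (-counts[ch], ch))

print(chars_by_frequency('abaddbccdd'))   # ['d', 'a', 'b', 'c']
print(chars_by_frequency('mississippi'))  # ['i', 's', 'p', 'm']
```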

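If you are already using a Counter object, there is the Counter.most_common method. It returns a list of (element, count) pairs ordered from highest to lowest frequency. Be aware that it does not break ties alphabetically: in Python 3.7 and later, elements with equal counts are returned in the order they were first encountered. The output here is alphabetical only because 'a', 'b' and 'c' happen to first appear in that order in the string.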

>>> b.most_common()
[('d', 4), ('a', 2), ('b', 2), ('c', 2)]
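As a quick check of how most_common orders ties, the sketch below uses a string whose characters do not first appear in alphabetical order, then re-sorts the (character, count) pairs to get the alphabetical tie-break. It assumes Python 3.7 or later; the string 'cabcab' and the variable names are just for illustration.

```python
from collections import Counter

counts = Counter('cabcab')  # 'c', 'a' and 'b' each occur twice

# On Python 3.7+ ties come back in first-encountered order, not alphabetical.
print(counts.most_common())
# [('c', 2), ('a', 2), ('b', 2)]

# Re-sort the (char, count) pairs: count descending, then char ascending.
ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
print(ordered)
# [('a', 2), ('b', 2), ('c', 2)]
```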

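You can use a lambda function. Because the whole sort is reversed, negating ord(char) makes tied characters come out in ascending alphabetical order. Note that this only works when the keys are single characters:

>>> sorted(b, key=lambda char: (b.get(char), -ord(char)), reverse=True)
['d', 'a', 'b', 'c']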
