
sorted() on a dict returns an order that doesn't make sense (Python)

I have been debugging a program, and the problem comes down to the first line of the code below. I have a dictionary with string keys and integer values. I am trying to sort it in ascending order so that I can get the smallest element. However, the order of the returned values doesn't make sense. It does not return the smallest first; in fact it returns the largest (even though the documentation clearly says the result should be in ascending order). It is not in alphabetical order either: the first string is AAAAA..AAA, but the rest don't follow alphabetical order. I am not even sure what type sorted returns or how to access its elements. According to the errors I have received while experimenting, it is a "list". How can I solve this problem? This one line is making the whole computation wrong.

    kmerMin = (sorted(topkmerdict,key=lambda x: x[1]))
    print kmerMin
    print kmerMin[0]
    print topkmerdict[kmerMin[0]]

What you wrote returns a list of the dict's keys, sorted by the 2nd character of each key (x[1] picks out the 2nd character of key x).
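To see why, note that iterating over a dict yields only its keys, so the key function receives a key string and x[1] indexes into that string. A minimal sketch with made-up data (the dict `d` and its contents are hypothetical, not the asker's):

```python
# Hypothetical data for illustration only.
d = {"xc": 1, "ya": 2, "zb": 3}

# sorted(d, ...) iterates over the keys, so x is a string such as "xc"
# and x[1] is its second character ("c"). The values are never consulted.
by_second_char = sorted(d, key=lambda x: x[1])
print(by_second_char)  # ['ya', 'zb', 'xc']

# The associated values come out as 2, 3, 1: not ascending.
print([d[k] for k in by_second_char])  # [2, 3, 1]
```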

It's not clear what you want instead. If, e.g., you want the keys sorted by their associated values, then here's one way to do it:

>>> d = {"a": 12, "b": 6, "c": 3}
>>> sorted(d, key=lambda k: d[k])
['c', 'b', 'a']

Another way to do exactly the same thing, faster but perhaps less obvious:

>>> sorted(d, key=d.__getitem__)
['c', 'b', 'a']
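If the x[1] in the question was meant to pick out the value of a (key, value) pair, that only works when sorting the dict's items rather than the dict itself. A sketch under that assumption, reusing the same small dict d:

```python
d = {"a": 12, "b": 6, "c": 3}

# d.items() yields (key, value) pairs, so kv[1] is the value here.
pairs = sorted(d.items(), key=lambda kv: kv[1])
print(pairs)  # [('c', 3), ('b', 6), ('a', 12)]

# The first pair holds the key with the smallest value, and that value.
smallest_key, smallest_value = pairs[0]
print(smallest_key)    # c
print(smallest_value)  # 3
```

This returns the values alongside the keys, so no second lookup in the dict is needed.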

I have a dictionary of keys (string) and values (integer). I am trying to sort these in place in ascending order so that I can get the smallest element.

A dict can't be sorted in place, but you can make a list of its keys sorted by value by using __getitem__ as the key function:

>>> topkmerdict = {
       'ATCCCAGCACTTTGGGAGGCCGAGGCAGGT': 6,
       'CTGTAATCCCAGCACTTTGGGAGGCCGAGG': 71,
       'AGCACTTTGGGAGGCCGAGGCAGGTGGATC': 8,
       'GGTGGCTCACGCCTGTAATCCCAGCACTTT': 53,
       'TGTTTGAGTTCATTGTAGATTCTGGATATT': 8,
       'CGGTGGCTCACGCCTGTAATCCCAGCACTT': 40,
       'GTAATCCCAGCACTTTGGGAGGCCGAGGCA': 27,
}
>>> kmerMin = sorted(topkmerdict, key=topkmerdict.__getitem__)
>>> kmerMin[0]
'ATCCCAGCACTTTGGGAGGCCGAGGCAGGT'
>>> topkmerdict[kmerMin[0]]
6

If writing __getitem__ looks too weird, you can still use a lambda for the key function: kmerMin = sorted(topkmerdict, key=lambda k: topkmerdict[k]).

FYI, if all you want is the smallest element, there is no need to do a full sort. The min() function would be cleaner, clearer, and much more efficient:

>>> min(topkmerdict, key=topkmerdict.__getitem__)
'ATCCCAGCACTTTGGGAGGCCGAGGCAGGT'
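If both the key and its count are wanted, or a handful of the smallest entries rather than just one, something along these lines may help. This is only a sketch: it uses a three-entry subset of the dict above, and the variable names are illustrative.

```python
import heapq

# A subset of the example dict above, shortened for illustration.
topkmerdict = {
    'ATCCCAGCACTTTGGGAGGCCGAGGCAGGT': 6,
    'CTGTAATCCCAGCACTTTGGGAGGCCGAGG': 71,
    'AGCACTTTGGGAGGCCGAGGCAGGTGGATC': 8,
}

# min() over the items returns the (key, value) pair with the smallest value.
kmer, count = min(topkmerdict.items(), key=lambda kv: kv[1])
print(kmer)   # ATCCCAGCACTTTGGGAGGCCGAGGCAGGT
print(count)  # 6

# heapq.nsmallest returns the n smallest keys without sorting everything.
two_smallest = heapq.nsmallest(2, topkmerdict, key=topkmerdict.__getitem__)
print(two_smallest)
```

Note that when several keys share the minimum value, min() returns only one of them.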

I think you will see interesting results by changing:

key=lambda x: x[1]

to:

key=lambda x: x

This will sort your keys based on the entire key and not just the second character.
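Keep in mind that key=lambda x: x is the same as passing no key at all: the keys come back in alphabetical order, which is not the same as ordering by value. A small sketch with hypothetical data:

```python
# Hypothetical data for illustration only.
d = {"b": 6, "c": 3, "a": 12}

# An identity key function sorts by the whole key, same as plain sorted(d).
by_whole_key = sorted(d, key=lambda x: x)
print(by_whole_key)  # ['a', 'b', 'c']

# Sorting by value gives a different order.
by_value = sorted(d, key=d.__getitem__)
print(by_value)  # ['c', 'b', 'a']
```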

