
How to sort objects by multiple keys in Python?

Or, practically, how can I sort a list of dictionaries by multiple keys?

I have a list of dicts:

b = [{u'TOT_PTS_Misc': u'Utley, Alex', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Russo, Brandon', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Chappell, Justin', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Foster, Toney', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Lawson, Roman', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Lempke, Sam', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Gnezda, Alex', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Kirks, Damien', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Worden, Tom', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Korecz, Mike', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Swartz, Brian', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Burgess, Randy', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Smugala, Ryan', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Harmon, Gary', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Blasinsky, Scott', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Carter III, Laymon', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Coleman, Johnathan', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Venditti, Nick', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Blackwell, Devon', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Kovach, Alex', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Bolden, Antonio', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Smith, Ryan', u'Total_Points': 60.0}]

and I need to use a multi-key sort, reversed by Total_Points, then not reversed by TOT_PTS_Misc.

This can be done at the command prompt like so:

a = sorted(b, key=lambda d: (-d['Total_Points'], d['TOT_PTS_Misc']))

But I have to run this through a function, where I pass in the list and the sort keys. For example, def multikeysort(dict_list, sortkeys):.

How can the lambda line be used to sort the list for an arbitrary number of keys passed in to the multikeysort function, taking into consideration that sortkeys may have any number of keys and that those needing a reversed sort will be identified with a '-' before them?

This answer works for any kind of column in the dictionary; the negated column need not be a number.

def multikeysort(items, columns):
    from operator import itemgetter
    comparers = [((itemgetter(col[1:].strip()), -1) if col.startswith('-') else
                  (itemgetter(col.strip()), 1)) for col in columns]
    def comparer(left, right):
        for fn, mult in comparers:
            result = cmp(fn(left), fn(right))
            if result:
                return mult * result
        # Every column compared equal.
        return 0
    return sorted(items, cmp=comparer)

You can call it like this:

b = [{u'TOT_PTS_Misc': u'Utley, Alex', u'Total_Points': 96.0},
     {u'TOT_PTS_Misc': u'Russo, Brandon', u'Total_Points': 96.0},
     {u'TOT_PTS_Misc': u'Chappell, Justin', u'Total_Points': 96.0},
     {u'TOT_PTS_Misc': u'Foster, Toney', u'Total_Points': 80.0},
     {u'TOT_PTS_Misc': u'Lawson, Roman', u'Total_Points': 80.0},
     {u'TOT_PTS_Misc': u'Lempke, Sam', u'Total_Points': 80.0},
     {u'TOT_PTS_Misc': u'Gnezda, Alex', u'Total_Points': 78.0},
     {u'TOT_PTS_Misc': u'Kirks, Damien', u'Total_Points': 78.0},
     {u'TOT_PTS_Misc': u'Worden, Tom', u'Total_Points': 78.0},
     {u'TOT_PTS_Misc': u'Korecz, Mike', u'Total_Points': 78.0},
     {u'TOT_PTS_Misc': u'Swartz, Brian', u'Total_Points': 66.0},
     {u'TOT_PTS_Misc': u'Burgess, Randy', u'Total_Points': 66.0},
     {u'TOT_PTS_Misc': u'Smugala, Ryan', u'Total_Points': 66.0},
     {u'TOT_PTS_Misc': u'Harmon, Gary', u'Total_Points': 66.0},
     {u'TOT_PTS_Misc': u'Blasinsky, Scott', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Carter III, Laymon', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Coleman, Johnathan', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Venditti, Nick', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Blackwell, Devon', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Kovach, Alex', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Bolden, Antonio', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Smith, Ryan', u'Total_Points': 60.0}]

a = multikeysort(b, ['-Total_Points', 'TOT_PTS_Misc'])
for item in a:
    print item

Try it with either column negated. You will see the sort order reverse.

Next: change it so it does not use extra class....

2016-01-17

Taking my inspiration from the answer to "What is the best way to get the first item from an iterable matching a condition?", I shortened the code:

from operator import itemgetter as i

def multikeysort(items, columns):
    comparers = [
        ((i(col[1:].strip()), -1) if col.startswith('-') else (i(col.strip()), 1))
        for col in columns
    ]
    def comparer(left, right):
        comparer_iter = (
            cmp(fn(left), fn(right)) * mult
            for fn, mult in comparers
        )
        return next((result for result in comparer_iter if result), 0)
    return sorted(items, cmp=comparer)

In case you like your code terse.

Later 2016-01-17

This works with Python 3 (which eliminated the cmp argument to sort):

from operator import itemgetter as i
from functools import cmp_to_key

def cmp(x, y):
    """
    Replacement for built-in function cmp that was removed in Python 3

    Compare the two objects x and y and return an integer according to
    the outcome. The return value is negative if x < y, zero if x == y
    and strictly positive if x > y.
    """

    return (x > y) - (x < y)

def multikeysort(items, columns):
    comparers = [
        ((i(col[1:].strip()), -1) if col.startswith('-') else (i(col.strip()), 1))
        for col in columns
    ]
    def comparer(left, right):
        comparer_iter = (
            cmp(fn(left), fn(right)) * mult
            for fn, mult in comparers
        )
        return next((result for result in comparer_iter if result), 0)
    return sorted(items, key=cmp_to_key(comparer))

Inspired by the answer to "How should I do custom sort in Python 3?"
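To check the claim that the negated column need not be a number, here is a self-contained sketch of the same comparator approach sorting on a negated string column. The name `multikeysort_cmp` and the sample `rows` are made up for illustration; the logic is meant to mirror the answer above, not replace it.

```python
from functools import cmp_to_key
from operator import itemgetter


def cmp(x, y):
    """Three-way comparison: negative, zero or positive."""
    return (x > y) - (x < y)


def multikeysort_cmp(items, columns):
    # Pair each column with a getter and a direction (+1 ascending, -1 descending).
    comparers = [
        (itemgetter(col[1:].strip()), -1) if col.startswith('-')
        else (itemgetter(col.strip()), 1)
        for col in columns
    ]

    def comparer(left, right):
        # The first column that differs decides the order.
        for fn, mult in comparers:
            result = cmp(fn(left), fn(right))
            if result:
                return mult * result
        return 0

    return sorted(items, key=cmp_to_key(comparer))


# Hypothetical sample data, not taken from the question.
rows = [
    {'name': 'Utley, Alex', 'points': 96.0},
    {'name': 'Foster, Toney', 'points': 80.0},
    {'name': 'Russo, Brandon', 'points': 96.0},
]

# Points ascending, then names in reverse alphabetical order.
result = multikeysort_cmp(rows, ['points', '-name'])
names = [row['name'] for row in result]
# names == ['Foster, Toney', 'Utley, Alex', 'Russo, Brandon']
```

Since the direction is applied to the comparison result rather than to the value itself, strings and other orderable types can be sorted descending without any negation trick.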

This article has a nice rundown on various techniques for doing this. If your requirements are simpler than "full bidirectional multikey", take a look. It's clear the accepted answer and the blog post I just referenced influenced each other in some way, though I don't know in which order.

In case the link dies, here's a very quick synopsis of examples not covered above:

mylist = sorted(mylist, key=itemgetter('name', 'age'))
mylist = sorted(mylist, key=lambda k: (k['name'].lower(), k['age']))
mylist = sorted(mylist, key=lambda k: (k['name'].lower(), -k['age']))

I know this is a rather old question, but none of the answers mention that Python guarantees a stable sort order for its sorting routines such as list.sort() and sorted(), which means items that compare equal retain their original order.

This means that the equivalent of ORDER BY name ASC, age DESC (using SQL notation) for a list of dictionaries can be done like this:

items.sort(key=operator.itemgetter('age'), reverse=True)
items.sort(key=operator.itemgetter('name'))

Note how the items are first sorted by the "lesser" attribute age (descending), then by the "major" attribute name, leading to the correct final order.

The reversing/inverting works for all orderable types, not just numbers which you can negate by putting a minus sign in front.

And because of the Timsort algorithm used in (at least) CPython, this is actually rather fast in practice.
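The same idea can be wrapped in the function signature the question asks for. The following is only a sketch: `multikeysort_stable` is a hypothetical name and the `people` list is invented sample data. It runs one stable pass per key, starting from the least significant key.

```python
from operator import itemgetter


def multikeysort_stable(items, columns):
    """Sort by several keys using one stable pass per key."""
    result = list(items)  # work on a copy, leave the input untouched
    # Apply the least significant key first and the most significant last.
    for col in reversed(columns):
        col = col.strip()
        descending = col.startswith('-')
        name = col[1:].strip() if descending else col
        result.sort(key=itemgetter(name), reverse=descending)
    return result


# Hypothetical sample data.
people = [
    {'name': 'Bob', 'age': 25},
    {'name': 'Ann', 'age': 30},
    {'name': 'Bob', 'age': 40},
    {'name': 'Ann', 'age': 22},
]

# Equivalent of ORDER BY name ASC, age DESC.
ordered = multikeysort_stable(people, ['name', '-age'])
pairs = [(p['name'], p['age']) for p in ordered]
# pairs == [('Ann', 30), ('Ann', 22), ('Bob', 40), ('Bob', 25)]
```

The cost is one full sort per key, which is usually acceptable for a handful of keys.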

def sortkeypicker(keynames):
    negate = set()
    for i, k in enumerate(keynames):
        if k[:1] == '-':
            keynames[i] = k[1:]
            negate.add(k[1:])  # remember which keys must be negated
    def getit(adict):
       composite = [adict[k] for k in keynames]
       for i, (k, v) in enumerate(zip(keynames, composite)):
           if k in negate:
               composite[i] = -v
       return composite
    return getit

a = sorted(b, key=sortkeypicker(['-Total_Points', 'TOT_PTS_Misc']))
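Because this key function negates values with a unary minus, descending order only works for numeric columns. One possible way around that, sketched here with a hypothetical `Reversed` wrapper class and a hypothetical `sortkeypicker_any` function (neither is part of the answer above), is to wrap the value in an object whose comparisons are inverted:

```python
from functools import total_ordering


@total_ordering
class Reversed:
    """Wrap a value so that it compares in the opposite order."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        # Deliberately inverted: "smaller" means the wrapped value is larger.
        return self.value > other.value


def sortkeypicker_any(keynames):
    # Parse once into (name, descending) pairs without mutating the input.
    specs = [(k[1:], True) if k.startswith('-') else (k, False)
             for k in keynames]

    def getit(adict):
        return tuple(Reversed(adict[name]) if desc else adict[name]
                     for name, desc in specs)

    return getit


# Hypothetical sample data with a string column sorted descending.
rows = [
    {'name': 'Ann', 'team': 'red'},
    {'name': 'Bob', 'team': 'blue'},
    {'name': 'Cid', 'team': 'red'},
]
ordered = sorted(rows, key=sortkeypicker_any(['-team', 'name']))
names = [r['name'] for r in ordered]
# names == ['Ann', 'Cid', 'Bob']
```

This keeps a single sorted() call with a key function, at the price of creating one small wrapper object per descending value.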

I had a similar issue today - I had to sort dictionary items by descending numeric values and by ascending string values. To solve the issue of conflicting directions, I negated the integer values.

Here's a variant of my solution, as applicable to the OP:

sorted(b, key=lambda e: (-e['Total_Points'], e['TOT_PTS_Misc']))

Very simple, and works like a charm:

[{'TOT_PTS_Misc': 'Chappell, Justin', 'Total_Points': 96.0},
 {'TOT_PTS_Misc': 'Russo, Brandon', 'Total_Points': 96.0},
 {'TOT_PTS_Misc': 'Utley, Alex', 'Total_Points': 96.0},
 {'TOT_PTS_Misc': 'Foster, Toney', 'Total_Points': 80.0},
 {'TOT_PTS_Misc': 'Lawson, Roman', 'Total_Points': 80.0},
 {'TOT_PTS_Misc': 'Lempke, Sam', 'Total_Points': 80.0},
 {'TOT_PTS_Misc': 'Gnezda, Alex', 'Total_Points': 78.0},
 {'TOT_PTS_Misc': 'Kirks, Damien', 'Total_Points': 78.0},
 {'TOT_PTS_Misc': 'Korecz, Mike', 'Total_Points': 78.0},
 {'TOT_PTS_Misc': 'Worden, Tom', 'Total_Points': 78.0},
 {'TOT_PTS_Misc': 'Burgess, Randy', 'Total_Points': 66.0},
 {'TOT_PTS_Misc': 'Harmon, Gary', 'Total_Points': 66.0},
 {'TOT_PTS_Misc': 'Smugala, Ryan', 'Total_Points': 66.0},
 {'TOT_PTS_Misc': 'Swartz, Brian', 'Total_Points': 66.0},
 {'TOT_PTS_Misc': 'Blackwell, Devon', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Blasinsky, Scott', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Bolden, Antonio', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Carter III, Laymon', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Coleman, Johnathan', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Kovach, Alex', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Smith, Ryan', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Venditti, Nick', 'Total_Points': 60.0}]

I use the following for sorting a 2d array on a number of columns:

def k(a,b):
    def _k(item):
        return (item[a],item[b])
    return _k

This could be extended to work on an arbitrary number of items. I tend to think finding a better access pattern to your sortable keys is better than writing a fancy comparator.
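As a rough sketch of that extension, the key factory can accept any number of column indices. The name `column_key` and the `table` data are assumptions made for this example.

```python
def column_key(*indices):
    """Build a key function returning a tuple of the chosen columns."""
    def _key(row):
        return tuple(row[i] for i in indices)
    return _key


table = [[0, 1, 2, 3, 4], [0, 2, 3, 4, 5], [1, 0, 2, 3, 4]]

# Sort on column 2, then column 1, then column 0.
by_three = sorted(table, key=column_key(2, 1, 0))
# by_three == [[1, 0, 2, 3, 4], [0, 1, 2, 3, 4], [0, 2, 3, 4, 5]]
```

For two or more indices, `operator.itemgetter(2, 1, 0)` from the standard library returns the same tuples, so the helper is mostly useful when you want to add your own logic per column.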

>>> data = [[0,1,2,3,4],[0,2,3,4,5],[1,0,2,3,4]]
>>> sorted(data, key=k(0,1))
[[0, 1, 2, 3, 4], [0, 2, 3, 4, 5], [1, 0, 2, 3, 4]]
>>> sorted(data, key=k(1,0))
[[1, 0, 2, 3, 4], [0, 1, 2, 3, 4], [0, 2, 3, 4, 5]]
>>> sorted(data, key=k(2,0))
[[0, 1, 2, 3, 4], [1, 0, 2, 3, 4], [0, 2, 3, 4, 5]]

from operator import itemgetter
from functools import partial

def _neg_itemgetter(key, d):
    return -d[key]

def key_getter(key_expr):
    keys = key_expr.split(",")
    getters = []
    for k in keys:
        k = k.strip()
        if k.startswith("-"):
           getters.append(partial(_neg_itemgetter, k[1:]))
        else:
           getters.append(itemgetter(k))

    def keyfunc(dct):
        return [kg(dct) for kg in getters]

    return keyfunc

def multikeysort(dict_list, sortkeys):
    return sorted(dict_list, key=key_getter(sortkeys))


>>> multikeysort([{u'TOT_PTS_Misc': u'Utley, Alex', u'Total_Points': 60.0},
                 {u'TOT_PTS_Misc': u'Russo, Brandon', u'Total_Points': 96.0}, 
                 {u'TOT_PTS_Misc': u'Chappell, Justin', u'Total_Points': 96.0}],
                 "-Total_Points, TOT_PTS_Misc")
[{u'Total_Points': 96.0, u'TOT_PTS_Misc': u'Chappell, Justin'}, 
 {u'Total_Points': 96.0, u'TOT_PTS_Misc': u'Russo, Brandon'}, 
 {u'Total_Points': 60.0, u'TOT_PTS_Misc': u'Utley, Alex'}]

The parsing is a bit fragile, but at least it allows for variable number of spaces between the keys.

Since you're already comfortable with lambda, here's a less verbose solution.

>>> def itemgetter(*names):
    return lambda mapping: tuple(-mapping[name[1:]] if name.startswith('-') else mapping[name] for name in names)

>>> itemgetter('a', '-b')({'a': 1, 'b': 2})
(1, -2)
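To plug this into the function signature from the question, something along these lines should work. It is a sketch only: the helper is renamed to the hypothetical `signed_itemgetter` to avoid shadowing `operator.itemgetter`, and the list `b` is a shortened, reordered sample of the question's data.

```python
def signed_itemgetter(*names):
    # A leading '-' negates the value, so this only suits numeric columns.
    return lambda mapping: tuple(
        -mapping[name[1:]] if name.startswith('-') else mapping[name]
        for name in names
    )


def multikeysort(dict_list, sortkeys):
    return sorted(dict_list, key=signed_itemgetter(*sortkeys))


b = [
    {'TOT_PTS_Misc': 'Utley, Alex', 'Total_Points': 96.0},
    {'TOT_PTS_Misc': 'Foster, Toney', 'Total_Points': 80.0},
    {'TOT_PTS_Misc': 'Chappell, Justin', 'Total_Points': 96.0},
]

a = multikeysort(b, ['-Total_Points', 'TOT_PTS_Misc'])
names = [d['TOT_PTS_Misc'] for d in a]
# names == ['Chappell, Justin', 'Utley, Alex', 'Foster, Toney']
```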
