簡體   English   中英

如何在Python中按多個鍵對對象進行排序?

[英]How to sort objects by multiple keys in Python?

或者,實際上,如何按多個鍵對字典列表進行排序?

我有一個字典列表:

b = [{u'TOT_PTS_Misc': u'Utley, Alex', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Russo, Brandon', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Chappell, Justin', u'Total_Points': 96.0},
 {u'TOT_PTS_Misc': u'Foster, Toney', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Lawson, Roman', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Lempke, Sam', u'Total_Points': 80.0},
 {u'TOT_PTS_Misc': u'Gnezda, Alex', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Kirks, Damien', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Worden, Tom', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Korecz, Mike', u'Total_Points': 78.0},
 {u'TOT_PTS_Misc': u'Swartz, Brian', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Burgess, Randy', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Smugala, Ryan', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Harmon, Gary', u'Total_Points': 66.0},
 {u'TOT_PTS_Misc': u'Blasinsky, Scott', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Carter III, Laymon', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Coleman, Johnathan', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Venditti, Nick', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Blackwell, Devon', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Kovach, Alex', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Bolden, Antonio', u'Total_Points': 60.0},
 {u'TOT_PTS_Misc': u'Smith, Ryan', u'Total_Points': 60.0}]

我需要使用由 Total_Points 反轉的多鍵排序,然后不由TOT_PTS_Misc反轉。

這可以在命令提示符下完成,如下所示:

a = sorted(b, key=lambda d: (-d['Total_Points'], d['TOT_PTS_Misc']))

但是我必須通過一個函數來運行它,在那里我傳入列表和排序鍵。 例如, def multikeysort(dict_list, sortkeys):

如何使用 lambda 行對列表進行排序,對於傳入 multikeysort 函數的任意數量的鍵,並考慮到 sortkeys 可能具有任意數量的鍵,並且將識別需要反向排序的鍵前面有'-'嗎?

此答案適用於字典中的任何類型的列——否定列不必是數字。

def multikeysort(items, columns):
    from operator import itemgetter
    comparers = [((itemgetter(col[1:].strip()), -1) if col.startswith('-') else
                  (itemgetter(col.strip()), 1)) for col in columns]
    def comparer(left, right):
        for fn, mult in comparers:
            result = cmp(fn(left), fn(right))
            if result:
                return mult * result
        else:
            return 0
    return sorted(items, cmp=comparer)

你可以這樣稱呼它:

b = [{u'TOT_PTS_Misc': u'Utley, Alex', u'Total_Points': 96.0},
     {u'TOT_PTS_Misc': u'Russo, Brandon', u'Total_Points': 96.0},
     {u'TOT_PTS_Misc': u'Chappell, Justin', u'Total_Points': 96.0},
     {u'TOT_PTS_Misc': u'Foster, Toney', u'Total_Points': 80.0},
     {u'TOT_PTS_Misc': u'Lawson, Roman', u'Total_Points': 80.0},
     {u'TOT_PTS_Misc': u'Lempke, Sam', u'Total_Points': 80.0},
     {u'TOT_PTS_Misc': u'Gnezda, Alex', u'Total_Points': 78.0},
     {u'TOT_PTS_Misc': u'Kirks, Damien', u'Total_Points': 78.0},
     {u'TOT_PTS_Misc': u'Worden, Tom', u'Total_Points': 78.0},
     {u'TOT_PTS_Misc': u'Korecz, Mike', u'Total_Points': 78.0},
     {u'TOT_PTS_Misc': u'Swartz, Brian', u'Total_Points': 66.0},
     {u'TOT_PTS_Misc': u'Burgess, Randy', u'Total_Points': 66.0},
     {u'TOT_PTS_Misc': u'Smugala, Ryan', u'Total_Points': 66.0},
     {u'TOT_PTS_Misc': u'Harmon, Gary', u'Total_Points': 66.0},
     {u'TOT_PTS_Misc': u'Blasinsky, Scott', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Carter III, Laymon', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Coleman, Johnathan', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Venditti, Nick', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Blackwell, Devon', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Kovach, Alex', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Bolden, Antonio', u'Total_Points': 60.0},
     {u'TOT_PTS_Misc': u'Smith, Ryan', u'Total_Points': 60.0}]

a = multikeysort(b, ['-Total_Points', 'TOT_PTS_Misc'])
for item in a:
    print item

嘗試否定任一列。 您將看到排序順序相反。

下一步:更改它,使其不使用額外的類....


2016-01-17

從這個答案中汲取靈感 從匹配條件的可迭代對象中獲取第一個項目的最佳方法是什么? ,我縮短了代碼:

from operator import itemgetter as i

def multikeysort(items, columns):
    comparers = [
        ((i(col[1:].strip()), -1) if col.startswith('-') else (i(col.strip()), 1))
        for col in columns
    ]
    def comparer(left, right):
        comparer_iter = (
            cmp(fn(left), fn(right)) * mult
            for fn, mult in comparers
        )
        return next((result for result in comparer_iter if result), 0)
    return sorted(items, cmp=comparer)

如果您喜歡簡潔的代碼。


稍后 2016-01-17

這適用於 python3(它消除了sortcmp參數):

from operator import itemgetter as i
from functools import cmp_to_key

def cmp(x, y):
    """
    Replacement for built-in function cmp that was removed in Python 3

    Compare the two objects x and y and return an integer according to
    the outcome. The return value is negative if x < y, zero if x == y
    and strictly positive if x > y.

    https://portingguide.readthedocs.io/en/latest/comparisons.html#the-cmp-function
    """

    return (x > y) - (x < y)

def multikeysort(items, columns):
    comparers = [
        ((i(col[1:].strip()), -1) if col.startswith('-') else (i(col.strip()), 1))
        for col in columns
    ]
    def comparer(left, right):
        comparer_iter = (
            cmp(fn(left), fn(right)) * mult
            for fn, mult in comparers
        )
        return next((result for result in comparer_iter if result), 0)
    return sorted(items, key=cmp_to_key(comparer))

受此答案啟發我應該如何在 Python 3 中進行自定義排序?

本文對執行此操作的各種技術進行了很好的概述。 如果您的要求比“全雙向多鍵”更簡單,請查看。 很明顯,接受的答案和我剛剛引用的博客文章在某種程度上相互影響,盡管我不知道哪個順序。

如果鏈接失效,這里有一個上面未涵蓋的示例的非常快速的概要:

mylist = sorted(mylist, key=itemgetter('name', 'age'))
mylist = sorted(mylist, key=lambda k: (k['name'].lower(), k['age']))
mylist = sorted(mylist, key=lambda k: (k['name'].lower(), -k['age']))

我知道這是一個相當古老的問題,但沒有一個答案提到 Python 保證其排序例程的穩定排序順序,例如list.sort()sorted() ,這意味着比較相等的項目保留其原始順序。

這意味着字典列表的ORDER BY name ASC, age DESC (使用 SQL 符號)的等價物可以這樣完成:

items.sort(key=operator.itemgetter('age'), reverse=True)
items.sort(key=operator.itemgetter('name'))

請注意項目如何首先按“較小”屬性age (降序)排序,然后按“主要”屬性name排序,從而得出正確的最終順序。

反轉/反轉適用於所有可排序類型,而不僅僅是您可以通過在前面放置減號來否定的數字。

而且由於(至少)CPython 中使用了 Timsort 算法,這實際上在實踐中相當快。

def sortkeypicker(keynames):
    negate = set()
    for i, k in enumerate(keynames):
        if k[:1] == '-':
            keynames[i] = k[1:]
            negate.add(k[1:])
    def getit(adict):
       composite = [adict[k] for k in keynames]
       for i, (k, v) in enumerate(zip(keynames, composite)):
           if k in negate:
               composite[i] = -v
       return composite
    return getit

a = sorted(b, key=sortkeypicker(['-Total_Points', 'TOT_PTS_Misc']))

我今天遇到了類似的問題 - 我必須通過降序數值和升序字符串值對字典項進行排序。 為了解決方向沖突的問題,我否定了整數值。

這是我的解決方案的一個變體 - 適用於 OP

sorted(b, key=lambda e: (-e['Total_Points'], e['TOT_PTS_Misc']))

非常簡單 - 就像一個魅力

[{'TOT_PTS_Misc': 'Chappell, Justin', 'Total_Points': 96.0},
 {'TOT_PTS_Misc': 'Russo, Brandon', 'Total_Points': 96.0},
 {'TOT_PTS_Misc': 'Utley, Alex', 'Total_Points': 96.0},
 {'TOT_PTS_Misc': 'Foster, Toney', 'Total_Points': 80.0},
 {'TOT_PTS_Misc': 'Lawson, Roman', 'Total_Points': 80.0},
 {'TOT_PTS_Misc': 'Lempke, Sam', 'Total_Points': 80.0},
 {'TOT_PTS_Misc': 'Gnezda, Alex', 'Total_Points': 78.0},
 {'TOT_PTS_Misc': 'Kirks, Damien', 'Total_Points': 78.0},
 {'TOT_PTS_Misc': 'Korecz, Mike', 'Total_Points': 78.0},
 {'TOT_PTS_Misc': 'Worden, Tom', 'Total_Points': 78.0},
 {'TOT_PTS_Misc': 'Burgess, Randy', 'Total_Points': 66.0},
 {'TOT_PTS_Misc': 'Harmon, Gary', 'Total_Points': 66.0},
 {'TOT_PTS_Misc': 'Smugala, Ryan', 'Total_Points': 66.0},
 {'TOT_PTS_Misc': 'Swartz, Brian', 'Total_Points': 66.0},
 {'TOT_PTS_Misc': 'Blackwell, Devon', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Blasinsky, Scott', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Bolden, Antonio', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Carter III, Laymon', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Coleman, Johnathan', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Kovach, Alex', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Smith, Ryan', 'Total_Points': 60.0},
 {'TOT_PTS_Misc': 'Venditti, Nick', 'Total_Points': 60.0}]

我使用以下內容對多列上的二維數組進行排序

def k(a,b):
    def _k(item):
        return (item[a],item[b])
    return _k

這可以擴展到處理任意數量的項目。 我傾向於認為為可排序鍵找到更好的訪問模式比編寫花哨的比較器要好。

>>> data = [[0,1,2,3,4],[0,2,3,4,5],[1,0,2,3,4]]
>>> sorted(data, key=k(0,1))
[[0, 1, 2, 3, 4], [0, 2, 3, 4, 5], [1, 0, 2, 3, 4]]
>>> sorted(data, key=k(1,0))
[[1, 0, 2, 3, 4], [0, 1, 2, 3, 4], [0, 2, 3, 4, 5]]
>>> sorted(a, key=k(2,0))
[[0, 1, 2, 3, 4], [1, 0, 2, 3, 4], [0, 2, 3, 4, 5]]
from operator import itemgetter
from functools import partial

def _neg_itemgetter(key, d):
    return -d[key]

def key_getter(key_expr):
    keys = key_expr.split(",")
    getters = []
    for k in keys:
        k = k.strip()
        if k.startswith("-"):
           getters.append(partial(_neg_itemgetter, k[1:]))
        else:
           getters.append(itemgetter(k))

    def keyfunc(dct):
        return [kg(dct) for kg in getters]

    return keyfunc

def multikeysort(dict_list, sortkeys):
    return sorted(dict_list, key = key_getter(sortkeys)

示范:

>>> multikeysort([{u'TOT_PTS_Misc': u'Utley, Alex', u'Total_Points': 60.0},
                 {u'TOT_PTS_Misc': u'Russo, Brandon', u'Total_Points': 96.0}, 
                 {u'TOT_PTS_Misc': u'Chappell, Justin', u'Total_Points': 96.0}],
                "-Total_Points,TOT_PTS_Misc")
[{u'Total_Points': 96.0, u'TOT_PTS_Misc': u'Chappell, Justin'}, 
 {u'Total_Points': 96.0, u'TOT_PTS_Misc': u'Russo, Brandon'}, 
 {u'Total_Points': 60.0, u'TOT_PTS_Misc': u'Utley, Alex'}]

解析有點脆弱,但至少它允許鍵之間有可變數量的空格。

由於您已經對 lambda 感到滿意,因此這里有一個不那么冗長的解決方案。

>>> def itemgetter(*names):
    return lambda mapping: tuple(-mapping[name[1:]] if name.startswith('-') else mapping[name] for name in names)

>>> itemgetter('a', '-b')({'a': 1, 'b': 2})
(1, -2)

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM