How do I sort a list of dictionaries by a value of the dictionary?

How do I sort a list of dictionaries by a specific key's value? Given:

[{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

When sorted by name, it should become:

[{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]

The sorted() function takes a key= parameter:

newlist = sorted(list_to_be_sorted, key=lambda d: d['name']) 

Alternatively, you can use operator.itemgetter instead of defining the function yourself:

from operator import itemgetter
newlist = sorted(list_to_be_sorted, key=itemgetter('name')) 
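As a quick check that the two key functions behave the same, here is a minimal sketch using the data from the question. The variable names people, by_lambda and by_itemgetter are only illustrative.

```python
from operator import itemgetter

# Illustrative data taken from the question.
people = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

by_lambda = sorted(people, key=lambda d: d['name'])
by_itemgetter = sorted(people, key=itemgetter('name'))

# Both calls return a new list ordered by the 'name' value.
print(by_lambda == by_itemgetter)  # True
print(by_itemgetter)
# [{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]
```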

For completeness, add reverse=True to sort in descending order:

newlist = sorted(list_to_be_sorted, key=itemgetter('name'), reverse=True)

import operator

To sort the list of dictionaries by key='name':

list_of_dicts.sort(key=operator.itemgetter('name'))

To sort the list of dictionaries by key='age':

list_of_dicts.sort(key=operator.itemgetter('age'))

my_list = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

# Python 2 only: passes a comparison function built on cmp()
my_list.sort(lambda x,y : cmp(x['name'], y['name']))

my_list will now be what you want.
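On Python 3, where cmp() and the positional comparison argument no longer exist, roughly the same comparison-based sort can be sketched with functools.cmp_to_key. The helper name compare_names is hypothetical and not part of the original answer.

```python
from functools import cmp_to_key

my_list = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

def compare_names(x, y):
    # Hypothetical helper: returns a negative number, zero, or a positive
    # number, like Python 2's cmp(x['name'], y['name']).
    return (x['name'] > y['name']) - (x['name'] < y['name'])

# cmp_to_key wraps the old-style comparison so it can be passed as key=.
my_list.sort(key=cmp_to_key(compare_names))
print(my_list)
# [{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]
```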

Or better:

Since Python 2.4, there's a key argument, which is both more efficient and neater:

my_list = sorted(my_list, key=lambda k: k['name'])

...the lambda is, IMO, easier to understand than operator.itemgetter, but your mileage may vary.

If you want to sort the list by multiple keys, you can do the following:

my_list = [{'name':'Homer', 'age':39}, {'name':'Milhouse', 'age':10}, {'name':'Bart', 'age':10} ]
sortedlist = sorted(my_list , key=lambda elem: "%02d %s" % (elem['age'], elem['name']))

It is rather hackish, since it relies on converting the values into a single string representation for comparison. It works for non-negative numbers as long as they are zero-padded to a common width, but negative numbers do not sort correctly this way (for example, "-1" compares as smaller than "-5"). A tuple key such as (elem['age'], elem['name']) avoids the string conversion altogether.
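To make the padding requirement concrete, here is a small sketch that compares the string key with a plain tuple key. The extra 'Maggie' record and the variable names are assumptions added for illustration.

```python
people = [
    {'name': 'Homer', 'age': 39},
    {'name': 'Milhouse', 'age': 10},
    {'name': 'Bart', 'age': 10},
    {'name': 'Maggie', 'age': 1},  # illustrative extra record
]

# String key from the answer: correct here only because of the zero padding.
by_string = sorted(people, key=lambda elem: "%02d %s" % (elem['age'], elem['name']))

# Tuple key: compares age as a number first, then name as a string.
by_tuple = sorted(people, key=lambda elem: (elem['age'], elem['name']))

print(by_string == by_tuple)  # True
print([p['name'] for p in by_tuple])
# ['Maggie', 'Bart', 'Milhouse', 'Homer']

# Without padding, the string comparison puts "10" before "9".
print(sorted([9, 10], key=lambda n: "%d" % n))  # [10, 9]
```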

a = [{'name':'Homer', 'age':39}, ...]

# This changes the list a
a.sort(key=lambda k : k['name'])

# This returns a new list (a is not modified)
sorted(a, key=lambda k : k['name'])

import operator
a_list_of_dicts.sort(key=operator.itemgetter('name'))

'key' is used to sort by an arbitrary value, and 'itemgetter' sets that value to each item's 'name' entry (a dictionary key, not an attribute).
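To see what itemgetter actually hands to the sort, it can be called directly on a single dictionary. This is only an illustrative sketch; the names get_name, get_age_and_name and homer are not from the original answer.

```python
import operator

get_name = operator.itemgetter('name')
homer = {'name': 'Homer', 'age': 39}

# itemgetter('name') builds a callable equivalent to lambda d: d['name'].
print(get_name(homer))  # Homer

# With several keys it returns a tuple, which sorts field by field.
get_age_and_name = operator.itemgetter('age', 'name')
print(get_age_and_name(homer))  # (39, 'Homer')
```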

I guess you meant:

[{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

This would be sorted like this:

# Python 2 only: the cmp argument was removed in Python 3
sorted(l, cmp=lambda x, y: cmp(x['name'], y['name']))

Using the Schwartzian transform from Perl,

py = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

do

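sort_on = "name"
# Decorate with (sort value, original position, dict); the position breaks ties
# so that two dicts are never compared directly (a TypeError on Python 3).
decorated = [(dict_[sort_on], index, dict_) for index, dict_ in enumerate(py)]
decorated.sort()
result = [dict_ for (key, index, dict_) in decorated]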

gives

>>> result
[{'age': 10, 'name': 'Bart'}, {'age': 39, 'name': 'Homer'}]

More on the Perl Schwartzian transform:

In computer science, the Schwartzian transform is a Perl programming idiom used to improve the efficiency of sorting a list of items. This idiom is appropriate for comparison-based sorting when the ordering is actually based on the ordering of a certain property (the key) of the elements, where computing that property is an intensive operation that should be performed a minimal number of times. The Schwartzian Transform is notable in that it does not use named temporary arrays.

You could use a custom comparison function, or you could pass in a function that calculates a custom sort key. That's usually more efficient as the key is only calculated once per item, while the comparison function would be called many more times.
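The difference in call counts can be observed directly. The following is a rough sketch that counts calls on 100 shuffled records; the data, the calls counter and the helper names are all assumptions made for illustration, and the comparison function is adapted for Python 3 with functools.cmp_to_key.

```python
import random
from functools import cmp_to_key

# 100 illustrative records with distinct ages, shuffled reproducibly.
records = [{'name': 'person%03d' % i, 'age': i} for i in range(100)]
random.Random(0).shuffle(records)

calls = {'key': 0, 'cmp': 0}  # how often each function was invoked

def age_key(d):
    calls['key'] += 1
    return d['age']

def age_cmp(a, b):
    calls['cmp'] += 1
    return (a['age'] > b['age']) - (a['age'] < b['age'])

by_key = sorted(records, key=age_key)
by_cmp = sorted(records, key=cmp_to_key(age_cmp))

print(by_key == by_cmp)  # True
print(calls['key'])      # 100, one call per item
print(calls['cmp'])      # larger than calls['key'] on this shuffled input
```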

You could do it this way:

def mykey(adict): return adict['name']
x = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age':10}]
sorted(x, key=mykey)

But the standard library contains a generic routine for getting items of arbitrary objects: itemgetter. So try this instead:

from operator import itemgetter
x = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age':10}]
sorted(x, key=itemgetter('name'))

You have to implement your own comparison function that will compare the dictionaries by values of name keys. See Sorting Mini-HOW TO from PythonInfo Wiki.

Sometimes we need to use lower(). For example,

lists = [{'name':'Homer', 'age':39},
  {'name':'Bart', 'age':10},
  {'name':'abby', 'age':9}]

lists = sorted(lists, key=lambda k: k['name'])
print(lists)
# [{'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}, {'name':'abby', 'age':9}]

lists = sorted(lists, key=lambda k: k['name'].lower())
print(lists)
# [ {'name':'abby', 'age':9}, {'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}]

Using the Pandas package is another method, though its runtime at large scale is much slower than the more traditional methods proposed by others:

import pandas as pd

listOfDicts = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]
df = pd.DataFrame(listOfDicts)
df = df.sort_values('name')
sorted_listOfDicts = df.T.to_dict().values()

Here are some benchmark values for a tiny list and a large (100k+) list of dicts:

setup_large = "listOfDicts = [];\
[listOfDicts.extend(({'name':'Homer', 'age':39}, {'name':'Bart', 'age':10})) for _ in range(50000)];\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(listOfDicts);"

setup_small = "listOfDicts = [];\
listOfDicts.extend(({'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}));\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(listOfDicts);"

method1 = "newlist = sorted(listOfDicts, key=lambda k: k['name'])"
method2 = "newlist = sorted(listOfDicts, key=itemgetter('name')) "
method3 = "df = df.sort_values('name');\
sorted_listOfDicts = df.T.to_dict().values()"

import timeit
t = timeit.Timer(method1, setup_small)
print('Small Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_small)
print('Small Method LC2: ' + str(t.timeit(100)))
t = timeit.Timer(method3, setup_small)
print('Small Method Pandas: ' + str(t.timeit(100)))

t = timeit.Timer(method1, setup_large)
print('Large Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_large)
print('Large Method LC2: ' + str(t.timeit(100)))
t = timeit.Timer(method3, setup_large)
print('Large Method Pandas: ' + str(t.timeit(1)))

#Small Method LC: 0.000163078308105
#Small Method LC2: 0.000134944915771
#Small Method Pandas: 0.0712950229645
#Large Method LC: 0.0321750640869
#Large Method LC2: 0.0206089019775
#Large Method Pandas: 5.81405615807

Here is the alternative general solution - it sorts elements of a dict by keys and values.

Its advantage: there is no need to specify keys, and it still works if some keys are missing in some of the dictionaries.

def sort_key_func(item):
    """ Helper function used to sort list of dicts

    :param item: dict
    :return: sorted list of tuples (k, v)
    """
    pairs = []
    for k, v in item.items():
        pairs.append((k, v))
    return sorted(pairs)

# A is the list of dicts to sort
sorted(A, key=sort_key_func)
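A short usage sketch may help show what this ordering looks like in practice. The input list is an assumption made for illustration, and the helper is restated in compact form so the snippet runs on its own. Note that the order is decided by key names first, and that values stored under the same key must be comparable with each other.

```python
def sort_key_func(item):
    # Compact equivalent of the helper above: the dict's (key, value)
    # pairs as a sorted list of tuples.
    return sorted(item.items())

# Illustrative input; the second dict has no 'age' key.
A = [{'name': 'Homer', 'age': 39}, {'name': 'Bart'}, {'name': 'Bart', 'age': 10}]

print(sorted(A, key=sort_key_func))
# [{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}, {'name': 'Bart'}]
```

The dict without 'age' ends up last because its first pair starts with 'name', which sorts after 'age'.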

Let's say I have a dictionary D with the elements below. To sort, just use the key argument in sorted to pass a custom function as below:

D = {'eggs': 3, 'ham': 1, 'spam': 2}
def get_count(pair):
    # pair is a (key, value) tuple; avoid naming it `tuple`, which shadows the built-in
    return pair[1]

sorted(D.items(), key = get_count, reverse=True)
# Or
sorted(D.items(), key = lambda x: x[1], reverse=True)  # Avoiding get_count function call
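The same sort can also be written with operator.itemgetter, and the sorted pairs can be turned back into a dictionary. This is a sketch under the assumption of Python 3.7 or later, where dicts preserve insertion order; the names pairs and ordered are illustrative.

```python
from operator import itemgetter

D = {'eggs': 3, 'ham': 1, 'spam': 2}

# itemgetter(1) picks the value out of each (key, value) pair.
pairs = sorted(D.items(), key=itemgetter(1), reverse=True)
print(pairs)  # [('eggs', 3), ('spam', 2), ('ham', 1)]

# On Python 3.7+, dicts keep insertion order, so the sorted order survives.
ordered = dict(pairs)
print(list(ordered))  # ['eggs', 'spam', 'ham']
```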

Check this out.

If you do not need the original list of dictionaries, you could modify it in-place with the sort() method using a custom key function.

Key function:

def get_name(d):
    """ Return the value of a key in a dictionary. """

    return d["name"]

The list to be sorted:

data_one = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

Sorting it in-place:

data_one.sort(key=get_name)

If you need the original list, call the sorted() function, passing it the list and the key function, then assign the returned sorted list to a new variable:

data_two = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
new_data = sorted(data_two, key=get_name)

Printing data_one and new_data:

>>> print(data_one)
[{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]
>>> print(new_data)
[{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]
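One practical difference between the two calls is what they return. The following sketch uses illustrative variable names to show that sorted() leaves its input alone, while list.sort() returns None.

```python
data = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

# sorted() builds a new list and leaves the original untouched.
new_data = sorted(data, key=lambda d: d['name'])
print(data[0]['name'], new_data[0]['name'])  # Homer Bart

# list.sort() reorders the list itself and returns None.
result = data.sort(key=lambda d: d['name'])
print(result)            # None
print(data == new_data)  # True
```

Writing data = data.sort(...) is therefore a common mistake, since it replaces the list with None.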

I have been a big fan of a filter with lambda. However, it is not the best option if you consider time complexity.

First option

sorted_list = sorted(list_to_sort, key= lambda x: x['name'])
# Returns list of values

Second option

list_to_sort.sort(key=operator.itemgetter('name'))
# Edits the list, and does not return a new list

Fast comparison of execution times

# First option
python3.6 -m timeit -s "list_to_sort = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}, {'name':'Faaa', 'age':57}, {'name':'Errr', 'age':20}]" -s "sorted_l=[]" "sorted_l = sorted(list_to_sort, key=lambda e: e['name'])"

1000000 loops, best of 3: 0.736 µsec per loop

# Second option
python3.6 -m timeit -s "list_to_sort = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}, {'name':'Faaa', 'age':57}, {'name':'Errr', 'age':20}]" -s "sorted_l=[]" -s "import operator" "list_to_sort.sort(key=operator.itemgetter('name'))"

1000000 loops, best of 3: 0.438 µsec per loop

If performance is a concern, I would use operator.itemgetter instead of lambda, as built-in functions perform faster than hand-crafted functions. The itemgetter function seems to perform approximately 20% faster than lambda based on my testing.

From https://wiki.python.org/moin/PythonSpeed:

Likewise, the builtin functions run faster than hand-built equivalents. For example, map(operator.add, v1, v2) is faster than map(lambda x,y: x+y, v1, v2).

Here is a comparison of sorting speed using lambda vs itemgetter.

import random
import string
import operator

# Create a list of 100 dicts with random 8-letter names and random ages from 0 to 100.
l = [{'name': ''.join(random.choices(string.ascii_lowercase, k=8)), 'age': random.randint(0, 100)} for i in range(100)]

# Test the performance with a lambda function sorting on name
%timeit sorted(l, key=lambda x: x['name'])
13 µs ± 388 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# Test the performance with itemgetter sorting on name
%timeit sorted(l, key=operator.itemgetter('name'))
10.7 µs ± 38.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# Check that each technique produces the same sort order
sorted(l, key=lambda x: x['name']) == sorted(l, key=operator.itemgetter('name'))
True

Both techniques sort the list in the same order (verified by execution of the final statement in the code block), but the itemgetter version is a little faster (10.7 µs versus 13 µs in the timings above).
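%timeit is an IPython magic. Outside IPython, a similar comparison can be sketched with the standard timeit module. The fixed seed, the repeat counts and the helper names below are assumptions chosen for illustration, and the absolute numbers will differ from machine to machine.

```python
import operator
import random
import string
import timeit

rng = random.Random(0)  # fixed seed so the data is reproducible
l = [{'name': ''.join(rng.choices(string.ascii_lowercase, k=8)),
      'age': rng.randint(0, 100)} for _ in range(100)]

def by_lambda():
    return sorted(l, key=lambda x: x['name'])

def by_itemgetter():
    return sorted(l, key=operator.itemgetter('name'))

# Best of 5 repeats of 1,000 calls each.
t_lambda = min(timeit.repeat(by_lambda, number=1_000, repeat=5))
t_itemgetter = min(timeit.repeat(by_itemgetter, number=1_000, repeat=5))
print('lambda:     %.4f s per 1000 sorts' % t_lambda)
print('itemgetter: %.4f s per 1000 sorts' % t_itemgetter)
```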

As indicated by @Claudiu to @monojohnny in the comment section of this answer, given:

list_to_be_sorted = [
                      {'name':'Homer', 'age':39}, 
                      {'name':'Milhouse', 'age':10}, 
                      {'name':'Bart', 'age':10} 
                    ]

to sort the list of dictionaries by the keys 'age', 'name' (as in the SQL statement ORDER BY age, name), you can use:

newlist = sorted( list_to_be_sorted, key=lambda k: (k['age'], k['name']) )

or, likewise或者,同样

import operator
newlist = sorted( list_to_be_sorted, key=operator.itemgetter('age','name') )

print(newlist)

[{'name': 'Bart', 'age': 10},
 {'name': 'Milhouse', 'age': 10},
 {'name': 'Homer', 'age': 39}]
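If one of the fields should be descending (as in ORDER BY age DESC, name), one possible sketch is to negate the numeric field inside the tuple key. This trick is an addition for illustration and only applies to numeric values.

```python
list_to_be_sorted = [
    {'name': 'Homer', 'age': 39},
    {'name': 'Milhouse', 'age': 10},
    {'name': 'Bart', 'age': 10},
]

# Negating the numeric field reverses its order; 'name' stays ascending.
newlist = sorted(list_to_be_sorted, key=lambda k: (-k['age'], k['name']))
print([(d['age'], d['name']) for d in newlist])
# [(39, 'Homer'), (10, 'Bart'), (10, 'Milhouse')]
```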

Sorting by multiple columns, with descending order on some of them: the cmps list is global to the comparison function and contains (field name, inv) pairs, where inv is -1 for descending and 1 for ascending.

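from functools import cmp_to_key

def cmp(x, y):
    # Python 3 removed the built-in cmp(); this is the usual replacement
    return (x > y) - (x < y)

def cmpfun(a, b):
    # cmps is read from the enclosing (global) scope
    for (name, inv) in cmps:
        res = cmp(a[name], b[name])
        if res != 0:
            return res * inv
    return 0

data = [
    dict(name='alice', age=10),
    dict(name='baruch', age=9),
    dict(name='alice', age=11),
]

all_cmps = [
    [('name', 1), ('age', -1)],
    [('name', 1), ('age', 1)],
    [('name', -1), ('age', 1)],
]

print('data:', data)
for cmps in all_cmps:
    print('sort:', cmps)
    print(sorted(data, key=cmp_to_key(cmpfun)))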
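An alternative that avoids a comparison function relies on the stability of Python's sort: sort once per field, starting with the least significant one. The helper multisort and its (field, descending) pairs are assumptions introduced here for illustration, not part of the original answer.

```python
from operator import itemgetter

def multisort(rows, specs):
    # specs is a list of (field, descending) pairs, most significant first.
    # Sorting by the least significant field first works because the sort
    # is stable: earlier passes survive as tie-breakers.
    result = list(rows)
    for field, descending in reversed(specs):
        result.sort(key=itemgetter(field), reverse=descending)
    return result

data = [
    dict(name='alice', age=10),
    dict(name='baruch', age=9),
    dict(name='alice', age=11),
]

# name ascending, then age descending
print(multisort(data, [('name', False), ('age', True)]))
# [{'name': 'alice', 'age': 11}, {'name': 'alice', 'age': 10}, {'name': 'baruch', 'age': 9}]
```

Unlike negating a value, this also works for fields that are not numeric, at the cost of one sorting pass per field.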

You can use the following code:

sorted_dct = sorted(dct_name.items(), key = lambda x : x[1])
