Sort multiple dictionaries identically, based on a specific order defined by a list
I had a special case where multiple existing dictionaries had to be sorted based on the exact order of items in a list (not alphabetically). For example, the dictionaries were:
dict_one = {"LastName": "Bar", "FirstName": "Foo", "Address": "Example Street 101", "Phone": "012345678"}
dict_two = {"Phone": "001122334455", "LastName": "Spammer", "FirstName": "Egg", "Address": "SSStreet 123"}
dict_three = {"Address": "Run Down Street 66", "Phone": "0987654321", "LastName": "Biker", "FirstName": "Random"}
And the list was:
data_order = ["FirstName", "LastName", "Phone", "Address"]
With the expected result being the ability to create a file like this:
FirstName;LastName;Phone;Address
Foo;Bar;012345678;Example Street 101
Egg;Spammer;001122334455;SSStreet 123
Random;Biker;0987654321;Run Down Street 66
Note: In my case, the real use was an Excel file written with pyexcel-xls, but the CSV-like example above is probably closer to what is usually done, so the answers may apply more generally to CSV than to Excel.
I had a hard time finding any good answers on Stack Overflow for this case, but eventually I got the sorting working, which I could then use to create the file. The header row can be taken directly from the data_order list below. Here's how I did it; hope it helps someone:
from collections import OrderedDict
import pprint
dict_one = {
"LastName": "Bar",
"FirstName": "Foo",
"Address": "Example Street 101",
"Phone": "012345678"}
dict_two = {
"Phone": "001122334455",
"LastName": "Spammer",
"FirstName": "Egg",
"Address": "SSStreet 123"}
dict_three = {
"Address": "Run Down Street 66",
"Phone": "0987654321",
"LastName": "Biker",
"FirstName": "Random"}
dict_list = []
dict_list.append(dict_one)
dict_list.append(dict_two)
dict_list.append(dict_three)
data_order = ["FirstName", "LastName", "Phone", "Address"]
result = []
for dictionary in dict_list:
    result_dict = OrderedDict()
    # Go through the data_order in order
    for key in data_order:
        # Populate result_dict in the list order
        result_dict[key] = dictionary[key]
    result.append(result_dict)
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(result)
"""
[ { 'FirstName': 'Foo',
'LastName': 'Bar',
'Phone': '012345678',
'Address': 'Example Street 101'},
{ 'FirstName': 'Egg',
'LastName': 'Spammer',
'Phone': '001122334455',
'Address': 'SSStreet 123'},
{ 'FirstName': 'Random',
'LastName': 'Biker',
'Phone': '0987654321',
'Address': 'Run Down Street 66'}]
"""
This can be achieved in a one-liner, although it is harder to read. In case it is useful for someone:
print([OrderedDict([(key, d[key]) for key in data_order]) for d in [dict_one, dict_two, dict_three]])
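As a rough sketch of how the reordered dictionaries could become the lines of the expected file: the version below assumes Python 3.7 or later, where plain dicts preserve insertion order, so a dict comprehension stands in for OrderedDict. The names ordered and lines are introduced here for illustration only, and the simple join does no quoting, so it is only safe when no value contains a semicolon.

```python
data_order = ["FirstName", "LastName", "Phone", "Address"]
dict_list = [
    {"LastName": "Bar", "FirstName": "Foo", "Address": "Example Street 101", "Phone": "012345678"},
    {"Phone": "001122334455", "LastName": "Spammer", "FirstName": "Egg", "Address": "SSStreet 123"},
    {"Address": "Run Down Street 66", "Phone": "0987654321", "LastName": "Biker", "FirstName": "Random"},
]

# Assumption: Python 3.7+, where a plain dict keeps insertion order.
# Each new dict is populated in data_order, so its keys follow the list.
ordered = [{key: d[key] for key in data_order} for d in dict_list]

# The header comes straight from data_order; each row is the values in order.
# No quoting is done here, so values must not contain ';'.
lines = [";".join(data_order)]
lines += [";".join(row.values()) for row in ordered]

print("\n".join(lines))
```

If a value might contain the delimiter or a newline, the csv module is the safer choice because it quotes such fields.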
This is a classic use case for csv.DictWriter, because your expected output is CSV-like (semicolon delimiters instead of commas are supported). It handles all of this for you, avoids the need for a workaround involving OrderedDict, and makes it easy to read the data back in without worrying about corner cases (csv automatically quotes fields when necessary on write, and parses quoted fields on read):
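import csv

# newline='' lets the csv module control line endings itself
with open('outputfile.txt', 'w', newline='') as f:
    csvout = csv.DictWriter(f, data_order, delimiter=';')
    # Write the header
    csvout.writeheader()
    # Each row is written in data_order, regardless of the dict's own key order
    csvout.writerow(dict_one)
    csvout.writerow(dict_two)
    csvout.writerow(dict_three)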
That's it. csv handles the ordering (it knows the correct order from data_order, passed as fieldnames to the DictWriter constructor), the formatting, and so on.
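To illustrate the quoting and read-back behaviour mentioned above, here is a small sketch that writes to an in-memory io.StringIO buffer instead of a real file and then parses the result with csv.DictReader. The second record is hypothetical: its address was altered to contain a semicolon so the quoting becomes visible.

```python
import csv
import io

data_order = ["FirstName", "LastName", "Phone", "Address"]
records = [
    {"LastName": "Bar", "FirstName": "Foo", "Address": "Example Street 101", "Phone": "012345678"},
    # Hypothetical record: the address contains the delimiter on purpose
    {"Phone": "001122334455", "LastName": "Spammer", "FirstName": "Egg", "Address": "SSStreet 123; Apt 4"},
]

# In-memory stand-in for a file opened with newline=''
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=data_order, delimiter=";")
writer.writeheader()
writer.writerows(records)

written = buffer.getvalue()
print(written)

# Read the same data back; the quoted field is restored to its original value
buffer.seek(0)
reader = csv.DictReader(buffer, delimiter=";")
read_back = [dict(row) for row in reader]
print(read_back[1]["Address"])
```

The field containing a semicolon is wrapped in double quotes on output, and DictReader strips those quotes again, so the round trip returns the original values.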
If you needed to pull the values in a specific order from many dicts without writing them (since your use case doesn't even use the keys), operator.itemgetter can simplify this dramatically:
from operator import itemgetter
getfields = itemgetter(*data_order)
dict_one_fields = getfields(dict_one)
which leaves dict_one_fields as a tuple with the requested fields in the requested order, ('Foo', 'Bar', '012345678', 'Example Street 101'). It can also run significantly faster than repeatedly indexing at the Python layer: itemgetter creates a C-level "functor" that retrieves all the requested values in a single call, with no Python-level bytecode executed at all for built-in key types like str.
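As a sketch of how the same getter might be applied to all three dictionaries and combined with a plain csv.writer, the example below again uses an in-memory buffer rather than a real file. The names rows and out are illustrative only.

```python
import csv
import io
from operator import itemgetter

dict_one = {"LastName": "Bar", "FirstName": "Foo", "Address": "Example Street 101", "Phone": "012345678"}
dict_two = {"Phone": "001122334455", "LastName": "Spammer", "FirstName": "Egg", "Address": "SSStreet 123"}
dict_three = {"Address": "Run Down Street 66", "Phone": "0987654321", "LastName": "Biker", "FirstName": "Random"}
data_order = ["FirstName", "LastName", "Phone", "Address"]

# Build the getter once and reuse it for every dict
getfields = itemgetter(*data_order)
rows = [getfields(d) for d in (dict_one, dict_two, dict_three)]

# The rows are already ordered tuples, so a plain csv.writer is enough
out = io.StringIO(newline="")
writer = csv.writer(out, delimiter=";")
writer.writerow(data_order)
writer.writerows(rows)

print(out.getvalue())
```

One caveat: when itemgetter is given a single key it returns the bare value rather than a one-element tuple, so this pattern relies on data_order containing at least two keys.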