Python: Grouping list items in a dict

Question

I want to build a dict from a list of dicts, grouping the list items by the value of a given key, for example:

input_list = [
        {'a':'tata', 'b': 'foo'},
        {'a':'pipo', 'b': 'titi'},
        {'a':'pipo', 'b': 'toto'},
        {'a':'tata', 'b': 'bar'}
]
output_dict = {
        'pipo': [
             {'a': 'pipo', 'b': 'titi'}, 
             {'a': 'pipo', 'b': 'toto'}
         ],
         'tata': [
             {'a': 'tata', 'b': 'foo'},
             {'a': 'tata', 'b': 'bar'}
         ]
}

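So far I have found two ways of doing this. The first simply iterates over the list, creates a sublist in the dict for each key value, and appends the elements matching that key to the sublist: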

l = [ 
    {'a':'tata', 'b': 'foo'},
    {'a':'pipo', 'b': 'titi'},
    {'a':'pipo', 'b': 'toto'},
    {'a':'tata', 'b': 'bar'}
    ]

res = {}

for e in l:
    res[e['a']] = res.get(e['a'], []) 
    res[e['a']].append(e)

The other uses itertools.groupby:

import itertools
from operator import itemgetter

l = [ 
        {'a':'tata', 'b': 'foo'},
        {'a':'pipo', 'b': 'titi'},
        {'a':'pipo', 'b': 'toto'},
        {'a':'tata', 'b': 'bar'}
]

l = sorted(l, key=itemgetter('a'))
res = dict((k, list(g)) for k, g in itertools.groupby(l, key=itemgetter('a')))

I would like to know which alternative is the most efficient.

Is there a more Pythonic, more concise, or otherwise better way to achieve this?

Answer 1

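Is it correct that you want to group your input list by the value of the 'a' key of the list elements? If so, your first approach is the best one. As a small improvement, use dict.setdefault: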

res = {}
for item in l:
    res.setdefault(item['a'], []).append(item)
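
For comparison, a closely related variant that this answer does not mention is collections.defaultdict, which creates the empty list the first time a key is accessed. The following is only a minimal sketch using the sample data from the question; the helper name group_by_key is made up for illustration.

```python
from collections import defaultdict

def group_by_key(items, key):
    """Group dicts by the value stored under `key` (hypothetical helper)."""
    groups = defaultdict(list)
    for item in items:
        # A missing group is created as an empty list on first access.
        groups[item[key]].append(item)
    return dict(groups)  # convert back to a plain dict

input_list = [
    {'a': 'tata', 'b': 'foo'},
    {'a': 'pipo', 'b': 'titi'},
    {'a': 'pipo', 'b': 'toto'},
    {'a': 'tata', 'b': 'bar'},
]
res = group_by_key(input_list, 'a')
```

Like the setdefault loop, this makes a single pass over the list and keeps the items of each group in their original order.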

Answer 2

A one-liner:

>>> import itertools
>>> input_list = [
...         {'a':'tata', 'b': 'foo'},
...         {'a':'pipo', 'b': 'titi'},
...         {'a':'pipo', 'b': 'toto'},
...         {'a':'tata', 'b': 'bar'}
... ]
>>> {k:[v for v in input_list if v['a'] == k] for k, val in itertools.groupby(input_list,lambda x: x['a'])}
{'tata': [{'a': 'tata', 'b': 'foo'}, {'a': 'tata', 'b': 'bar'}], 'pipo': [{'a': 'pipo', 'b': 'titi'}, {'a': 'pipo', 'b': 'toto'}]}

Answer 3

If by efficient you mean "time-efficient", you can measure it with the built-in timeit module.

For example:

import timeit
import itertools
from operator import itemgetter

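# Named input_list rather than input, so the built-in input() is not shadowed.
input_list = [{'a': 'tata', 'b': 'foo'},
              {'a': 'pipo', 'b': 'titi'},
              {'a': 'pipo', 'b': 'toto'},
              {'a': 'tata', 'b': 'bar'}]

def solution1():
    # Plain loop: one pass over the list, one dict lookup per item.
    res = {}
    for e in input_list:
        res[e['a']] = res.get(e['a'], [])
        res[e['a']].append(e)
    return res

def solution2():
    # Sort first, then let groupby merge consecutive items with equal keys.
    l = sorted(input_list, key=itemgetter('a'))
    res = dict(
        (k, list(g)) for k, g in itertools.groupby(l, key=itemgetter('a'))
    )
    return res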

t = timeit.Timer(solution1)
print(t.timeit(10000))
# 0.0122511386871

t = timeit.Timer(solution2)
print(t.timeit(10000))
# 0.0366218090057

See the official timeit documentation for more information.
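
Four items are too few to show how the two approaches scale, so it may be worth repeating the measurement on a larger input. The sketch below is one way to do that; the data size, the number of distinct keys, and the function names are arbitrary choices made for illustration, and the timings will differ from machine to machine.

```python
import random
import timeit
from itertools import groupby
from operator import itemgetter

# Hypothetical larger input: 10,000 items spread over 100 distinct keys.
random.seed(0)
data = [{'a': 'key%d' % random.randrange(100), 'b': i} for i in range(10000)]

def with_setdefault():
    res = {}
    for e in data:
        res.setdefault(e['a'], []).append(e)
    return res

def with_groupby():
    ordered = sorted(data, key=itemgetter('a'))
    return {k: list(g) for k, g in groupby(ordered, key=itemgetter('a'))}

# Both approaches must produce the same grouping before timing them.
assert with_setdefault() == with_groupby()

for func in (with_setdefault, with_groupby):
    # repeat() returns one total per run; the minimum is the least noisy.
    best = min(timeit.repeat(func, number=20, repeat=3))
    print('%s: %.4f s for 20 calls' % (func.__name__, best))
```

Taking the minimum of several repeats reduces the influence of other processes running on the machine.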

Answer 4

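The best approach is the first one you mentioned, and you can make it more elegant by using setdefault, as bernhard mentioned above. This approach is O(n): we iterate over the input once, and for each item we do a lookup in the output dict being built to find the list to append to, which takes constant time (lookup + append). The overall complexity is therefore O(n), which is optimal.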

With itertools.groupby, the input must be sorted beforehand, which is O(n log n).
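
The need for sorting comes from the fact that itertools.groupby only merges consecutive items with equal keys. A small sketch on the sample data from the question shows what happens when the sort is skipped; the variable names are chosen for illustration.

```python
from itertools import groupby
from operator import itemgetter

input_list = [
    {'a': 'tata', 'b': 'foo'},
    {'a': 'pipo', 'b': 'titi'},
    {'a': 'pipo', 'b': 'toto'},
    {'a': 'tata', 'b': 'bar'},
]

# groupby only merges *consecutive* items with equal keys.
unsorted_keys = [k for k, _ in groupby(input_list, key=itemgetter('a'))]
# 'tata' shows up twice because its items are not adjacent.

# Building a dict from the unsorted groups silently drops data:
# the second 'tata' group overwrites the first one.
lossy = {k: list(g) for k, g in groupby(input_list, key=itemgetter('a'))}

# Sorting first makes equal keys adjacent, so each key appears once.
ordered = sorted(input_list, key=itemgetter('a'))
complete = {k: list(g) for k, g in groupby(ordered, key=itemgetter('a'))}
```

Because sorted is stable, the items inside each group of complete keep their original relative order.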

4 Solutions

Solution 1
14 Accepted 2015-06-26 11:25:51

Solution 2
5 2015-06-26 11:32:40

Solution 3
4 2015-06-26 18:11:23

Solution 4
1 2015-06-26 13:56:57
