Efficiently reduce a list of tuples to a dictionary of lists in Python?

Question

I am looking for a more efficient way (in terms of code complexity, speed, and memory usage, possibly using comprehensions or generators) to reduce a list of two-element tuples, where the first element may repeat across tuples, to a dictionary of lists.

from copy import deepcopy
a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c','horse'), ('d', 'cow')]

b = {x[0]: list() for x in a}

c = deepcopy(b)
for key, value in b.items():
    for item in a:
        if key == item[0]:
            c[key].append(item[1])
print(a)
print(c)

[('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c', 'horse'), ('d', 'cow')]

{'a': ['cat', 'dog'], 'b': ['pony'], 'c': ['hippo', 'horse'], 'd': ['cow']}

Answer Testing

from collections import defaultdict
from itertools import groupby
from operator import itemgetter
import timeit

timings = dict()

def wrap(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c','horse'), ('d', 'cow')]

# yatu's solution
def yatu(x):
    output = defaultdict(list)
    for item in x:
        output[item[0]].append(item[1])
    return output

# roseman's solution
def roseman(x):
    d = defaultdict(list)
    for key, value in x:  # iterate over the argument, not the global `a`
        d[key].append(value)
    return d

# prem's solution
def prem(a):
    result = {k: [v for _,v in grp] for k,grp in groupby(a, itemgetter(0))}
    return result

# timings
yatus_wrapped = wrap(yatu, a)
rosemans_wrapped = wrap(roseman, a)
prems_wrapped = wrap(prem, a)
timings['yatus'] = timeit.timeit(yatus_wrapped, number=100000)
timings['rosemans'] = timeit.timeit(rosemans_wrapped, number=100000)
timings['prems'] = timeit.timeit(prems_wrapped, number=100000)

# output results
print(timings)

{'yatus': 0.171220442, 'rosemans': 0.153767728, 'prems': 0.22808025399999993}

Roseman's solution is marginally the fastest, thank you.
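Since the gap between the two defaultdict versions is small, a single timeit run may not separate them reliably. Here is a minimal sketch of a more repeatable measurement, assuming timeit.repeat with the minimum of several runs is acceptable; the helper name best_time and the run counts are illustrative, not part of the original test.

```python
import timeit
from collections import defaultdict
from functools import partial

a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c', 'horse'), ('d', 'cow')]

def group_defaultdict(pairs):
    # Single pass: append each value to the list stored under its key.
    d = defaultdict(list)
    for key, value in pairs:
        d[key].append(value)
    return d

def best_time(func, data, number=10000, repeat=5):
    # Hypothetical helper: run `repeat` batches of `number` calls and keep
    # the fastest batch, which is the one least disturbed by other processes.
    return min(timeit.repeat(partial(func, data), number=number, repeat=repeat))

best = best_time(group_defaultdict, a)
print(f"best of 5 batches: {best:.4f} s")
```

The absolute numbers depend on the machine and Python version, so only comparisons made in the same session are meaningful.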

Answer 1

This can be done with a single loop using a defaultdict:

from collections import defaultdict
d = defaultdict(list)
for key, value in a:
    d[key].append(value)
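As a self-contained sketch of the loop above, the version below wraps it in a function and converts the result to a plain dict before returning it. The function name group_pairs and the final dict() conversion are additions for illustration; the conversion only matters if later lookups of missing keys should raise KeyError instead of silently creating empty lists.

```python
from collections import defaultdict

def group_pairs(pairs):
    """Group (key, value) pairs into a plain dict mapping key -> list of values."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Convert so that missing-key lookups raise KeyError again.
    return dict(grouped)

a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c', 'horse'), ('d', 'cow')]
result = group_pairs(a)
print(result)
```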

Answer 2

You could use defaultdict:

from collections import defaultdict
a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c','horse'), ('d', 'cow')]

output = defaultdict(list)

for item in a:
    output[item[0]].append(item[1])

This approach needs less space (only a and output) and has a better runtime: it is linear in the length of a, since it iterates over a once and appends each element to the output dictionary, and dictionary inserts take constant time on average.
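The same single-pass idea also works without any import by using dict.setdefault on a plain dictionary. This is a different technique from the defaultdict code above, shown only as an assumed alternative; the function name group_with_setdefault is made up for the example.

```python
def group_with_setdefault(pairs):
    """Single pass over pairs; each key maps to the list of its values."""
    grouped = {}
    for key, value in pairs:
        # setdefault returns the existing list, or stores and returns a new one.
        grouped.setdefault(key, []).append(value)
    return grouped

a = [('a', 'cat'), ('a', 'dog'), ('b', 'pony'), ('c', 'hippo'), ('c', 'horse'), ('d', 'cow')]
output = group_with_setdefault(a)
print(output)
```

Note that setdefault builds a fresh empty list on every call, even when the key already exists, so it may be slightly slower than defaultdict on large inputs.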

Answer 3

You can use itertools.groupby to group the items first and then merge them as you prefer.

>>> from itertools import groupby
>>> from operator import itemgetter
>>> {k: [v for _,v in grp] for k,grp in groupby(a, itemgetter(0))}
{'a': ['cat', 'dog'], 'b': ['pony'], 'c': ['hippo', 'horse'], 'd': ['cow']}

Sort the input by key first if it won't always be in sorted order, because groupby only groups consecutive items that share a key.
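To illustrate why the ordering matters, the sketch below runs the same comprehension on an unsorted list, once directly and once after sorting by key. The sample list unsorted and the function name group_sorted are made up for the example.

```python
from itertools import groupby
from operator import itemgetter

first = itemgetter(0)
unsorted = [('c', 'hippo'), ('a', 'cat'), ('c', 'horse'), ('a', 'dog')]

# Without sorting, each key shows up in several groups and the dict
# comprehension keeps only the last group seen for that key.
naive = {k: [v for _, v in grp] for k, grp in groupby(unsorted, first)}

def group_sorted(pairs):
    # sorted() is stable, so values keep their original relative order per key.
    ordered = sorted(pairs, key=first)
    return {k: [v for _, v in grp] for k, grp in groupby(ordered, first)}

grouped = group_sorted(unsorted)
print(naive)    # values are lost
print(grouped)  # all values are kept
```

Sorting adds an O(n log n) step, which is one reason the single-pass dictionary approaches may be preferable when the input is not already ordered.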

3 answers

Answer 1: score 1, accepted, 2019-08-28 15:58:50

Answer 2: score 0, 2019-08-28 15:58:19

Answer 3: score -1, 2019-08-28 16:16:45
