Mapping dicts based on key

Given a list of dicts:

input = [
  {'key': k1, 'value': v1},
  {'key': k1, 'value': v2},
  {'key': k2, 'value': v3}
]

What is the easiest way to map these to the following output?

output == {k1: (v1, v2), k2: (v3,)}

I don't really care about the order of the values. The best I've come up with is:

output = dict()
for i in input:
    temp = output.get(i['key'], [])
    temp.append(i['value'])
    output[i['key']] = temp

Is there any slick way of doing this with a dict comprehension? I'm assuming the same process would work for a list of objects with attributes as well.

You can use collections.defaultdict: loop over your dictionaries and add each value to the tuple stored under its key:

>>> from collections import defaultdict
>>> a = defaultdict(tuple)
>>> for d in input:
...     a[d['key']] += (d['value'],)
... 
>>> a
defaultdict(<type 'tuple'>, {'k2': ('v3',), 'k1': ('v1', 'v2')})
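One caveat with the tuple version: each `+=` builds a new tuple, so a key with many values costs time quadratic in the size of its group. Below is a minimal sketch of a variant that accumulates into lists and converts to tuples at the end. The `records` sample data and the name `grouped` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data in the same shape as the question's input.
records = [
    {'key': 'k1', 'value': 'v1'},
    {'key': 'k1', 'value': 'v2'},
    {'key': 'k2', 'value': 'v3'},
]

grouped = defaultdict(list)
for d in records:
    # list.append is amortized O(1), unlike tuple concatenation.
    grouped[d['key']].append(d['value'])

# Freeze each list into a tuple and return to a plain dict.
output = {k: tuple(v) for k, v in grouped.items()}
print(output)
```

If lists are acceptable as values, the final comprehension can be dropped and `dict(grouped)` used instead.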

In a dict comprehension, each key's value has to be produced in a single expression; you can't go back and extend a value that has already been stored. To pair multiple values with a single key, then, the values need to be grouped in advance. A naive grouping solution will have, at best, quadratic performance. In fact, I can't come up with a one-liner that's better than cubic; it's an ugly beast, not even worth posting.
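For illustration, here is one way a naive grouping comprehension might look. This is a sketch, not necessarily the one-liner the answer had in mind, and the `records` sample data is made up. It rescans the whole list once per distinct key, so it becomes quadratic when most keys are distinct.

```python
# Hypothetical sample data in the same shape as the question's input.
records = [
    {'key': 'a', 'value': 1},
    {'key': 'a', 'value': 2},
    {'key': 'b', 'value': 3},
]

# For every distinct key, scan the full list again to collect its values.
output = {
    k: tuple(d['value'] for d in records if d['key'] == k)
    for k in {d['key'] for d in records}
}
print(output)
```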

So an approach based on defaultdict will almost always be best.

However, if your data is guaranteed to be sorted by key, or if you're willing to accept O(n log n) performance for sorting it first, then you could use itertools.groupby:

>>> import itertools
>>> input
[{'value': 1, 'key': 'a'}, {'value': 2, 'key': 'a'}, {'value': 3, 'key': 'b'}]
>>> {k:tuple(d['value'] for d in v) for k, v in
...  itertools.groupby(input, key=lambda d: d['key'])}
{'a': (1, 2), 'b': (3,)}

To get rid of the unsightly lambda, you could use operator.itemgetter:

>>> import operator
>>> {k:tuple(d['value'] for d in v) for k, v in
...  itertools.groupby(input, key=operator.itemgetter('key'))}
{'a': (1, 2), 'b': (3,)}

Or, if you have to sort first:

>>> {k:tuple(d['value'] for d in v) for k, v in itertools.groupby(
...  sorted(input, key=operator.itemgetter('key')),
...  key=operator.itemgetter('key'))}
{'a': (1, 2), 'b': (3,)}
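The sort matters because groupby only groups consecutive items with equal keys. The sketch below, using made-up unsorted `records`, shows what happens inside a dict comprehension when equal keys are not adjacent: the later group silently overwrites the earlier one.

```python
import itertools
import operator

# Hypothetical unsorted data: the two 'a' records are not adjacent.
records = [
    {'key': 'a', 'value': 1},
    {'key': 'b', 'value': 3},
    {'key': 'a', 'value': 2},
]
by_key = operator.itemgetter('key')

# Without sorting, groupby yields 'a' twice and the second group replaces the first.
unsorted_result = {k: tuple(d['value'] for d in g)
                   for k, g in itertools.groupby(records, key=by_key)}

# Sorting first makes equal keys adjacent; sorted() is stable,
# so values keep their original relative order within each group.
sorted_result = {k: tuple(d['value'] for d in g)
                 for k, g in itertools.groupby(sorted(records, key=by_key), key=by_key)}

print(unsorted_result)  # {'a': (2,), 'b': (3,)}
print(sorted_result)    # {'a': (1, 2), 'b': (3,)}
```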

None of these solutions is very attractive; they all look a bit like abuses of comprehension syntax, with the possible exception of the second.

As an alternative to importing from collections, you can use setdefault, though this produces lists rather than tuples:

>>> output = {}
>>> for d in input:
...     output.setdefault(d['key'], []).append(d['value'])
... 
>>> output
{'a': [1, 2], 'b': [3]}

Finally, consider this alternative. I can't tell how I feel about it, but it does avoid all imports and exotic features, and it produces tuples:

>>> output = {d['key']:() for d in input}
>>> for d in input:
...     output[d['key']] += (d['value'],)
... 
>>> output
{'a': (1, 2), 'b': (3,)}
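As for the question's last point, the same grouping should carry over to objects with attributes if the key and value lookups are passed in as callables. A minimal sketch, assuming a hypothetical helper `group_values` and a hypothetical `Record` type (neither appears in the original posts):

```python
from collections import defaultdict, namedtuple
from operator import attrgetter, itemgetter

def group_values(items, get_key, get_value):
    """Group get_value(item) under get_key(item), returning tuples."""
    grouped = defaultdict(list)
    for item in items:
        grouped[get_key(item)].append(get_value(item))
    return {k: tuple(v) for k, v in grouped.items()}

# Hypothetical record type standing in for "objects with attributes".
Record = namedtuple('Record', ['key', 'value'])
objects = [Record('a', 1), Record('a', 2), Record('b', 3)]
dicts = [{'key': 'a', 'value': 1}, {'key': 'a', 'value': 2}, {'key': 'b', 'value': 3}]

# attrgetter reads attributes; itemgetter reads dict items.
from_objects = group_values(objects, attrgetter('key'), attrgetter('value'))
from_dicts = group_values(dicts, itemgetter('key'), itemgetter('value'))
print(from_objects)
print(from_dicts)
```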
