Flattening list of dictionaries
I have a list of dictionaries, which may or may not share keys, and I want to flatten the list into a single dictionary whose values are all lists.
An example of this is as follows:
data = [{'category': u'Non-profit organization', 'categories': [u'Theater',
u'Bar', u'Concert Venue']}, {'category': u'Non-profit organization',
'categories': [u'Business Services', u'College & University']}]
This should become the following:
print result
result = {'category': [u'Non-profit organization', u'Non-profit organization'],
          'categories': [u'Theater', u'Bar', u'Concert Venue',
                         u'Business Services', u'College & University']}
As you can see, anything that is a string value in the initial data should be added as an element of a list. Anything that is already a list in the initial dictionaries should be added under its key too, but merged into one flattened list rather than nested.
Clearly one solution is to loop through it all with a for loop and append values, but I am looking for something simpler.
Use a defaultdict(list):
from collections import defaultdict

res = defaultdict(list)
for dic in data:
    for key, value in dic.items():
        old_value = res[key]
        if isinstance(value, list):
            old_value.extend(value)
        else:
            old_value.append(value)
        # alternatively, replace the if/else above with:
        # old_value += [value] if not isinstance(value, list) else value
The reasoning is: in the end you want all values to be lists. The difference is that values that were originally lists should be joined together (list.extend does that), while other values should be inserted into the list as single elements (list.append does that).
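The difference between the two calls can be seen in a small sketch. The variable names here are illustrative only and do not come from the answer above:

```python
# Illustrative only: contrast list.append with list.extend.
appended = ['Theater']
appended.append(['Bar', 'Concert Venue'])   # adds the list itself as one element

extended = ['Theater']
extended.extend(['Bar', 'Concert Venue'])   # adds each element individually

print(appended)  # ['Theater', ['Bar', 'Concert Venue']]
print(extended)  # ['Theater', 'Bar', 'Concert Venue']
```

Using append on a list value would therefore produce a nested list, which is why the list case needs extend.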
Also, there is no built-in method or class in the collections module that does this automatically, so I believe the above is probably "optimal" as far as code size, readability and efficiency are concerned.
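For reuse, the same loop could be wrapped in a function. This is a minimal sketch: the name flatten_dicts is hypothetical, the code is written for Python 3 (so the u'' prefixes are dropped), and converting the defaultdict back to a plain dict at the end is a design choice, not something the answer requires:

```python
from collections import defaultdict


def flatten_dicts(dicts):
    """Merge a list of dicts into one dict whose values are flat lists."""
    res = defaultdict(list)
    for dic in dicts:
        for key, value in dic.items():
            if isinstance(value, list):
                res[key].extend(value)   # join list values together
            else:
                res[key].append(value)   # add a non-list value as one element
    return dict(res)  # plain dict, so looking up a missing key raises KeyError


data = [
    {'category': 'Non-profit organization',
     'categories': ['Theater', 'Bar', 'Concert Venue']},
    {'category': 'Non-profit organization',
     'categories': ['Business Services', 'College & University']},
]

result = flatten_dicts(data)
print(result['category'])
print(result['categories'])
```

Returning a plain dict avoids the defaultdict side effect of silently creating an empty list whenever a missing key is read.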
In addition to the previous answer (sorry, I don't have enough reputation to write a comment): you need to check whether the value is already in res, otherwise you will get repeated values:
{'category': [u'Non-profit organization', u'Non-profit organization'], 'categories': [u'Theater', u'Bar', u'Concert Venue', u'Business Services', u'College & University']}
Here u'Non-profit organization' appears twice.
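If duplicates are unwanted, the membership check might look like the sketch below. Note that the expected result in the question keeps both copies, so this is only an optional variant. The function name flatten_dicts_unique is hypothetical, and the code assumes Python 3:

```python
from collections import defaultdict


def flatten_dicts_unique(dicts):
    """Like the defaultdict approach, but skips values already collected."""
    res = defaultdict(list)
    for dic in dicts:
        for key, value in dic.items():
            # Treat a non-list value as a one-element list.
            values = value if isinstance(value, list) else [value]
            for item in values:
                if item not in res[key]:   # linear scan; fine for short lists
                    res[key].append(item)
    return dict(res)


data = [
    {'category': 'Non-profit organization',
     'categories': ['Theater', 'Bar', 'Concert Venue']},
    {'category': 'Non-profit organization',
     'categories': ['Business Services', 'College & University']},
]

unique_result = flatten_dicts_unique(data)
print(unique_result['category'])  # ['Non-profit organization']
```

The check preserves first-seen order. For long lists, tracking seen values in a set per key would avoid the repeated linear scans, provided the values are hashable.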