Flattening a list of dictionaries

I have a list of dictionaries, which may or may not share keys, and I want to flatten the list into a single dictionary whose values are lists.

An example is as follows:

data =  [{'category': u'Non-profit organization', 'categories': [u'Theater',
      u'Bar', u'Concert Venue']}, {'category': u'Non-profit organization', 
      'categories': [u'Business Services', u'College & University']}]

This should become the following:

result = {'category': [u'Non-profit organization', u'Non-profit organization'],
      'categories': [u'Theater', u'Bar', u'Concert Venue',
      u'Business Services', u'College & University']}
print(result)

As you can see, any string value in the initial data should be added as an item in the list for its key. Any value that is already a list in the initial dictionaries should also be added under its key, but merged so that the result is one flat list.

Clearly one solution is to loop through everything with a for loop and append the values, but I am looking for something simpler.

Use a defaultdict(list):

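from collections import defaultdict

res = defaultdict(list)  # missing keys start out as an empty list
for dic in data:
    for key, value in dic.items():
        old_value = res[key]
        if isinstance(value, list):
            old_value.extend(value)   # merge list values element by element
        else:
            old_value.append(value)   # add a non-list value as a single item

        # Alternatively, replace the if/else above with this one line:
        # old_value += [value] if not isinstance(value, list) else value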

The reasoning is: in the end you want every value to be a list. The difference is that values that were originally lists should be joined together (list.extend does that), while other values should be added to the list as single items (as list.append does).
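To make the difference concrete, here is a small sketch contrasting the two methods on a list value. The variable names are made up for illustration only.

```python
# Contrast list.append and list.extend when the new value is itself a list.
appended = ['Theater']
appended.append(['Bar', 'Concert Venue'])   # nests the whole list as one item

extended = ['Theater']
extended.extend(['Bar', 'Concert Venue'])   # adds each element individually

print(appended)  # ['Theater', ['Bar', 'Concert Venue']]
print(extended)  # ['Theater', 'Bar', 'Concert Venue']
```

This is why the loop needs the isinstance check: append on a list value would produce a nested list rather than a flat one.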


Also, there is no built-in function or class in the collections module that does this automatically, so I believe the above is probably "optimal" as far as code size, readability and efficiency are concerned.
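If it helps, the same loop can be wrapped in a small function and run on the data from the question. This is only a sketch: flatten_dicts is a hypothetical name, not something from the standard library, and the u'' prefixes are dropped because they are optional in Python 3.

```python
from collections import defaultdict

def flatten_dicts(dicts):
    """Merge a list of dicts into one dict whose values are flat lists.

    flatten_dicts is a hypothetical helper name, not a library function.
    """
    res = defaultdict(list)
    for dic in dicts:
        for key, value in dic.items():
            if isinstance(value, list):
                res[key].extend(value)   # join list values together
            else:
                res[key].append(value)   # add a single value as one item
    return dict(res)  # plain dict, so missing keys raise KeyError again

data = [
    {'category': 'Non-profit organization',
     'categories': ['Theater', 'Bar', 'Concert Venue']},
    {'category': 'Non-profit organization',
     'categories': ['Business Services', 'College & University']},
]

result = flatten_dicts(data)
print(result['category'])
print(result['categories'])
```

Converting back to a plain dict at the end is a design choice, not a requirement. It stops later lookups of missing keys from silently creating empty lists.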

In addition to the previous answer (sorry, I don't have enough reputation to write a comment): you need to check whether the value is already in res, otherwise you will get repeated values:

{'category': [u'Non-profit organization', u'Non-profit organization'], 'categories': [u'Theater', u'Bar', u'Concert Venue', u'Business Services', u'College & University']}

Here u'Non-profit organization' appears twice.
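A minimal sketch of that check follows, assuming you really do want unique values. Note that the expected output in the question keeps the duplicate, so this changes the result. flatten_unique is a hypothetical name used only for illustration.

```python
from collections import defaultdict

def flatten_unique(dicts):
    """Like the defaultdict loop, but skip values already stored for a key.

    flatten_unique is a hypothetical helper name, not a library function.
    """
    res = defaultdict(list)
    for dic in dicts:
        for key, value in dic.items():
            # Treat a non-list value as a one-item list so both cases share a path.
            items = value if isinstance(value, list) else [value]
            for item in items:
                if item not in res[key]:   # linear scan; fine for short lists
                    res[key].append(item)
    return dict(res)

data = [
    {'category': 'Non-profit organization',
     'categories': ['Theater', 'Bar', 'Concert Venue']},
    {'category': 'Non-profit organization',
     'categories': ['Business Services', 'College & University']},
]

unique = flatten_unique(data)
print(unique['category'])
```

The membership test keeps first-seen order but scans the list each time. For long lists, tracking a set of seen values per key would probably be faster.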
