Pandas: create a DataFrame from a list of dictionaries

Question

I have a dictionary whose keys are user IDs and whose values are lists of dictionaries. Take one key-value pair as an example:

my_dict['10020'] = [{'type': 'phone', 'count': 3},
                    {'type': 'id_card', 'count': 1},
                    {'type': 'email', 'count': 2}]

Now I would like to create a pandas DataFrame with one row per key-value pair, where the columns are the 'type' fields from the list of dictionaries above and the values are the corresponding 'count' fields, like this:

    ID       phone    id_card    email
    10020        3          1        2

I don't know in advance how many distinct 'type' values occur in the dictionary. Is there a convenient way to build this DataFrame without first traversing the whole dictionary to collect every 'type'?

Answer 1

Data input:

import pandas as pd

d = {'10020': [{'type': 'phone', 'count': 3},
               {'type': 'id_card', 'count': 1},
               {'type': 'email', 'count': 2}],
     '10021': [{'type': 'phone', 'count': 33},
               {'type': 'id_card', 'count': 11},
               {'type': 'email', 'count': 22}]}

Then use pd.concat:

# Build a one-row frame per user (types as columns, user ID as the row label), then stack them.
pd.concat([pd.DataFrame(y).set_index('type').rename(columns={'count': x}).T
           for x, y in d.items()])


Out[480]: 
type   phone  id_card  email
10020      3        1      2
10021     33       11     22
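As a point of comparison (not part of the original answer), the same table can be sketched without pd.concat by first flattening each user's list into a plain type-to-count dict. The names flat and df_flat are made up for this illustration, and the sketch assumes each 'type' appears at most once per user; a repeated type would silently keep only the last count.

```python
import pandas as pd

d = {
    "10020": [
        {"type": "phone", "count": 3},
        {"type": "id_card", "count": 1},
        {"type": "email", "count": 2},
    ],
    "10021": [
        {"type": "phone", "count": 33},
        {"type": "id_card", "count": 11},
        {"type": "email", "count": 22},
    ],
}

# Collapse each user's list of {'type', 'count'} records into one flat dict
# mapping type -> count, keyed by user ID (hypothetical helper name: flat).
flat = {
    user_id: {entry["type"]: entry["count"] for entry in entries}
    for user_id, entries in d.items()
}

# orient="index" turns each outer key into a row; the columns are the
# union of all types seen, so no separate pass to collect them is needed.
df_flat = pd.DataFrame.from_dict(flat, orient="index")
df_flat.index.name = "ID"
print(df_flat)
```

If some users lack a given type, the corresponding cells come out as NaN, just as they do with pd.concat.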

Answer 2

Consider some data d in which the set of types varies from user to user:

d = {
    "10021": [
        {
            "type": "fax",
            "count": 33
        },
        {
            "type": "email",
            "count": 22
        }
    ],
    "10020": [
        {
            "type": "phone",
            "count": 3
        },
        {
            "type": "id_card",
            "count": 1
        },
        {
            "type": "email",
            "count": 2
        }
    ]
}

Reshape your data into a list of records, one per user ID:

r = [{'id' : k, 'counts' : d[k]} for k in d]

Now use json_normalize + pivot:

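# pd.json_normalize is the top-level name since pandas 1.0 (formerly pd.io.json.json_normalize);
# pivot takes keyword arguments in recent pandas versions.
df = pd.json_normalize(r, 'counts', 'id').pivot(index='id', columns='type', values='count')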
df

type   email   fax  id_card  phone
id                                
10020    2.0   NaN      1.0    3.0
10021   22.0  33.0      NaN    NaN

This should work for any set of types in your data; a type that a given ID lacks shows up as NaN.
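If the NaN cells should read as zero counts instead, one possible follow-up step is to fill them and cast back to integers. This is only a sketch: treating a missing type as a count of 0 is an assumption about the data, not something the question states, and the name counts is made up for the example.

```python
import pandas as pd

d = {
    "10021": [
        {"type": "fax", "count": 33},
        {"type": "email", "count": 22},
    ],
    "10020": [
        {"type": "phone", "count": 3},
        {"type": "id_card", "count": 1},
        {"type": "email", "count": 2},
    ],
}

# Same reshaping as in the answer: one record per user ID.
r = [{"id": k, "counts": d[k]} for k in d]

# Long format (type, count, id), then pivot to one row per ID.
df = pd.json_normalize(r, "counts", "id").pivot(
    index="id", columns="type", values="count"
)

# Assumption: a type missing for a user means a count of 0.
# fillna removes the NaN values, which lets the columns go back to int.
counts = df.fillna(0).astype(int)
print(counts)
```

Skip this step if "type absent" and "count of zero" need to stay distinguishable.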

Answer 1
2 2017-11-25 04:07:11

Answer 2
1 (accepted) 2017-11-25 04:34:39
