Convert a list of dictionaries with varying keys to a dataframe

How do I convert a list of dictionaries into a dataframe whose columns are 'Event', 'Id', 'Name'?

sample = [{'event': 'up', '53118': 'Harry'},
          {'event': 'up', '51880': 'Smith'},
          {'event': 'down', '51659': 'Joe'},
          {'52983': 'Sam', 'event': 'up'},
          {'event': 'down', '52917': 'Roger'},
          {'event': 'up', '314615': 'Julie'},
          {'event': 'left', '276298': 'Andrew'},
          {'event': 'right', '457249': 'Carlos'},
          {'event': 'down', '391485': 'Jason'},
          {'event': 'right', '53191': 'Taylor'},
          {'51248': 'Benjy', 'event': 'down'}]

pd.DataFrame(sample) would return a wide frame: one column per distinct key, with NaN wherever a dictionary lacks that key.
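For illustration, here is a minimal sketch of that wide result on a shortened, three-row version of the data (the shortening is only to keep the example small):

import pandas as pd

# First three dictionaries from the question's sample
sample = [{'event': 'up', '53118': 'Harry'},
          {'event': 'up', '51880': 'Smith'},
          {'event': 'down', '51659': 'Joe'}]

df = pd.DataFrame(sample)

# 'event' plus one column per distinct Id key
print(df.shape)                 # (3, 4)
# Each Id column holds a single name; every other cell is NaN
print(df.isna().sum().sum())    # 6

With the full eleven-row sample the same pattern gives eleven Id columns, almost entirely NaN.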

Is there a Pythonic, pandas-idiomatic way to convert it to this form?

Event   Id      Name
up     53118    Harry
up     51880    Smith
down   51659    Joe

pd.melt can get you most of the way, starting from your df = pd.DataFrame(sample):

In [74]: m = pd.melt(df, id_vars="event", var_name="Id", value_name="Name").dropna()

In [75]: m
Out[75]: 
     event      Id    Name
6     left  276298  Andrew
16      up  314615   Julie
30    down  391485   Jason
40   right  457249  Carlos
54    down   51248   Benjy
57    down   51659     Joe
67      up   51880   Smith
81    down   52917   Roger
91      up   52983     Sam
99      up   53118   Harry
119  right   53191  Taylor

And then you can do some cleanup (reset_index(drop=True), rename(columns={"event": "Event"}), convert Id to integers, etc.).
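Those cleanup steps might look roughly like this. It is a sketch on a shortened sample; the name tidy is just an illustrative choice, and the row order after melt depends on the column order of df, which can differ between pandas versions:

import pandas as pd

sample = [{'event': 'up', '53118': 'Harry'},
          {'event': 'up', '51880': 'Smith'},
          {'event': 'down', '51659': 'Joe'}]

df = pd.DataFrame(sample)
m = pd.melt(df, id_vars="event", var_name="Id", value_name="Name").dropna()

tidy = (
    m.rename(columns={"event": "Event"})  # capitalise the column name
     .astype({"Id": int})                 # Id values start out as strings
     .reset_index(drop=True)              # discard the gappy index left by dropna
)
print(tidy)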

Since @eumiro makes a good point, we could also implement @MattDMo's suggestion easily enough:

In [90]: sample = [dict(event=d.pop("event"), id=min(d), name=min(d.values())) for d in sample]

In [91]: pd.DataFrame(sample)
Out[91]: 
    event      id    name
0      up   53118   Harry
1      up   51880   Smith
2    down   51659     Joe
3      up   52983     Sam
4    down   52917   Roger
5      up  314615   Julie
6    left  276298  Andrew
7   right  457249  Carlos
8    down  391485   Jason
9   right   53191  Taylor
10   down   51248   Benjy

Here I've taken advantage of the fact that once we pop event there's only one element left in the dictionary, but a more manual loop would work just as easily.
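One way such a manual loop could look is sketched below. Unlike the pop-based one-liner, it leaves the original dictionaries unmodified. The helper name to_records is hypothetical, and the sketch assumes every dictionary has an 'event' key and that every other key is an Id:

import pandas as pd

sample = [{'event': 'up', '53118': 'Harry'},
          {'52983': 'Sam', 'event': 'up'},
          {'event': 'down', '51659': 'Joe'}]

def to_records(dicts):
    """Build uniform Event/Id/Name records without modifying the inputs."""
    records = []
    for d in dicts:
        for key, value in d.items():
            if key != "event":  # assumption: any key other than 'event' is an Id
                records.append({"Event": d["event"], "Id": int(key), "Name": value})
    return records

df = pd.DataFrame(to_records(sample))
print(df)

Because the records are built in input order, the rows come out in the same order as the original list, and the key order inside each dictionary does not matter.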

You need to adjust your dicts, so that instead of having:

{'event': 'up', '53118': 'Harry'}

you have:

{'event': 'up', 'id': '53118', 'name': 'Harry'}

resulting in:

In [23]: df = pd.DataFrame(sample)

In [24]: df
Out[24]: 
    event      id    name
0      up   53118   Harry
1      up   51880   Smith
2    down   51659     Joe
3      up   52983     Sam
4    down   52917   Roger
5      up  314615   Julie
6    left  276298  Andrew
7   right  457249  Carlos
8    down  391485   Jason
9   right   53191  Taylor
10   down   51248   Benjy
