
Create dataframe columns from a list of dictionaries stored in a column

I have a dataframe in which one column holds a list of dictionaries. Each dictionary contains the name of a column I want to create and its value.

    id  stats                                                            opta_id
0  1307  [{name: 'speed', value: 5},{name: 'strength', value: 10}....]   p176278
1  2410  [{name: 'vision', value: 5}, {name: 'strength', value: 10}....] p118335
2   200  [{name: 'speed', value: 5},{name: 'vision', value: 10}....]     p92187
3  3314  [{name: 'speed', value: 5},{name: 'strength', value: 10}....]   p154976
4  9223  [{name: 'speed', value: 5},{name: 'strength', value: 10}....]   p446990

The list can have up to 80 elements, and its length differs from row to row.

How could I flatten this column to get something similar to this?

    id  stats.speed   stats.strength   stats.vision     .....              opta_id
0  1307  5              10                nan           .....              p176278
1  2410  nan            10                5             .....              p118335
.
.
.

Thank you!

Here I would first build a temporary dataframe from a list of dicts created from the stats column, and then concat it with the remaining columns:

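import pandas as pd

# One dict per row (stat name -> value); stats missing from a row become NaN
tmp = pd.DataFrame([{d['name']: d['value'] for d in row}
                    for row in df['stats']]).rename(
                        columns=lambda x: 'stats.' + x)

# Put the new stats.* columns between id and opta_id
df = pd.concat([df['id'], tmp, df['opta_id']], axis=1)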

With the shown data, it gives:

     id  stats.speed  stats.strength  stats.vision  opta_id
0  1307          5.0            10.0           NaN  p176278
1  2410          NaN            10.0           5.0  p118335
2   200          5.0             NaN          10.0   p92187
3  3314          5.0            10.0           NaN  p154976
4  9223          5.0            10.0           NaN  p446990
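
To try this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the first three rows shown in the question, with the elided entries left out, so the data is an assumption. The flattening step is the same idea as above; passing `index=df.index` and using `add_prefix` are small variations of mine, not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed; the real lists are longer).
df = pd.DataFrame({
    'id': [1307, 2410, 200],
    'stats': [
        [{'name': 'speed', 'value': 5}, {'name': 'strength', 'value': 10}],
        [{'name': 'vision', 'value': 5}, {'name': 'strength', 'value': 10}],
        [{'name': 'speed', 'value': 5}, {'name': 'vision', 'value': 10}],
    ],
    'opta_id': ['p176278', 'p118335', 'p92187'],
})

# One dict per row mapping stat name -> value; missing stats become NaN.
tmp = pd.DataFrame(
    [{d['name']: d['value'] for d in row} for row in df['stats']],
    index=df.index,  # keeps rows aligned if df does not have a default index
).add_prefix('stats.')

# Reassemble: id, then the new stats.* columns, then opta_id.
flat = pd.concat([df['id'], tmp, df['opta_id']], axis=1)
print(flat)
```

If `df` has been filtered or re-sorted, the explicit `index=df.index` matters: without it `tmp` gets a fresh 0..n-1 index and `pd.concat(..., axis=1)` would align the wrong rows.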

In the end I found a solution to my problem. First I created a temporary dataframe from every row of the column:

tmp = pd.concat([pd.DataFrame(x) for x in df['stats']], keys=df.index).reset_index(level=1, drop=True)

Afterwards, I build a pivot_table on the 'name' column, using the stat value as the values:

pivot = pd.pivot_table(tmp, columns='name', index=tmp.index.values, values='value')
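
The pivot on its own only contains the stat columns, so it still has to be joined back to the original frame. A rough sketch of the whole route, assuming the dictionaries use the keys `name` and `value` as in the question's sample; the names `long`, `wide`, `result` and the helper column `row` are made up for illustration, and `aggfunc='first'` is my choice to avoid averaging if a stat name were repeated within a row:

```python
import pandas as pd

# Sample data reconstructed from the question (assumed; the real lists are longer).
df = pd.DataFrame({
    'id': [1307, 2410, 200],
    'stats': [
        [{'name': 'speed', 'value': 5}, {'name': 'strength', 'value': 10}],
        [{'name': 'vision', 'value': 5}, {'name': 'strength', 'value': 10}],
        [{'name': 'speed', 'value': 5}, {'name': 'vision', 'value': 10}],
    ],
    'opta_id': ['p176278', 'p118335', 'p92187'],
})

# Long format: one line per (original row, stat), keyed by the original index.
long = pd.concat([pd.DataFrame(x) for x in df['stats']], keys=df.index)
long = long.reset_index(level=1, drop=True).rename_axis('row').reset_index()

# Wide format: one line per original row, one column per stat name.
wide = long.pivot_table(index='row', columns='name', values='value',
                        aggfunc='first')

# Join back on the index and drop the now-redundant list column.
result = df.drop(columns='stats').join(wide.add_prefix('stats.'))
print(result)
```

Unlike the dict-comprehension approach, the pivot sorts the stat columns alphabetically and places them after `opta_id`; reorder the columns afterwards if the original layout matters.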
