Expand pandas dataframe column of dict into dataframe columns
I have a Pandas DataFrame where one column is a Series of dicts, like this:
colA colB colC
0 7 7 {'foo': 185, 'bar': 182, 'baz': 148}
1 2 8 {'foo': 117, 'bar': 103, 'baz': 155}
2 5 10 {'foo': 165, 'bar': 184, 'baz': 170}
3 3 2 {'foo': 121, 'bar': 151, 'baz': 187}
4 5 5 {'foo': 137, 'bar': 199, 'baz': 108}
I want the foo, bar and baz key-value pairs from the dicts to be columns in my dataframe, such that I end up with this:
colA colB foo bar baz
0 7 7 185 182 148
1 2 8 117 103 155
2 5 10 165 184 170
3 3 2 121 151 187
4 5 5 137 199 108
How do I do that?
df = df.drop('colC', axis=1).join(pd.DataFrame(df.colC.values.tolist()))
We start by importing Pandas and defining the DataFrame to work with:
import pandas as pd
df = pd.DataFrame({'colA': {0: 7, 1: 2, 2: 5, 3: 3, 4: 5},
                   'colB': {0: 7, 1: 8, 2: 10, 3: 2, 4: 5},
                   'colC': {0: {'foo': 185, 'bar': 182, 'baz': 148},
                            1: {'foo': 117, 'bar': 103, 'baz': 155},
                            2: {'foo': 165, 'bar': 184, 'baz': 170},
                            3: {'foo': 121, 'bar': 151, 'baz': 187},
                            4: {'foo': 137, 'bar': 199, 'baz': 108}}})
The column colC is a pd.Series of dicts, and we can turn it into a pd.DataFrame by passing the list of dicts to the pd.DataFrame constructor, so that each dict becomes one row:
pd.DataFrame(df.colC.values.tolist())
# df.colC.apply(pd.Series)  # this also works, but it is slow
which gives the pd.DataFrame:
foo bar baz
0 185 182 148
1 117 103 155
2 165 184 170
3 121 151 187
4 137 199 108
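As a rough check that the two conversions mentioned above agree, the sketch below builds both on a shortened copy of the question's data and compares them row by row. The variable names (`fast`, `slow`) are only illustrative.

```python
import pandas as pd

# Shortened copy of the question's data (first three rows only).
df = pd.DataFrame({
    "colA": [7, 2, 5],
    "colB": [7, 8, 10],
    "colC": [
        {"foo": 185, "bar": 182, "baz": 148},
        {"foo": 117, "bar": 103, "baz": 155},
        {"foo": 165, "bar": 184, "baz": 170},
    ],
})

# Conversion used in the answer: hand the list of dicts to the constructor.
fast = pd.DataFrame(df["colC"].tolist())

# Alternative from the comment: build one Series per dict (slower on large data).
slow = df["colC"].apply(pd.Series)

# Both hold the same rows and the same column names.
same_rows = fast.to_dict("records") == slow.to_dict("records")
print(same_rows)
```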
So all we need to do is:
1. Turn colC into a pd.DataFrame
2. Remove the original colC from df
3. Join the converted colC with df
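The three steps can also be written out one statement at a time, which may be easier to follow or debug. This is a minimal sketch on a shortened copy of the question's data; the intermediate names (`expanded`, `rest`, `result`) are not from the original answer.

```python
import pandas as pd

# Shortened copy of the question's data (first three rows only).
df = pd.DataFrame({
    "colA": [7, 2, 5],
    "colB": [7, 8, 10],
    "colC": [
        {"foo": 185, "bar": 182, "baz": 148},
        {"foo": 117, "bar": 103, "baz": 155},
        {"foo": 165, "bar": 184, "baz": 170},
    ],
})

# Step 1: turn the column of dicts into its own DataFrame.
expanded = pd.DataFrame(df["colC"].tolist())

# Step 2: remove the original dict column from df.
rest = df.drop(columns="colC")

# Step 3: join the expanded columns back on; both frames share the default 0..n-1 index.
result = rest.join(expanded)
print(result)
```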
That can be done in a one-liner:
df = df.drop('colC', axis=1).join(pd.DataFrame(df.colC.values.tolist()))
With the contents of df now being the pd.DataFrame:
colA colB foo bar baz
0 7 7 185 182 148
1 2 8 117 103 155
2 5 10 165 184 170
3 3 2 121 151 187
4 5 5 137 199 108
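One caveat worth sketching: join aligns rows by index, and a DataFrame built from a plain list gets a fresh 0..n-1 index. If df has any other index, the expanded frame needs to be given the same index explicitly. The example below assumes a made-up string index ("x", "y") purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "colA": [7, 2],
        "colC": [{"foo": 185, "bar": 182}, {"foo": 117, "bar": 103}],
    },
    index=["x", "y"],  # hypothetical non-default index, not from the question
)

# Reuse df's index so that join can line the rows up correctly.
expanded = pd.DataFrame(df["colC"].tolist(), index=df.index)

result = df.drop(columns="colC").join(expanded)
print(result)
```

Without `index=df.index`, the join would find no matching labels and the new columns would be filled with NaN.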
I faced the same challenge recently and I managed to do it manually using apply and join.
import pandas as pd
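
def expand_dict_column(df: pd.DataFrame, column) -> pd.DataFrame:
    # Build one Series per row from the dict's keys and values, then join
    # the resulting columns onto df without the original dict column.
    return df.drop(columns=[column], inplace=False).join(
        df.apply(lambda x: pd.Series(x[column].values(), index=x[column].keys()), axis=1))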
In the case of the columns of the question it would look like this:
df.drop(columns=["colC"], inplace=False).join(
df.apply(lambda x: pd.Series(x["colC"].values(), index=x["colC"].keys()), axis=1))
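Both answers assume every dict has the same keys. As a small sketch with made-up data (the values below are not from the question), this shows what the list-based conversion does when a key is missing from some rows: the gap is filled with NaN, which also turns that column into floats.

```python
import pandas as pd

# Hypothetical data: the first dict has no "bar" key.
df = pd.DataFrame({
    "colA": [1, 2],
    "colC": [{"foo": 10}, {"foo": 20, "bar": 30}],
})

expanded = pd.DataFrame(df["colC"].tolist())
result = df.drop(columns="colC").join(expanded)

# The missing "bar" value in row 0 shows up as NaN.
print(result)
```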