I want to flatten a JSON column in a Pandas DataFrame
I have an input dataframe df which is as follows:
id e
1 {"k1":"v1","k2":"v2"}
2 {"k1":"v3","k2":"v4"}
3 {"k1":"v5","k2":"v6"}
I want to "flatten" the column 'e' so that my resultant dataframe is:
id e.k1 e.k2
1 v1 v2
2 v3 v4
3 v5 v6
How can I do this? I tried using json_normalize but did not have much success.
Here is a way to use pandas.io.json.json_normalize() (available as pandas.json_normalize() since pandas 1.0):
# pandas >= 1.0; on older versions use: from pandas.io.json import json_normalize
from pandas import json_normalize
df = df.join(json_normalize(df["e"].tolist()).add_prefix("e.")).drop(["e"], axis=1)
print(df)
#    id e.k1 e.k2
# 0   1   v1   v2
# 1   2   v3   v4
# 2   3   v5   v6
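For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes pandas 1.0 or later (so the function is reachable as pd.json_normalize) and that column e holds dict objects; the names flat and result are only illustrative.

```python
import pandas as pd

# Rebuild the question's frame; column "e" holds Python dicts here.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "e": [
        {"k1": "v1", "k2": "v2"},
        {"k1": "v3", "k2": "v4"},
        {"k1": "v5", "k2": "v6"},
    ],
})

# json_normalize returns a new frame with a fresh 0..n-1 index,
# so the join below relies on df also having the default index.
flat = pd.json_normalize(df["e"].tolist()).add_prefix("e.")
result = df.drop(columns=["e"]).join(flat)
print(result)
```

If df carries a non-default index, setting flat.index = df.index before the join should keep the rows aligned.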
However, if your column is actually a str and not a dict, then you'd first have to parse it by mapping json.loads() over it:
import json
df = df.join(json_normalize(df['e'].map(json.loads).tolist()).add_prefix('e.'))\
.drop(['e'], axis=1)
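If you are not sure whether the column holds JSON text or already-parsed dicts, both cases can be folded into one function. The sketch below is only an illustration: flatten_json_column is a hypothetical helper name, not part of pandas, and it assumes pandas 1.0 or later and that every value in the column is either a JSON object string or a dict.

```python
import json

import pandas as pd

# Column "e" holds JSON text rather than dicts (e.g. after reading a CSV).
df = pd.DataFrame({
    "id": [1, 2, 3],
    "e": [
        '{"k1":"v1","k2":"v2"}',
        '{"k1":"v3","k2":"v4"}',
        '{"k1":"v5","k2":"v6"}',
    ],
})


def flatten_json_column(frame, column):
    """Hypothetical helper: expand `column` into `column.<key>` columns."""
    # Parse only the values that are still strings; dicts pass through as-is.
    parsed = frame[column].map(
        lambda v: json.loads(v) if isinstance(v, str) else v
    )
    flat = pd.json_normalize(parsed.tolist()).add_prefix(f"{column}.")
    # Reuse the original index so the join aligns row by row.
    flat.index = frame.index
    return frame.drop(columns=[column]).join(flat)


result = flatten_json_column(df, "e")
print(result)
```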
If your column is not already a dictionary, you could use map(json.loads) and apply pd.Series:
s = df['e'].map(json.loads).apply(pd.Series).add_prefix('e.')
Or, if it is already a dictionary, you can apply pd.Series directly:
s = df['e'].apply(pd.Series).add_prefix('e.')
Finally, use pd.concat to join the other columns back:
>>> pd.concat([df.drop(['e'], axis=1), s], axis=1).set_index('id')
   e.k1 e.k2
id
1    v1   v2
2    v3   v4
3    v5   v6
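One caveat: apply(pd.Series) builds a separate Series for every row, which can become slow on large frames. A possible alternative, which is my own suggestion rather than part of the approach above, is to hand the list of dicts to the pd.DataFrame constructor in a single call. The sketch assumes column e already holds dicts (map json.loads over it first if it holds strings).

```python
import pandas as pd

# Rebuild the question's frame with dict values in column "e".
df = pd.DataFrame({
    "id": [1, 2, 3],
    "e": [
        {"k1": "v1", "k2": "v2"},
        {"k1": "v3", "k2": "v4"},
        {"k1": "v5", "k2": "v6"},
    ],
})

# Build all expanded columns at once; reuse df's index so concat aligns rows.
s = pd.DataFrame(df["e"].tolist(), index=df.index).add_prefix("e.")

# Same final step as above: drop the raw column, attach the new ones.
result = pd.concat([df.drop(["e"], axis=1), s], axis=1).set_index("id")
print(result)
```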