Flatten a JSON column in a Pandas DataFrame

Question

I have an input dataframe df which is as follows:

id  e
1   {"k1":"v1","k2":"v2"}
2   {"k1":"v3","k2":"v4"}
3   {"k1":"v5","k2":"v6"}

I want to "flatten" the column 'e' so that my resultant dataframe is:

id  e.k1    e.k2
1   v1  v2
2   v3  v4
3   v5  v6

How can I do this? I tried using json_normalize but did not have much success.

Answer 1

Here is a way to use pandas.io.json.json_normalize() (available as pandas.json_normalize() since pandas 1.0, where the pandas.io.json path was deprecated):

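from pandas import json_normalize  # pandas < 1.0: from pandas.io.json import json_normalize

# Expand each dict in "e" into its own columns, prefix them, then drop "e".
# join() aligns on the index, so this assumes df has the default RangeIndex.
df = df.join(json_normalize(df["e"].tolist()).add_prefix("e.")).drop(["e"], axis=1)
print(df)
#    id e.k1 e.k2
# 0   1   v1   v2
# 1   2   v3   v4
# 2   3   v5   v6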

However, if your column actually holds str values rather than dict values, you first have to parse it with json.loads():

import json

# Parse each JSON string into a dict before normalizing.
parsed = df['e'].map(json.loads).tolist()
df = df.join(json_normalize(parsed).add_prefix('e.')).drop(['e'], axis=1)
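The two cases above can be folded into one helper. This is only a sketch: flatten_json_column is a hypothetical name, it assumes pandas 1.0 or later (for pd.json_normalize), and it assumes every cell is either a JSON string or a dict. It also resets the normalized frame's index to df.index, because join() aligns on the index and json_normalize always returns a fresh RangeIndex.

```python
import json

import pandas as pd


def flatten_json_column(df, column, sep="."):
    """Return a copy of df with `column` expanded into `column<sep>key` columns.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Parse cells that are JSON strings; leave cells that are already dicts alone.
    records = [json.loads(cell) if isinstance(cell, str) else cell for cell in df[column]]
    flat = pd.json_normalize(records, sep=sep)
    # json_normalize returns a new RangeIndex, so realign with df before joining.
    flat.index = df.index
    return df.drop(columns=[column]).join(flat.add_prefix(f"{column}{sep}"))


df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "e": ['{"k1":"v1","k2":"v2"}', '{"k1":"v3","k2":"v4"}', '{"k1":"v5","k2":"v6"}'],
    },
    index=[10, 20, 30],  # deliberately not the default RangeIndex
)
result = flatten_json_column(df, "e")
print(result)
```

Without the index realignment, the join on a frame indexed 10, 20, 30 would find no matching rows and fill e.k1 and e.k2 with NaN.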

Answer 2

If your column is not already a dictionary, you could use map(json.loads) and apply pd.Series:

import json
import pandas as pd

s = df['e'].map(json.loads).apply(pd.Series).add_prefix('e.')

Or, if it is already a dictionary, you can apply pd.Series directly:

s = df['e'].apply(pd.Series).add_prefix('e.')

Finally, use pd.concat to join back the other columns:

>>> pd.concat([df.drop(['e'], axis=1), s], axis=1).set_index('id')
   e.k1 e.k2
id
1    v1   v2
2    v3   v4
3    v5   v6
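For reference, here is a self-contained sketch of the steps in this answer, run against the sample data from the question. The second variant (building a DataFrame from the list of parsed dicts) is not from the answer; it is an assumed alternative that avoids constructing one pd.Series per row, which tends to be slow on large frames.

```python
import json

import pandas as pd

df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "e": ['{"k1":"v1","k2":"v2"}', '{"k1":"v3","k2":"v4"}', '{"k1":"v5","k2":"v6"}'],
    }
)

# Parse the JSON strings once; skip this step if the cells are already dicts.
parsed = df["e"].map(json.loads)

# Variant from the answer: build one pd.Series per row.
s_apply = parsed.apply(pd.Series).add_prefix("e.")

# Assumed alternative: build the frame in a single call, reusing df's index
# so that the concat below lines the rows up correctly.
s_frame = pd.DataFrame(parsed.tolist(), index=df.index).add_prefix("e.")

# Join the flattened columns back to the remaining columns.
out = pd.concat([df.drop(columns=["e"]), s_frame], axis=1).set_index("id")
print(out)
```

Both variants produce the same e.k1 and e.k2 columns for this data; which one to prefer mostly depends on the size of the frame.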

Answer 1
7 2018-04-13 18:29:06

Answer 2
4 2018-04-13 18:19:39
