Convert a column containing a list of dictionaries to multiple columns in a pandas DataFrame
I have a pandas DataFrame like this:
pd.DataFrame({'a':[1,2], 'b':[[{'c':1,'d':5},{'c':3, 'd':7}],[{'c':10,'d':50}]]})
Out[2]:
   a                                         b
0  1  [{u'c': 1, u'd': 5}, {u'c': 3, u'd': 7}]
1  2                    [{u'c': 10, u'd': 50}]
I want to expand column 'b' into separate columns and repeat the value of column 'a' whenever 'b' holds more than one element, as follows:
Out[2]:
   a   c   d
0  1   1   5
1  1   3   7
2  2  10  50
I tried to use the apply function on each row, but I was not successful; apparently apply converts one row into exactly one row.
You can use concat with a list comprehension:
# Parentheses let the method chain continue across lines
df = (pd.concat([pd.DataFrame(x) for x in df['b']], keys=df['a'])
        .reset_index(level=1, drop=True)
        .reset_index())
print(df)
   a   c   d
0  1   1   5
1  1   3   7
2  2  10  50
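To make the role of keys clearer, the same approach can be split into its intermediate steps. This is only a sketch using the example data from the question; the names frames, stacked and result are introduced here for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2],
                   'b': [[{'c': 1, 'd': 5}, {'c': 3, 'd': 7}],
                         [{'c': 10, 'd': 50}]]})

# One small DataFrame per row: each list of dicts becomes its own frame.
frames = [pd.DataFrame(x) for x in df['b']]

# keys=df['a'] labels every piece with its 'a' value, so the result has a
# two-level index: level 0 holds 'a', level 1 the position inside each list.
stacked = pd.concat(frames, keys=df['a'])

# Drop the inner position level, then move 'a' from the index into a column.
result = stacked.reset_index(level=1, drop=True).reset_index()
print(result)
```

Printing stacked before the two reset_index calls shows the two-level index, which is why level=1 has to be dropped first.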
EDIT:
If the index is unique, it is possible to use join so that all the other columns are kept:
# Key each piece by the original index so it can be joined back afterwards
df1 = (pd.concat([pd.DataFrame(x) for x in df['b']], keys=df.index)
         .reset_index(level=1, drop=True))
df = df.drop('b', axis=1).join(df1).reset_index(drop=True)
print(df)
   a   c   d
0  1   1   5
1  1   3   7
2  2  10  50
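The benefit of the join variant shows up once the frame has more columns than just 'a'. The sketch below assumes an extra column 'e', which is hypothetical and not in the question's data, and checks that it is repeated alongside 'a'.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2],
                   'e': ['x', 'y'],  # hypothetical extra column, for illustration only
                   'b': [[{'c': 1, 'd': 5}, {'c': 3, 'd': 7}],
                         [{'c': 10, 'd': 50}]]})

# Expand 'b', keyed by the original (unique) index labels 0 and 1.
df1 = (pd.concat([pd.DataFrame(x) for x in df['b']], keys=df.index)
         .reset_index(level=1, drop=True))

# join aligns on the index, so every remaining column is repeated
# once per dictionary that came from the same original row.
out = df.drop('b', axis=1).join(df1).reset_index(drop=True)
print(out)
```

If the index contained duplicate labels, the join would match each expanded row against every original row sharing that label, which is why a unique index is required here.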
I tried to simplify the solution:
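import numpy as np

# Number of dictionaries in each row of 'b'
l = df['b'].str.len()
# Flatten all dictionaries into one list and repeat each index label l times
df1 = pd.DataFrame(np.concatenate(df['b']).tolist(), index=np.repeat(df.index, l))
df = df.drop('b', axis=1).join(df1).reset_index(drop=True)
print(df)
   a   c   d
0  1   1   5
1  1   3   7
2  2  10  50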
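The intermediate values of this simplified version can be inspected one at a time. This is a sketch on the question's example data; the names lengths, row_labels and records are illustrative, and the explicit to_numpy() calls are an assumption made here to keep the NumPy inputs plain arrays (they require pandas 0.24 or later).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2],
                   'b': [[{'c': 1, 'd': 5}, {'c': 3, 'd': 7}],
                         [{'c': 10, 'd': 50}]]})

# How many dictionaries each row holds: 2 and 1.
lengths = df['b'].str.len()

# Repeat each index label once per dictionary: 0, 0, 1.
row_labels = np.repeat(df.index.to_numpy(), lengths.to_numpy())

# Flatten the lists of dictionaries into a single list of three dictionaries.
records = np.concatenate(df['b'].to_numpy()).tolist()

# Build the expanded frame on the repeated labels and join it back.
df1 = pd.DataFrame(records, index=row_labels)
out = df.drop('b', axis=1).join(df1).reset_index(drop=True)
print(out)
```

This avoids building one DataFrame per row, since the dictionaries are flattened once and turned into a single frame.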