Pandas groupby: Concatenate without resizing
I have a pandas data frame with 4 columns:
Col1 Col2 Col3 Col4
A1 B1 C1 X1
A2 B2 C2 X2
A3 B3 C3 X3
A1 B1 C1 X4
A4 B4 C4 X5
A3 B3 C3 X6
I want to identify rows that have the same values in Col1, Col2 and Col3, and then concatenate the values in their corresponding Col4. So the output would look like:
Col1 Col2 Col3 Col4
A1 B1 C1 X1, X4
A2 B2 C2 X2
A3 B3 C3 X3, X6
A1 B1 C1 X4, X1
A4 B4 C4 X5
A3 B3 C3 X6, X3
The final shape of the data frame is the same as the original data frame. It would be great if someone could point me in the right direction.
Thanks
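import pandas as pd

# Join each group's Col4 values in ascending order
a = (df
     .groupby(['Col1', 'Col2', 'Col3'])['Col4']
     .apply(lambda x: ', '.join(sorted(x)))
)
# Join the same values in descending order
b = (df
     .groupby(['Col1', 'Col2', 'Col3'])['Col4']
     .apply(lambda x: ', '.join(sorted(x, reverse=True)))
)
# Stack both results, drop repeated strings (from single-row groups),
# and turn the group keys back into columns
pd.concat([a, b]).drop_duplicates().reset_index()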
And the output:
Col1 Col2 Col3 Col4
0 A1 B1 C1 X1, X4
1 A2 B2 C2 X2
2 A3 B3 C3 X3, X6
3 A4 B4 C4 X5
4 A1 B1 C1 X4, X1
5 A3 B3 C3 X6, X3
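For anyone who wants to try this, here is a self-contained sketch of the same approach using the sample data from the question. The variable names `keys` and `result` are only illustrative. The result has six rows, but they come out grouped by sort direction rather than in the original row order, and the trick only covers groups of at most two rows.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["A1", "A2", "A3", "A1", "A4", "A3"],
    "Col2": ["B1", "B2", "B3", "B1", "B4", "B3"],
    "Col3": ["C1", "C2", "C3", "C1", "C4", "C3"],
    "Col4": ["X1", "X2", "X3", "X4", "X5", "X6"],
})
keys = ["Col1", "Col2", "Col3"]

# One string per group, ascending and descending
a = df.groupby(keys)["Col4"].apply(lambda x: ", ".join(sorted(x)))
b = df.groupby(keys)["Col4"].apply(lambda x: ", ".join(sorted(x, reverse=True)))

# drop_duplicates compares the joined strings, so single-row groups
# (identical in a and b) are kept only once
result = pd.concat([a, b]).drop_duplicates().reset_index()
print(result)
```

Because `drop_duplicates` here looks only at the Col4 strings, two different groups that happened to produce the same string would also be collapsed, which is worth keeping in mind on real data.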
Use transform rather than apply or agg.
df['Col4'] = df.groupby(['Col1', 'Col2', 'Col3'])['Col4'].transform(lambda x: ', '.join(x))
Col1 Col2 Col3 Col4
0 A1 B1 C1 X1, X4
1 A2 B2 C2 X2
2 A3 B3 C3 X3, X6
3 A1 B1 C1 X1, X4
4 A4 B4 C4 X5
5 A3 B3 C3 X3, X6
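Note that this gives every row in a group the same string, whereas the question's expected output lists each row's own value first (row 3 is "X4, X1"). The sketch below runs the transform on the question's sample data and then shows one possible way to get the own-value-first ordering. The helper `own_value_first` is a hypothetical name introduced here and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["A1", "A2", "A3", "A1", "A4", "A3"],
    "Col2": ["B1", "B2", "B3", "B1", "B4", "B3"],
    "Col3": ["C1", "C2", "C3", "C1", "C4", "C3"],
    "Col4": ["X1", "X2", "X3", "X4", "X5", "X6"],
})
keys = ["Col1", "Col2", "Col3"]

# transform broadcasts the per-group string back onto every row of the group,
# so the result keeps the original index and length
joined = df.groupby(keys)["Col4"].transform(lambda s: ", ".join(s))

def own_value_first(s):
    """Hypothetical helper: for each row, list its own value first,
    followed by the other values of the group in their original order."""
    values = s.tolist()
    return pd.Series(
        [", ".join([v] + values[:i] + values[i + 1:]) for i, v in enumerate(values)],
        index=s.index,
    )

own_first = df.groupby(keys)["Col4"].transform(own_value_first)

# assign returns a new frame, leaving df untouched
out = df.assign(Col4=own_first)
print(out)
```

Unlike the sorted/reverse-sorted trick, this keeps the original row order and should also handle groups with more than two rows, at the cost of a Python-level loop per group.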