Pandas groupby: Concatenate without resizing

I have a pandas DataFrame with 4 columns:

Col1 Col2 Col3 Col4
A1    B1   C1   X1
A2    B2   C2   X2
A3    B3   C3   X3
A1    B1   C1   X4
A4    B4   C4   X5
A3    B3   C3   X6

I want to identify rows that have the same values in Col1, Col2 and Col3, and then concatenate the values in their corresponding Col4. The output would look like this:

Col1 Col2 Col3 Col4
A1    B1   C1   X1, X4
A2    B2   C2   X2
A3    B3   C3   X3, X6
A1    B1   C1   X4, X1
A4    B4   C4   X5
A3    B3   C3   X6, X3

The final shape of the data frame is the same as that of the original. It would be great if someone could point me in the right direction. Thanks.

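# Join each group's Col4 values in ascending order, e.g. "X1, X4"
a = (df
     .groupby(['Col1', 'Col2', 'Col3'])['Col4']
     .apply(lambda x: ', '.join(sorted(x)))
    )
# Join the same values in descending order, e.g. "X4, X1"
b = (df
     .groupby(['Col1', 'Col2', 'Col3'])['Col4']
     .apply(lambda x: ', '.join(sorted(x, reverse=True)))
    )
# Stack both results; single-value groups give the same string twice, so drop the repeat
pd.concat([a, b]).drop_duplicates().reset_index()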

And the output:

  Col1 Col2 Col3    Col4
0   A1   B1   C1  X1, X4
1   A2   B2   C2      X2
2   A3   B3   C3  X3, X6
3   A4   B4   C4      X5
4   A1   B1   C1  X4, X1
5   A3   B3   C3  X6, X3
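
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the sample data in the question, and the variable names (`keys`, `result`) are just illustrative. Note that the rows come back in group order rather than in the original row order.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["A1", "A2", "A3", "A1", "A4", "A3"],
    "Col2": ["B1", "B2", "B3", "B1", "B4", "B3"],
    "Col3": ["C1", "C2", "C3", "C1", "C4", "C3"],
    "Col4": ["X1", "X2", "X3", "X4", "X5", "X6"],
})

keys = ["Col1", "Col2", "Col3"]  # illustrative name for the grouping columns

# One string per group, ascending and descending
a = df.groupby(keys)["Col4"].apply(lambda x: ", ".join(sorted(x)))
b = df.groupby(keys)["Col4"].apply(lambda x: ", ".join(sorted(x, reverse=True)))

# drop_duplicates compares the joined strings only, so single-value
# groups (which appear in both a and b) are kept once
result = pd.concat([a, b]).drop_duplicates().reset_index()
print(result)
```

Because `drop_duplicates` here looks only at the joined string, two different groups that happened to produce the same Col4 string would also be merged into one row, which is worth keeping in mind for real data.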

Use transform rather than apply or agg. transform returns a result aligned with the original index, so the data frame keeps its shape.

# Select Col4 explicitly so the result is a Series aligned with df's index
df['Col4'] = df.groupby(['Col1', 'Col2', 'Col3'])['Col4'].transform(lambda x: ', '.join(x))

  Col1 Col2 Col3    Col4
0   A1   B1   C1  X1, X4
1   A2   B2   C2      X2
2   A3   B3   C3  X3, X6
3   A1   B1   C1  X1, X4
4   A4   B4   C4      X5
5   A3   B3   C3  X3, X6
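
To make the difference concrete, here is a small runnable sketch on the sample data from the question. It compares the length of an `agg` result with that of a `transform` result. It also shows one possible way to put each row's own value first, as in the question's desired output. The helper `own_value_first` is a hypothetical name introduced only for this illustration, not something from pandas.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["A1", "A2", "A3", "A1", "A4", "A3"],
    "Col2": ["B1", "B2", "B3", "B1", "B4", "B3"],
    "Col3": ["C1", "C2", "C3", "C1", "C4", "C3"],
    "Col4": ["X1", "X2", "X3", "X4", "X5", "X6"],
})

keys = ["Col1", "Col2", "Col3"]

# agg reduces every group to a single row: 4 groups -> 4 rows
collapsed = df.groupby(keys)["Col4"].agg(lambda s: ", ".join(s))

# transform broadcasts the group's string back to each member row: 6 rows
joined = df.groupby(keys)["Col4"].transform(lambda s: ", ".join(s))


def own_value_first(s):
    """Hypothetical helper: for each row in the group, list that row's
    own value first, followed by the other values in the group."""
    values = s.tolist()
    rows = [
        ", ".join([v] + [w for j, w in enumerate(values) if j != i])
        for i, v in enumerate(values)
    ]
    # Returning a Series with the group's index keeps the rows aligned
    return pd.Series(rows, index=s.index)


own_first = df.groupby(keys)["Col4"].transform(own_value_first)

print(len(collapsed), len(joined))
print(df.assign(Col4=own_first))
```

The last variant is only needed if the order inside each joined string matters per row. If "X1, X4" is acceptable for both rows of a group, the plain `transform` with `', '.join` is simpler.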
