How to add string values of columns with a specific condition in a new column
I have a dataframe with a few columns and a lot of rows.
I want to create a new column (C) that joins the string values of another column (A) for all rows that share the same value in a third column (B).
In the end, each 'group' (rows with an identical value in B) should have its own string in that column, different from the strings of the other groups.
A | B | New Column C |
---|---|---|
First | 1 | First_Third |
Second | 22 | Second_Fourth |
Third | 1 | First_Third |
Fourth | 22 | Second_Fourth |
Something like this pseudo code:
for x in df[B]:
    if (x "is identical to" x "of another row"):
        df[C] = df[C].cat(df[A])
How do I code an algorithm that can do this?
Try this:
df['C'] = df.groupby('B')['A'].transform(lambda x: '_'.join(x))
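For anyone who wants to run this end to end, here is a minimal sketch. It assumes the dataframe looks like the table in the question (the construction of `df` is my reconstruction, not the asker's actual data) and that column A contains only strings.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (an assumption).
df = pd.DataFrame({
    "A": ["First", "Second", "Third", "Fourth"],
    "B": [1, 22, 1, 22],
})

# Group rows by B, join the A strings of each group with "_",
# and let transform broadcast the joined string back to every row of the group.
df["C"] = df.groupby("B")["A"].transform(lambda x: "_".join(x))

print(df)
```

`transform` is used rather than `agg` because it returns a result aligned with the original index, so it can be assigned directly as a new column.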
You can use:
df['C'] = df.groupby('B')['A'].transform('_'.join)
Or, if you want to keep only unique values:
df['C'] = df.groupby('B')['A'].transform(lambda x: '_'.join(x.unique()))
Output:
A B C
0 First 1 First_Third
1 Second 22 Second_Fourth
2 Third 1 First_Third
3 Fourth 22 Second_Fourth
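The two variants only differ when a value of A repeats within a group. A small sketch of that difference, using made-up data (the `dup` dataframe and the column names `C_all` and `C_unique` are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Hypothetical data: "First" appears twice in the group where B == 1.
dup = pd.DataFrame({
    "A": ["First", "Second", "First", "Third"],
    "B": [1, 22, 1, 1],
})

grouped = dup.groupby("B")["A"]

# Plain join keeps every value, including repeats.
dup["C_all"] = grouped.transform("_".join)

# Joining x.unique() drops repeats and keeps first-appearance order.
dup["C_unique"] = grouped.transform(lambda x: "_".join(x.unique()))

print(dup)
```

For the rows with B == 1, `C_all` should come out as `First_First_Third` while `C_unique` is `First_Third`.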