Updating a column if condition is met with pandas
I have a data frame to work on and I am performing several checks.
I am checking whether rows that repeat the same values in columns "A", "B" and "C" have the same number, but with opposite sign, in column "D".
A | B | C | D | E |
---|---|---|---|---|
1111 | AAA | 123 | 0.01 | comment to be replaced |
2222 | BBB | 456 | 5 | comment to be replaced |
3333 | CCC | 789 | 10 | don't do anything |
1111 | AAA | 123 | -0.01 | comment to be replaced |
2222 | BBB | 456 | -5 | comment to be replaced |
3333 | CCC | 789 | -9 | don't do anything |
Please see my code below. When I try replacing the comment in column "E", it does not work. I am sure that I am doing something wrong. I am fully aware that I haven't written the code in the most efficient way; I am still a newbie. Would you be able to help me with a more efficient way to achieve this and, out of curiosity, show how it could be achieved if I decided to keep using this "non-efficient" way?
Thank you.
for i in range(0, len(df)-1):
    for j in range(i+1, len(df)):
        if (df['A'][i] == df['A'][j]) & (df['B'][i] == df['B'][j]) & (df['C'][i] == df['C'][j]) & (df['D'][i] + df['D'][j] == 0):
            df['E'][i] = 'it works!'
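For readers who want to keep the nested-loop approach, here is a minimal sketch of how it could be made to work. It assumes the default integer index (0 to len(df)-1) and rebuilds the sample table from the question. The chained assignment `df['E'][i] = ...` is a likely reason the comment is not replaced, because pandas may write to a temporary copy, so the sketch writes through `df.loc` instead. It also updates both rows of a pair, since the loop in the question only touches row `i`.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "A": [1111, 2222, 3333, 1111, 2222, 3333],
    "B": ["AAA", "BBB", "CCC", "AAA", "BBB", "CCC"],
    "C": [123, 456, 789, 123, 456, 789],
    "D": [0.01, 5, 10, -0.01, -5, -9],
    "E": ["comment to be replaced", "comment to be replaced", "don't do anything",
          "comment to be replaced", "comment to be replaced", "don't do anything"],
})

for i in range(len(df) - 1):
    for j in range(i + 1, len(df)):
        # Rows i and j must agree on A, B and C ...
        same_keys = (
            df.loc[i, "A"] == df.loc[j, "A"]
            and df.loc[i, "B"] == df.loc[j, "B"]
            and df.loc[i, "C"] == df.loc[j, "C"]
        )
        # ... and their D values must cancel out.
        if same_keys and df.loc[i, "D"] + df.loc[j, "D"] == 0:
            # Single .loc assignment avoids chained indexing; update both rows.
            df.loc[[i, j], "E"] = "it works!"

print(df)
```

This is still quadratic in the number of rows, so it is only reasonable for small frames.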
We can group the dataframe on columns A, B, C along with the series of absolute values in column D, then transform column D using sum (because if the values in a pair have opposite signs, their sum must be zero) to check for the presence of pairs having the same magnitude but opposite sign.
df['E'] = df.groupby(['A', 'B', 'C', df['D'].abs()])['D'].transform('sum').eq(0)
A B C D E
0 1111 AAA 123 0.01 True
1 2222 BBB 456 5.00 True
2 3333 CCC 789 10.00 False
3 1111 AAA 123 -0.01 True
4 2222 BBB 456 -5.00 True
5 3333 CCC 789 -9.00 False
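The result above is a boolean column rather than a replaced comment. One possible way to go from the mask to the update asked for in the question is to keep the mask in a separate variable and assign through `df.loc`. This is only a sketch: the name `has_opposite` and the replacement text are illustrative, and the sample data is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "A": [1111, 2222, 3333, 1111, 2222, 3333],
    "B": ["AAA", "BBB", "CCC", "AAA", "BBB", "CCC"],
    "C": [123, 456, 789, 123, 456, 789],
    "D": [0.01, 5, 10, -0.01, -5, -9],
    "E": ["comment to be replaced", "comment to be replaced", "don't do anything",
          "comment to be replaced", "comment to be replaced", "don't do anything"],
})

# True where the group (A, B, C, |D|) sums to zero in column D.
has_opposite = (
    df.groupby(["A", "B", "C", df["D"].abs()])["D"].transform("sum").eq(0)
)

# Overwrite the comment only on the matching rows; other rows keep their text.
df.loc[has_opposite, "E"] = "it works!"

print(df)
```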
This approach also works when there is more than one matching pair, or when one positive value is matched by multiple negative values (or vice versa).
import pandas as pd
import numpy as np
df_1 = df[df['D'] >= 0].copy().reset_index()
df_2 = df[df['D'] < 0].copy().reset_index()
df_2['D'] = -df_2['D']
indexes = df_1.merge(df_2, on=['A', 'B', 'C', 'D'], how='inner')[['index_x', 'index_y']].values.tolist()
indexes = [item for sublist in indexes for item in sublist]
df['E_new'] = np.where(df.index.isin(indexes), 'new comment', df['E'])
print(df)
# A B C D E E_new
# 0 1111 AAA 123 0.01 comment to be replaced new comment
# 1 2222 BBB 456 5.00 comment to be replaced new comment
# 2 3333 CCC 789 10.00 don't do anything don't do anything
# 3 1111 AAA 123 -0.01 comment to be replaced new comment
# 4 2222 BBB 456 -5.00 comment to be replaced new comment
# 5 3333 CCC 789 -9.00 don't do anything don't do anything
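To illustrate the case of one positive value and multiple negatives, the sketch below runs both the group-and-sum check and the merge-based check on a small made-up frame (the data and the variable names `by_sum`, `pos`, `neg`, `by_merge` are assumptions for illustration, not part of the question). With one `5` and two `-5` values the group sum is `-5`, so the sum test flags nothing, while the merge still finds the matches.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one +5 and two -5 under the same A/B/C key, plus an unmatched row.
df = pd.DataFrame({
    "A": [1111, 1111, 1111, 2222],
    "B": ["AAA", "AAA", "AAA", "BBB"],
    "C": [123, 123, 123, 456],
    "D": [5.0, -5.0, -5.0, 7.0],
    "E": ["old"] * 4,
})

# Group-and-sum check: the group sums to -5, so no row is flagged.
by_sum = df.groupby(["A", "B", "C", df["D"].abs()])["D"].transform("sum").eq(0)

# Merge-based check: split by sign, flip the negatives, join on all four columns.
pos = df[df["D"] >= 0].reset_index()
neg = df[df["D"] < 0].reset_index()
neg["D"] = -neg["D"]
pairs = pos.merge(neg, on=["A", "B", "C", "D"], how="inner")

# Collect the original row labels from both sides of every matched pair.
matched = np.unique(pairs[["index_x", "index_y"]].to_numpy())
by_merge = df.index.isin(matched)

print(by_sum.tolist())
print(by_merge.tolist())
```

Which behaviour is preferable depends on whether an unbalanced group such as this one should count as a match.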