How to compare each cell of one column to each cell of another column in a CSV file with Python?
I have a program that uses the Python pandas library to sum two columns individually, compare the difference of the sums with the sum of a third column, and report the result. It is below:
import pandas as pd

df = pd.read_csv(r'xl1.csv', skipinitialspace=True, sep=',')
sum1 = df['Gross_Salary'].sum()
sum2 = df['Deduction'].sum()
diff = sum1 - sum2
if diff == df['Net_Salary'].sum():
    print("Pass")
else:
    print("Fail")
It works as required. However, my actual requirement is to take each row, subtract the Deduction cell from the Gross_Salary cell, and compare the difference with the Net_Salary cell in the same row. If the values match, the result is "Pass"; otherwise it is "Fail".
Below is the CSV data:
Gross_Salary Deduction Net_Salary
100 20 80
2000 200 1500
300 0 300
In the 2nd row, there is an intentional data mismatch.
I understand that I need to use a for loop to go over each row. I tried to start the loop as below:
for i in pd.read_csv(r'xl1.csv', skipinitialspace=True, sep=',')
However, I am not able to apply the logic beyond that.
Please help.
Thank you.
You can create a new column that stores the test result for each row using a vectorized implementation. Namely:
df['Result'] = ((df['Gross_Salary'] - df['Deduction']) == df['Net_Salary']).astype(int)
df['Result'] = df['Result'].map({1: 'Pass', 0: 'Fail'})
Or similarly, if you also have NumPy as a dependency:
import numpy as np
df['Result'] = np.where(df['Gross_Salary'] - df['Deduction'] == df['Net_Salary'],
                        'Pass', 'Fail')
Pandas implementation

df['Gross_Salary'] - df['Deduction'] computes the elementwise difference of the two columns. Note that pandas automatically aligns elements with the same index.
The difference is then compared with df['Net_Salary'] using the == operator. This yields a Series (column) of boolean values.
The boolean Series is cast to the int type so that True -> 1 and False -> 0.
Finally, map translates 1 to Pass and 0 to Fail.

NumPy implementation

np.where(condition, x, y) returns x at the positions where condition is True and y elsewhere, so the boolean comparison is turned into Pass or Fail in a single step.
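One caveat about the == comparison described above: it is exact, which is fine for the integer amounts in the question but can report a false Fail if the columns hold fractional values. The sketch below is only an illustration under that assumption. The fractional first row is hypothetical data (not from the question), the Exact column name is made up for the comparison, and np.isclose is used with its default tolerances.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first row uses fractional amounts to expose
# floating-point rounding; the other rows mirror the question's sample.
df = pd.DataFrame({
    'Gross_Salary': [0.3, 2000.0, 300.0],
    'Deduction': [0.1, 200.0, 0.0],
    'Net_Salary': [0.2, 1500.0, 300.0],
})

diff = df['Gross_Salary'] - df['Deduction']

# Exact equality: 0.3 - 0.1 is not exactly 0.2 in binary floating point.
df['Exact'] = np.where(diff == df['Net_Salary'], 'Pass', 'Fail')

# Tolerance-based equality using np.isclose with its default rtol and atol.
df['Result'] = np.where(np.isclose(diff, df['Net_Salary']), 'Pass', 'Fail')

print(df)
```

Here the first row fails the exact check but passes the tolerance-based one, while the genuinely mismatched second row fails both. Whether the default tolerances are appropriate depends on the precision your data needs.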
Applying one of the two implementations above to your example:
df
Gross_Salary Deduction Net_Salary Result
0 100 20 80 Pass
1 2000 200 1500 Fail
2 300 0 300 Pass
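Since the question asks about a for loop, here is a minimal sketch of the same check done row by row with itertuples. It assumes the file is comma-separated and stands in for xl1.csv with an in-memory string (csv_text is a name introduced only for this example). The vectorized versions are usually preferable on large files; the loop is shown for comparison.

```python
import io

import pandas as pd

# Sample data from the question, assumed to be comma-separated in xl1.csv.
csv_text = """Gross_Salary,Deduction,Net_Salary
100,20,80
2000,200,1500
300,0,300
"""
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True, sep=',')

results = []
# itertuples yields one namedtuple per row; fields are the column names.
for row in df.itertuples(index=False):
    if row.Gross_Salary - row.Deduction == row.Net_Salary:
        results.append('Pass')
    else:
        results.append('Fail')

df['Result'] = results
print(df)
```

To run it against the real file, replace io.StringIO(csv_text) with r'xl1.csv'.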