I have a program that uses the Python pandas library to sum two columns individually, compare the difference of those sums with the sum of a 3rd column, and report the result. It's below:
import pandas as pd
df = pd.read_csv(r'xl1.csv', skipinitialspace=True, sep=',')
sum1 = df['Gross_Salary'].sum()
sum2 = df['Deduction'].sum()
diff = sum1 - sum2
if diff == df['Net_Salary'].sum():
    print("Pass")
else:
    print("Fail")
It's working as required. However, my requirement is to subtract the two columns row by row and compare each result with the 3rd column. If the values match, the result is "pass"; otherwise it is "fail".
Below is the CSV data:
Gross_Salary Deduction Net_Salary
100 20 80
2000 200 1500
300 0 300
In the 2nd row, there is an intentional data mismatch.
I understand that I need to use a for loop to go over each row. I tried to use the loop as below:
for i in pd.read_csv(r'xl1.csv', skipinitialspace=True, sep=',')
However, I am not able to apply the logic beyond that.
Please help,
Thank you
You can create a new column storing the test result using a vectorized implementation. Namely:
df['Result'] = ((df['Gross_Salary'] - df['Deduction']) == df['Net_Salary']).astype(int)
df['Result'] = df['Result'].map({1: 'Pass', 0: 'Fail'})
Or similarly, if you also have numpy as a dependency:
import numpy as np
df['Result'] = np.where(df['Gross_Salary'] - df['Deduction'] == df['Net_Salary'],
                        'Pass', 'Fail')
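Pandas implementation
1. df['Gross_Salary'] - df['Deduction'] computes the elementwise difference of the two columns. Note that pandas automatically aligns elements with the same index.
2. The difference is compared with df['Net_Salary'] using the == operator. This yields a Series (column) of boolean values.
3. The booleans are cast to int type so that True -> 1 and False -> 0.
4. Finally, 1 is mapped to Pass and 0 to Fail.
Numpy implementation
np.where takes the same boolean condition and returns 'Pass' where it is True and 'Fail' where it is False.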
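The pandas steps described above can be traced one at a time. The following is only a sketch: it builds the question's three sample rows inline instead of reading xl1.csv, and the intermediate names (expected_net, matches, as_int) are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question, built inline instead of reading xl1.csv
df = pd.DataFrame({
    "Gross_Salary": [100, 2000, 300],
    "Deduction": [20, 200, 0],
    "Net_Salary": [80, 1500, 300],
})

# Step 1: elementwise difference, aligned on the index
expected_net = df["Gross_Salary"] - df["Deduction"]

# Step 2: boolean Series, True where the row is consistent
matches = expected_net == df["Net_Salary"]

# Step 3: cast the booleans to integers (True -> 1, False -> 0)
as_int = matches.astype(int)

# Step 4: map the integers to labels
df["Result"] = as_int.map({1: "Pass", 0: "Fail"})

print(df)
```

Printing matches or as_int between the steps is an easy way to see which rows fail before the labels are applied.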
Applying one of these to your example:
df
Gross_Salary Deduction Net_Salary Result
0 100 20 80 Pass
1 2000 200 1500 Fail
2 300 0 300 Pass
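If an explicit loop is still preferred, as attempted in the question, one possible sketch uses DataFrame.itertuples. It again assumes the sample rows are built inline rather than read from xl1.csv, and the names results and mismatches are illustrative. The vectorized versions above are usually the better choice for large files.

```python
import pandas as pd

# Sample rows copied from the question, built inline instead of reading xl1.csv
df = pd.DataFrame({
    "Gross_Salary": [100, 2000, 300],
    "Deduction": [20, 200, 0],
    "Net_Salary": [80, 1500, 300],
})

results = []
# itertuples yields one namedtuple per row; columns are available as attributes
for row in df.itertuples(index=False):
    ok = (row.Gross_Salary - row.Deduction) == row.Net_Salary
    results.append("Pass" if ok else "Fail")

df["Result"] = results

# Keep only the rows whose check failed
mismatches = df[df["Result"] == "Fail"]
print(mismatches)
```

With the sample data, only the row at index 1 ends up in mismatches.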