简体   繁体   中英

Subtract two df col if other col have same values

I am newbie to python and pandas. looking for help in subtraction between two different df column Sample data

df1 = pd.DataFrame({'label1': ['lab1', 'lab2', 'lab3', 'lab4'],
                   'compare1': [10, 20, 30, 40],
                   'col3': [100, 50, -30, -50]})

df2 = pd.DataFrame({'lable1': ['lab1', 'lab2', 'lab4', 'lab5'],
                   'compare1': [80,10, 40, 50],
                   'col3': [200, 60, 30, 50]})

df_diff = pd.DataFrame({'label': ['lab1', 'lab2', 'lab4', 'lab3', 'lab5'],
                        'df1_compare1': [10, 20, 40, 30, 0],
                        'df2_compare1': [80,10, 40, 0, 50],
                        'compare': [-70, 10, 0, 30, 50]})

Column name and count are same for both dfs.

Need help in subtraction (or other maths operation) between df1.compare1 and df2.compare2 if label1 and label2 has the same value.

for example, lab1, lab2 and lab 4(though lab 4 on different row) are having same label value. any rows which are not same in both df should show up at the bottom of the df with original value while other column data shows as 0.

Actual data are in CSV/excel MB in size, if possible looking for performance effective method

Does the non-null object has impact on performance?? This is the column type on label.

First merge your dataframes on the column label .

Finally create your compare column by substracting compare1-compare2 :

df_diff = df1[['label1', 'compare1']].merge(df2[['label1', 'compare1']], 
                                            on='label1', 
                                            how='outer',
                                            suffixes=['_df1', '_df2'])


df_diff['compare'] = df_diff['compare1_df1'].fillna(0) - df_diff['compare1_df2'].fillna(0)

  label1  compare1_df1  compare1_df2  compare
0   lab1          10.0          80.0    -70.0
1   lab2          20.0          10.0     10.0
2   lab3          30.0           NaN     30.0
3   lab4          40.0          40.0      0.0
4   lab5           NaN          50.0    -50.0

If you have 20 min time, here's a good read on how merging works: Pandas Merging 101

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM