
How do I compare two columns in different DataFrames at the same time?

I have two DataFrames and want to compare two columns (Cat1 and Cat2) between them. Where both Cat1 and Cat2 match, I want to sum the values in the Prc column.

In the example below, the only rows that meet the criteria are rows 0 and 4 of df[0], which match rows 1 and 4 of df[1]. The sums would therefore be 200 for df[0] and 185 for df[1].

df[0]
  Cat1 Cat2 Cat3 Prc
0  11   0    5   100
1  22   2    9   150
2  33   1    8    50
3  44   2    6   200
4  55   1    8   100

df[1]
  Cat1 Cat2 Cat3 Prc
0  66   1    6   120
1  11   0    5    90
2  44   1    6   185
3  77   2    7   145
4  55   1    5    95   

I can compare Cat1 in df[0] against df[1] using .isin, but if that were all I did, I would also pick up row 3 of df[0], even though its Cat2 differs between df[0] and df[1].
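To make the pitfall concrete, here is a minimal sketch that rebuilds the sample tables above and matches on Cat1 only. The name `dfs` is a hypothetical stand-in for the list called df in the question.

```python
import pandas as pd

# Sample data copied from the tables above.
# `dfs` is a hypothetical name for the list the question calls df.
dfs = [
    pd.DataFrame({"Cat1": [11, 22, 33, 44, 55],
                  "Cat2": [0, 2, 1, 2, 1],
                  "Cat3": [5, 9, 8, 6, 8],
                  "Prc": [100, 150, 50, 200, 100]}),
    pd.DataFrame({"Cat1": [66, 11, 44, 77, 55],
                  "Cat2": [1, 0, 1, 2, 1],
                  "Cat3": [6, 5, 6, 7, 5],
                  "Prc": [120, 90, 185, 145, 95]}),
]

# Matching on Cat1 alone also selects row 3 (Cat1 == 44),
# even though its Cat2 is 2 in dfs[0] and 1 in dfs[1].
mask_cat1 = dfs[0]["Cat1"].isin(dfs[1]["Cat1"])
matched_rows = dfs[0].index[mask_cat1].tolist()
cat1_only_sum = int(dfs[0].loc[mask_cat1, "Prc"].sum())

print(matched_rows)   # [0, 3, 4]
print(cat1_only_sum)  # 400, not the wanted 200
```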

How do I compare two columns in different DataFrames at the same time?

These are large DataFrames of 500,000 rows x 32 columns each, so I want to avoid creating new DataFrames or new columns.

One idea is to use DataFrame.merge to get the intersection on multiple columns, then select the Prc columns with DataFrame.filter and sum them:

df1 = df[0].merge(df[1], on=['Cat1','Cat2'], suffixes=('_0','_1'))
print (df1)
   Cat1  Cat2  Cat3_0  Prc_0  Cat3_1  Prc_1
0    11     0       5    100       5     90
1    55     1       8    100       5     95

print (df1.filter(like='Prc').sum())
Prc_0    200
Prc_1    185
dtype: int64
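Since the frames in the question are large, it may help to merge only the columns that are actually needed. The following is a self-contained sketch of that variation, not part of the original answer. The names `left`, `right` and `cols` are hypothetical, and `validate="one_to_one"` assumes each (Cat1, Cat2) pair occurs at most once per frame; if pairs can repeat, merge would multiply the matching rows and the sums would be inflated.

```python
import pandas as pd

# Sample data from the question; `left` and `right` stand in for df[0] and df[1].
left = pd.DataFrame({"Cat1": [11, 22, 33, 44, 55],
                     "Cat2": [0, 2, 1, 2, 1],
                     "Cat3": [5, 9, 8, 6, 8],
                     "Prc": [100, 150, 50, 200, 100]})
right = pd.DataFrame({"Cat1": [66, 11, 44, 77, 55],
                      "Cat2": [1, 0, 1, 2, 1],
                      "Cat3": [6, 5, 6, 7, 5],
                      "Prc": [120, 90, 185, 145, 95]})

# Carry only the key columns and Prc into the merge, so the
# intermediate result has 4 columns instead of 2 x 32.
cols = ["Cat1", "Cat2", "Prc"]
merged = left[cols].merge(
    right[cols],
    on=["Cat1", "Cat2"],
    suffixes=("_0", "_1"),
    validate="one_to_one",  # assumption: key pairs are unique in each frame
)

totals = merged[["Prc_0", "Prc_1"]].sum()
print(totals)
```

With the sample data this should give 200 for Prc_0 and 185 for Prc_1, matching the output above.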

Another idea is to build a MultiIndex from the two columns with DataFrame.set_index, test membership with Index.isin, and filter by boolean indexing:

s1 = df[0].set_index(['Cat1','Cat2'])['Prc']
s2 = df[1].set_index(['Cat1','Cat2'])['Prc']

print (s1[s1.index.isin(s2.index)].sum())
200
print (s2[s2.index.isin(s1.index)].sum())
185
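A possible variation on the same idea, sketched here under the assumption that avoiding reindexed copies matters: build the MultiIndex directly from the two key columns with pd.MultiIndex.from_frame and use the resulting boolean mask on the original frame. The names `left`, `right` and `keys` are hypothetical and not from the original answer.

```python
import pandas as pd

# Sample data from the question; `left` and `right` stand in for df[0] and df[1].
left = pd.DataFrame({"Cat1": [11, 22, 33, 44, 55],
                     "Cat2": [0, 2, 1, 2, 1],
                     "Cat3": [5, 9, 8, 6, 8],
                     "Prc": [100, 150, 50, 200, 100]})
right = pd.DataFrame({"Cat1": [66, 11, 44, 77, 55],
                      "Cat2": [1, 0, 1, 2, 1],
                      "Cat3": [6, 5, 6, 7, 5],
                      "Prc": [120, 90, 185, 145, 95]})

keys = ["Cat1", "Cat2"]

# Build (Cat1, Cat2) pairs as MultiIndex objects without touching
# the frames' own indexes or adding columns.
idx_left = pd.MultiIndex.from_frame(left[keys])
idx_right = pd.MultiIndex.from_frame(right[keys])

# isin returns a boolean array aligned with the rows of each frame.
sum_left = int(left.loc[idx_left.isin(idx_right), "Prc"].sum())
sum_right = int(right.loc[idx_right.isin(idx_left), "Prc"].sum())

print(sum_left)   # 200
print(sum_right)  # 185
```

Unlike the merge approach, a membership test does not multiply rows when a (Cat1, Cat2) pair appears more than once; each row is simply kept or dropped.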
