Calculating difference of two data frames for each column of first data frame

Question

Assume I have two dataframes:

df1: 4 columns, n lines

df2: 50 columns, n lines

What is the best way to calculate the difference between each column of df1 and every column of df2?

My only idea so far is to merge the tables and create the 4*50 new difference columns in a loop. But there has to be a better way, right?

Thanks in advance! Paul

Answer 1

To illustrate, I created two small example dataframes:

Input Dataframes

import pandas as pd

df1 = pd.DataFrame({"a": [1, 1, 1],
                    "b": [2, 2, 2]})

df2 = pd.DataFrame({"aa": [10, 10, 10],
                    "bb": [20, 20, 20],
                    "cc": [30, 30, 30],
                    "dd": [40, 40, 40],
                    "ee": [50, 50, 50]})
print(df1)

    a   b
0   1   2
1   1   2
2   1   2

print(df2)

    aa  bb  cc  dd  ee
0   10  20  30  40  50
1   10  20  30  40  50
2   10  20  30  40  50

Solution

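# For each df1 column, subtract it row-wise from every df2 column,
# then join the resulting blocks side by side
df = pd.concat([df2.sub(df1[i], axis=0) for i in df1.columns], axis=1)
# Relabel the df1.shape[1] * df2.shape[1] result columns as 0, 1, 2, ...
df.columns = list(range(df1.shape[1] * df2.shape[1]))
df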

Result

    0   1   2   3   4   5   6   7   8    9
0   9   19  29  39  49  8   18  28  38  48
1   9   19  29  39  49  8   18  28  38  48
2   9   19  29  39  49  8   18  28  38  48
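
The integer labels above no longer show which pair of columns produced each difference. As a possible variation (a sketch, not part of the original answer), passing a dict to pd.concat keeps the df1 column name as an outer column level, which can then be flattened into labels. The "df2col-df1col" naming scheme and the name diff are assumptions chosen for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 1, 1], "b": [2, 2, 2]})
df2 = pd.DataFrame({"aa": [10, 10, 10], "bb": [20, 20, 20], "cc": [30, 30, 30],
                    "dd": [40, 40, 40], "ee": [50, 50, 50]})

# Dict keys become the outer level of a column MultiIndex,
# so each block of differences remembers its df1 column.
diff = pd.concat({col: df2.sub(df1[col], axis=0) for col in df1.columns}, axis=1)

# Flatten the (df1 column, df2 column) pairs into single labels.
# The "df2col-df1col" format is only an illustrative choice.
diff.columns = [f"{c2}-{c1}" for c1, c2 in diff.columns]

print(diff)
```

With this labelling, a single difference can be looked up by name, for example diff["cc-b"] for df2["cc"] minus df1["b"].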
