assumed I have two dataframes:
df1: 4 columns, n lines
df2: 50 columns, n lines
what is the best way to calculate the difference of each column of df1 to all columns of df2?
My only idea up to now is to merge the tables and create 4*50 new columns with the differences, as a loop. But there has to be a better way, right?
Thanks already! Paul
For this I have created 2 fictive dataframes:
Input Dataframes
df1 = pd.DataFrame({"a":[1,1,1],
"b":[2,2,2],
})
df2 = pd.DataFrame({"aa":[10,10,10],
"bb":[20,20,20],
"cc":[30,30,30],
"dd":[40,40,40],
"ee":[50,50,50]
})
print(df1)
a b
0 1 2
1 1 2
2 1 2
print(df2)
aa bb cc dd ee
0 10 20 30 40 50
1 10 20 30 40 50
2 10 20 30 40 50
Solution
df = pd.concat([df2.sub(df1[i], axis=0) for i in df1.columns],axis =1)
df.columns= [i for i in range(df1.shape[1]*df2.shape[1])]
df
Result
0 1 2 3 4 5 6 7 8 9
0 9 19 29 39 49 8 18 28 38 48
1 9 19 29 39 49 8 18 28 38 48
2 9 19 29 39 49 8 18 28 38 48
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.