I have two DataFrames df1
(mxn) and df2
(mx1) as time series and I want to calculate the difference for each column between df1
and df2
which would look like df3
.
import pandas as pd
df1 = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
'01K W':[1.2, 0.4, 0.2, -0.4],
'02K W':[3.5, 3.2, 'nan', 'nan'],
'03K W':[-1, -2.3, 0.3, 2.4],
'04K W':[1.5, 2.6, 3.2, 4.2]})
df2 = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
'K W':[1, 1.5, 1.2, 0.8]})
df3 = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
'01K W':[0.2, 1.1, 1, 1.2],
'02K W':[2.5, 1.7, 'nan', 'nan'],
'03K W':[2, 3.8, 0.9, 1.6],
'04K W':[0.5, 1.1, 2, 3.4]})
Is there an easy way to build a difference column wise?
You can set Date
as index, and use .sub
method:
df1.set_index('Date').sub(df2.set_index('Date')['K W'], axis='rows')
Output:
01K W 02K W 03K W 04K W
Date
2021-01-01 0.2 2.5 -2.0 0.5
2021-01-02 -1.1 1.7 -3.8 1.1
2021-01-03 -1.0 NaN -0.9 2.0
2021-01-04 -1.2 NaN 1.6 3.4
Note : you might want to add astype(float)
after set_index('Date')
to correct your data type.
First you will need to use numeric values, not strings.
It looks like your 'Date' field represents your index. Pandas series can be added/subtracted element-wise based on their shared index so worth calling those out as an index. Then you can simply iterate through your df1 columns to apply df2 to each.
from numpy import nan
import pandas as pd
df1 = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
'01K W':[1.2, 0.4, 0.2, -0.4],
'02K W':[3.5, 3.2, nan, nan],
'03K W':[-1, -2.3, 0.3, 2.4],
'04K W':[1.5, 2.6, 3.2, 4.2]})
df2 = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
'K W':[1, 1.5, 1.2, 0.8]})
df1 = df1.set_index('Date')
df2 = df2.set_index('Date')
df3 = df1.copy()
for c in df1.columns:
df3[c] = df1[c] - df2['K W']
df3
Yields:
01K W 02K W 03K W 04K W
Date
2021-01-01 0.2 2.5 -2.0 0.5
2021-01-02 -1.1 1.7 -3.8 1.1
2021-01-03 -1.0 NaN -0.9 2.0
2021-01-04 -1.2 NaN 1.6 3.4
Another way to do:
df4 = df1[['01K W', '02K W', '03K W', '04K W']].astype(float).subtract(df2['K W'].astype(float), axis=0).abs().join(df1['Date'])[['Date','01K W', '02K W', '03K W', '04K W']]
print(df4)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.