How to subtract data frames with different fill values

I need to subtract two DataFrames with different indexes, which produces NaN wherever a row label exists in only one of them. I want to replace the missing values from each DataFrame with a different number (fill value). For example, say I have df1 and df2:

df1:

    A   B   C
0   0   3   0
1   0   0   4
2   4   0   2

df2:

    A   B   C
0   0   3   0
1   1   2   0
3   1   2   0

subtracted = df1.sub(df2) :

    A   B   C
0   0   0   0
1   -1  -2  4
2   NaN NaN NaN
3   NaN NaN NaN

I want the row with index 2 of subtracted to keep the values from row 2 of df1 (the row missing from df2 counts as 0), and the row with index 3 of subtracted (missing from df1) to have the value 5.

I expect:

subtracted:

    A   B   C
0   0   0   0
1   -1  -2  4
2   4   0   2
3   5   5   5

I tried the sub method with fill_value=5, but then the same fill value is used for both DataFrames, so neither row 2 nor row 3 comes out as I want.

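One way is to reindex df2 with fill_value=0 before subtracting, then subtract and fill the remaining NaN values with 5:

import pandas as pd

# Full range of row labels covering both frames (set union of the two indexes)
ix = pd.RangeIndex(df1.index.union(df2.index).max() + 1)
df1.sub(df2.reindex(ix, fill_value=0)).fillna(5).astype(df1.dtypes)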

   A  B  C
0  0  0  0
1 -1 -2  4
2  4  0  2
3  5  5  5
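The same steps can be wrapped in a small helper so the two fill values are explicit. This is only a sketch: the function name sub_with_fills and its parameter names are made up for illustration, and it assumes a non-negative integer index, as in the question.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [0, 0, 4], "B": [3, 0, 0], "C": [0, 4, 2]}, index=[0, 1, 2])
df2 = pd.DataFrame({"A": [0, 1, 1], "B": [3, 2, 2], "C": [0, 0, 0]}, index=[0, 1, 3])

def sub_with_fills(left, right, right_fill=0, missing_left_fill=5):
    """Hypothetical helper: left - right with a separate fill value per side."""
    # Contiguous range of labels from 0 up to the largest label in either frame
    ix = pd.RangeIndex(int(left.index.union(right.index).max()) + 1)
    # Rows missing from `right` are filled before subtracting
    aligned_right = right.reindex(ix, fill_value=right_fill)
    # Rows missing from `left` come out as NaN and are filled afterwards
    diff = left.sub(aligned_right)
    return diff.fillna(missing_left_fill).astype(left.dtypes)

subtracted = sub_with_fills(df1, df2)
print(subtracted)
```

Keeping the two fills as separate arguments makes it clear that one is applied before the subtraction and the other after it.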

We have to reindex here to get aligned indices, so that we can use the sub method:

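import numpy as np

# Span the labels of both frames, not only those of df2
idxmin = min(df1.index.min(), df2.index.min())
idxmax = max(df1.index.max(), df2.index.max())

idx = np.arange(idxmin, idxmax+1)

df1.reindex(idx).sub(df2.reindex(idx).fillna(0)).fillna(5)

     A    B    C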
0  0.0  0.0  0.0
1 -1.0 -2.0  4.0
2  4.0  0.0  2.0
3  5.0  5.0  5.0
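If the index is not a contiguous integer range, np.arange does not apply, but the same idea should work with the union of the two indexes. A minimal sketch, assuming string labels ("w" to "z" are made up for illustration) and whole-number data so the final cast back to int is safe:

```python
import pandas as pd

# Same values as in the question, but with hypothetical string labels
df1 = pd.DataFrame({"A": [0, 0, 4], "B": [3, 0, 0], "C": [0, 4, 2]}, index=["w", "x", "y"])
df2 = pd.DataFrame({"A": [0, 1, 1], "B": [3, 2, 2], "C": [0, 0, 0]}, index=["w", "x", "z"])

# Every label that appears in either frame
idx = df1.index.union(df2.index)

result = (
    df1.reindex(idx)                        # rows missing from df1 become NaN
       .sub(df2.reindex(idx, fill_value=0)) # rows missing from df2 count as 0
       .fillna(5)                           # NaN left over from df1 becomes 5
       .astype(int)                         # reindexing made the values float
)
print(result)
```

Note that fillna(5) would also replace any NaN that was already present in the data, so this assumes the inputs have no missing values of their own.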

I found the combine_first method that almost satisfies my needs:

df2.combine_first(df1).sub(df2, fill_value=0)

but it still produces only:

    A   B   C
0   0   0   0
1   0   0   0
2   4   0   2
3   0   0   0
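The zeros appear because combine_first fills rows 0, 1 and 3 with df2's own values, so subtracting df2 cancels them out. A possible alternative in the same spirit, sketched here under the assumption that rows absent from df1 should simply be overwritten with 5, is to subtract with fill_value=0 and then patch those rows:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [0, 0, 4], "B": [3, 0, 0], "C": [0, 4, 2]}, index=[0, 1, 2])
df2 = pd.DataFrame({"A": [0, 1, 1], "B": [3, 2, 2], "C": [0, 0, 0]}, index=[0, 1, 3])

# Subtract, treating a row missing from either frame as 0 for now
result = df1.sub(df2, fill_value=0)

# Labels present in the result but absent from df1
missing_in_df1 = result.index.difference(df1.index)

# Overwrite those rows with the constant 5
result.loc[missing_in_df1] = 5

# The alignment step produced floats; the values are whole, so cast back
result = result.astype(int)
print(result)
```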
