
Impute missing values in subset of mixed-type DataFrame using a mask

Consider the following code, which creates two pandas DataFrames, a and b:

import pandas as pd
import numpy as np
a = pd.DataFrame(
    [
        ['X', 1, np.nan, 3],
        ['X', 4, 5, 6],
        ['Y', 7, 8, 9]
    ],
    columns = ["Group", "A", "B", "C"]
)

b = pd.DataFrame(
    [
        ['X', 1, 2, 3],
        ['X', 4, 5, np.nan],
        ['X', 7, 8, 9]
    ],
    columns = ["Group", "A", "B", "C"]
)

In both DataFrames, I would like to replace the value at every location in columns ["A", "B", "C"] where either DataFrame has NaN. That is, I would like to use the following mask:

missing_vals = pd.isnull(a) | pd.isnull(b)
print(missing_vals)
#   Group      A      B      C
#0  False  False   True  False
#1  False  False  False   True
#2  False  False  False  False

I tried:

replacement_value = -1
a[missing_vals] = replacement_value

but that resulted in:

TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

I've also tried accessing just the desired columns using a[missing_vals.loc[:, ["A", "B", "C"]]], but that produced an error as well.

The desired outputs are:

print(a)
#  Group  A   B   C
#0     X  1  -1   3
#1     X  4   5  -1
#2     Y  7   8   9

print(b)
#  Group  A   B   C
#0     X  1  -1   3
#1     X  4   5  -1
#2     X  7   8   9

Notice that the values at row 0, column "B" and at row 1, column "C" have been replaced with replacement_value in both DataFrames.

You can use DataFrame.mask:

s=(a.isnull())|(b.isnull())
s
Out[297]: 
   Group      A      B      C
0  False  False   True  False
1  False  False  False   True
2  False  False  False  False

a.mask(s,-1)
Out[299]: 
  Group  A    B  C
0     X  1 -1.0  3
1     X  4  5.0 -1
2     Y  7  8.0  9
b.mask(s,-1)
Out[300]: 
  Group  A  B    C
0     X  1 -1  3.0
1     X  4  5 -1.0
2     X  7  8  9.0
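
Note that mask returns a new DataFrame rather than modifying a or b, and that a column which held NaN stays float (hence the -1.0 above). The following is a minimal sketch, assuming you want the result stored back in a and b with integer columns as in the desired output. The names cols and fill_masked are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(
    [["X", 1, np.nan, 3], ["X", 4, 5, 6], ["Y", 7, 8, 9]],
    columns=["Group", "A", "B", "C"],
)
b = pd.DataFrame(
    [["X", 1, 2, 3], ["X", 4, 5, np.nan], ["X", 7, 8, 9]],
    columns=["Group", "A", "B", "C"],
)

cols = ["A", "B", "C"]  # illustrative: the numeric columns to impute

# Build the mask on the numeric columns only, so "Group" is never touched.
s = a[cols].isnull() | b[cols].isnull()


def fill_masked(df, mask, value=-1):
    """Return a copy of df with `value` wherever `mask` is True.

    The cast to int is safe here only because the mask covers every NaN
    in the selected columns; it would raise if any NaN were left over.
    """
    out = df.copy()
    out[mask.columns] = out[mask.columns].mask(mask, value).astype(int)
    return out


a = fill_masked(a, s)
b = fill_masked(b, s)
print(a)
print(b)
```

Restricting the mask to cols keeps the string column out of the operation entirely, which is one way to sidestep the mixed-type problem from the question.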

Alternatively, use np.where and rebuild the DataFrame:

m = a.isnull() | b.isnull()
pd.DataFrame(np.where(m, -1, a), columns=a.columns)

  Group  A   B   C
0     X  1  -1   3
1     X  4   5  -1
2     Y  7   8   9
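
One caveat with this approach: calling np.where on the whole mixed-type frame goes through an object array, so every column of the rebuilt DataFrame has object dtype. A hedged sketch of a variant that applies np.where to the numeric columns only and leaves "Group" as it is; the names cols, a_filled and b_filled are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(
    [["X", 1, np.nan, 3], ["X", 4, 5, 6], ["Y", 7, 8, 9]],
    columns=["Group", "A", "B", "C"],
)
b = pd.DataFrame(
    [["X", 1, 2, 3], ["X", 4, 5, np.nan], ["X", 7, 8, 9]],
    columns=["Group", "A", "B", "C"],
)

cols = ["A", "B", "C"]  # illustrative: the numeric columns to impute

# Boolean NumPy array: True where either frame has NaN at that position.
m = (a[cols].isnull() | b[cols].isnull()).to_numpy()

a_filled = a.copy()
b_filled = b.copy()

# np.where works on the numeric block only; every NaN is covered by m,
# so the cast to int cannot hit a missing value in this example.
a_filled[cols] = np.where(m, -1, a[cols].to_numpy()).astype(int)
b_filled[cols] = np.where(m, -1, b[cols].to_numpy()).astype(int)

print(a_filled)
print(b_filled)
```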
