Consider the following code to create two pandas DataFrames, a and b:
import pandas as pd
import numpy as np

a = pd.DataFrame(
    [
        ['X', 1, np.nan, 3],
        ['X', 4, 5, 6],
        ['Y', 7, 8, 9]
    ],
    columns=["Group", "A", "B", "C"]
)
b = pd.DataFrame(
    [
        ['X', 1, 2, 3],
        ['X', 4, 5, np.nan],
        ['X', 7, 8, 9]
    ],
    columns=["Group", "A", "B", "C"]
)
I would like to replace the values in columns ["A", "B", "C"] of both DataFrames wherever that location is NaN in either DataFrame. That is, I would like to use the following mask:
missing_vals = pd.isnull(a) | pd.isnull(b)
print(missing_vals)
#    Group      A      B      C
# 0  False  False   True  False
# 1  False  False  False   True
# 2  False  False  False  False
I tried:
replacement_value = -1
a[missing_vals] = replacement_value
but that resulted in:
TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value
I've also tried accessing just the desired columns using a[missing_vals.loc[:, ["A", "B", "C"]]], but that produced an error as well.
The desired outputs are:
print(a)
#   Group  A   B   C
# 0     X  1  -1   3
# 1     X  4   5  -1
# 2     Y  7   8   9
print(b)
#   Group  A   B   C
# 0     X  1  -1   3
# 1     X  4   5  -1
# 2     X  7   8   9
Notice that the values at row 0, column "B" and at row 1, column "C" have been replaced with replacement_value in both DataFrames.
You can use DataFrame.mask:
s = a.isnull() | b.isnull()
s
Out[297]:
   Group      A      B      C
0  False  False   True  False
1  False  False  False   True
2  False  False  False  False
a.mask(s, -1)
Out[299]:
  Group  A    B  C
0     X  1 -1.0  3
1     X  4  5.0 -1
2     Y  7  8.0  9
b.mask(s, -1)
Out[300]:
  Group  A  B    C
0     X  1 -1  3.0
1     X  4  5 -1.0
2     X  7  8  9.0
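Note that mask returns a new DataFrame rather than modifying a or b, so the result has to be assigned back if you want the originals updated. A minimal sketch of that, assuming you only want to touch columns "A", "B" and "C" (the name cols is introduced here for illustration; the other names come from the question):

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(
    [["X", 1, np.nan, 3], ["X", 4, 5, 6], ["Y", 7, 8, 9]],
    columns=["Group", "A", "B", "C"],
)
b = pd.DataFrame(
    [["X", 1, 2, 3], ["X", 4, 5, np.nan], ["X", 7, 8, 9]],
    columns=["Group", "A", "B", "C"],
)

cols = ["A", "B", "C"]  # illustrative name: the columns to clean
replacement_value = -1

# True wherever either frame has a missing value in the selected columns.
missing_vals = a[cols].isnull() | b[cols].isnull()

# mask() returns a new object, so assign the result back to each frame.
a[cols] = a[cols].mask(missing_vals, replacement_value)
b[cols] = b[cols].mask(missing_vals, replacement_value)

print(a)
print(b)
```

Building the mask from the selected columns only means the string column "Group" is never involved in the replacement.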
Alternatively, use np.where:
m = a.isnull() | b.isnull()
pd.DataFrame(np.where(m, -1, a), columns=a.columns)
  Group  A   B   C
0     X  1  -1   3
1     X  4   5  -1
2     Y  7   8   9
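One thing to watch with the np.where approach: because a mixes strings and numbers, the array it produces has object dtype, and so do the columns of the rebuilt DataFrame. A possible variant, sketched here under the assumption that only "A", "B" and "C" need cleaning, applies np.where to the numeric columns alone and writes into copies (cols, a_filled and b_filled are illustrative names, not from the original answer):

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(
    [["X", 1, np.nan, 3], ["X", 4, 5, 6], ["Y", 7, 8, 9]],
    columns=["Group", "A", "B", "C"],
)
b = pd.DataFrame(
    [["X", 1, 2, 3], ["X", 4, 5, np.nan], ["X", 7, 8, 9]],
    columns=["Group", "A", "B", "C"],
)

cols = ["A", "B", "C"]  # illustrative name: the numeric columns to clean

# Boolean array: True wherever either frame is missing a value.
m = (a[cols].isnull() | b[cols].isnull()).to_numpy()

# Work on copies so the original frames are left untouched.
a_filled = a.copy()
b_filled = b.copy()

# np.where picks -1 where m is True and the original value elsewhere.
a_filled[cols] = np.where(m, -1, a[cols].to_numpy())
b_filled[cols] = np.where(m, -1, b[cols].to_numpy())

print(a_filled)
print(b_filled)
```

The selected columns stay numeric this way, although they come back as floats because NaN forces the underlying array to a float dtype.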