I have two dataframes, df1 and df2. df1 is shown here:
age
0 42
1 52
2 36
3 24
4 73
df2 is shown here:
age
0 0
1 0
2 1
3 0
4 0
I want to replace all the zeros in df2 with their corresponding entries in df1. That is, if the element at a given index in df2 is zero, I want it replaced by the element at the same index in df1.
Hence, I want df2 to look like:
age
0 42
1 52
2 1
3 24
4 73
I tried using the replace method but it is not working. Please help :) Thanks in advance.
You could use where:
In [19]: df2.where(df2 != 0, df1)
Out[19]:
age
0 42
1 52
2 1
3 24
4 73
Above, df2 != 0 is a boolean DataFrame.
In [16]: df2 != 0
Out[16]:
age
0 False
1 False
2 True
3 False
4 False
df2.where(df2 != 0, df1) returns a new DataFrame. Where df2 != 0 is True, the corresponding value of df2 is used. Where it is False, the corresponding value of df1 is used.
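For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. It assumes df1 and df2 are built exactly as shown in the question; the name result is just illustrative.

```python
import pandas as pd

# The two frames from the question
df1 = pd.DataFrame({"age": [42, 52, 36, 24, 73]})
df2 = pd.DataFrame({"age": [0, 0, 1, 0, 0]})

# Keep df2's value where the condition is True, otherwise take df1's value.
# where() returns a new DataFrame; df2 itself is left untouched.
result = df2.where(df2 != 0, df1)

print(result)  # rows 0, 1, 3, 4 hold df1's values; row 2 keeps its 1
print(df2)     # still the original zeros and the single 1
```

Since where does not modify df2, assign the result back (df2 = df2.where(df2 != 0, df1)) if you want df2 itself to change.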
Another alternative is to make an assignment with df.loc:
df2.loc[df2['age'] == 0, 'age'] = df1['age']
df.loc[mask, col] selects rows of df where the boolean Series, mask, is True, and where the column label is col.
In [17]: df2.loc[df2['age'] == 0, 'age']
Out[17]:
0 0
1 0
3 0
4 0
Name: age, dtype: int64
When used in an assignment, such as df2.loc[df2['age'] == 0, 'age'] = df1['age'], Pandas performs automatic index label alignment. (Notice the index labels above are 0, 1, 3, 4, with 2 being skipped.) So the values in df2.loc[df2['age'] == 0, 'age'] are replaced by the corresponding values from df1['age']. Even though df1['age'] is a Series with index labels 0, 1, 2, 3, and 4, the 2 is ignored because there is no corresponding index label on the left-hand side.
In other words,
df2.loc[df2['age'] == 0, 'age'] = df1.loc[df2['age'] == 0, 'age']
would work as well, but the added restriction on the right-hand side is unnecessary.
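To see the label alignment in action, here is a small runnable sketch. It assumes the same df1 and df2 as in the question; the names zero_rows, df2_b and zero_rows_b are made up for illustration. The second assignment deliberately reverses the right-hand side to show that values are matched by index label, not by position.

```python
import pandas as pd

df1 = pd.DataFrame({"age": [42, 52, 36, 24, 73]})
df2 = pd.DataFrame({"age": [0, 0, 1, 0, 0]})
df2_b = df2.copy()  # second copy for the reversed-order demonstration

# In-place assignment: only the rows where age == 0 are written to.
zero_rows = df2["age"] == 0
df2.loc[zero_rows, "age"] = df1["age"]

# Same assignment, but with the right-hand side in reverse row order.
# The labels (4, 3, 2, 1, 0) are still matched up, so the result is the same.
zero_rows_b = df2_b["age"] == 0
df2_b.loc[zero_rows_b, "age"] = df1["age"].iloc[::-1]

print(df2)
print(df2_b)
```

Unlike where, this modifies df2 in place, so work on a copy if you still need the original.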
In [30]: df2.mask(df2==0).combine_first(df1)
Out[30]:
age
0 42.0
1 52.0
2 1.0
3 24.0
4 73.0
or, "negating" @unutbu's beautiful solution:
In [46]: df2.mask(df2==0, df1)
Out[46]:
age
0 42
1 52
2 1
3 24
4 73
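A note on the float output of the combine_first variant: mask(df2==0) with no replacement value puts NaN where the zeros were, and that forces the column to a float dtype. The sketch below, assuming the same df1 and df2 as in the question, shows the intermediate step and one way to get integers back. The names masked, filled and restored are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"age": [42, 52, 36, 24, 73]})
df2 = pd.DataFrame({"age": [0, 0, 1, 0, 0]})

# Step 1: zeros become NaN, so the column can no longer be an integer dtype.
masked = df2.mask(df2 == 0)
print(masked)

# Step 2: the NaN holes are filled from df1 at the same labels.
filled = masked.combine_first(df1)

# Step 3: no NaN remains, so casting back to int is safe here.
restored = filled.astype(int)
print(restored)
```

Passing df1 directly as the replacement, as in df2.mask(df2==0, df1), avoids the NaN step altogether, which is why that version prints integers.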
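Or try mul (note this relies on df2 containing only 0s and 1s and on df1 containing no zeros, and it needs numpy imported as np):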
df1.mul(np.where(df2==1,0,1)).replace({0:1})