Replace np.nan in dataframe with minimum from other series

This is an easy one, I'm sure, but I cannot get the syntax for df.loc right.

import pandas as pd
import numpy as np

d = { 'data' : [4, 2, 7, np.nan, 7, 6, 5, np.nan, 6, 3, np.nan, 2], 
 'a' : [4, 2, 7, 9, 7, 6, 5, 4, 6, 3, np.nan, 2], 
 'b' : [4, 2, 7, 11, 7, 6, 5, 2, 6, 3, 3, 2]}

df2 = pd.DataFrame(d)

# My attempt (this does not do what I want):
df2.loc[df2.data == np.nan], min(['a', 'b'])

print(df2)

I want to replace every np.nan in 'data' with the minimum of the values in columns 'a' and 'b' for that row. Note that sometimes one of those values will also be missing (np.nan).

Result should be:

     a   b  data
0    4   4     4
1    2   2     2
2    7   7     7
3    9  11     9
4    7   7     7
5    6   6     6
6    5   5     5
7    4   2     2
8    6   6     6
9    3   3     3
10 NaN   3     3
11   2   2     2

You can use fillna() with the result of min():

df2['data'] = df2['data'].fillna(df2[['a', 'b']].min(axis=1))
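
Since the question asked specifically about df.loc, here is a minimal sketch of the same replacement done with a boolean mask. The names row_min and mask are illustrative and not part of the original post. It relies on two facts: np.nan never compares equal to anything (including itself), so the mask has to come from isna(), and DataFrame.min skips NaN by default, which handles the row where 'a' is missing.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'data': [4, 2, 7, np.nan, 7, 6, 5, np.nan, 6, 3, np.nan, 2],
    'a': [4, 2, 7, 9, 7, 6, 5, 4, 6, 3, np.nan, 2],
    'b': [4, 2, 7, 11, 7, 6, 5, 2, 6, 3, 3, 2],
})

# Row-wise minimum of 'a' and 'b'; NaN values are skipped by default.
row_min = df2[['a', 'b']].min(axis=1)

# `df2.data == np.nan` is always False, so build the mask with isna().
mask = df2['data'].isna()

# Assign only into the rows where 'data' is missing.
df2.loc[mask, 'data'] = row_min[mask]

print(df2)
```

If both 'a' and 'b' are missing in a row, the row-wise minimum is itself NaN, so 'data' would stay NaN for that row.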
