How to fill missing numeric values in pandas based on another column
I need to impute a numeric column in pandas.
Sample data:
Age Time_of_service
42 4
24 5
nan 27
26 4
31 5
54 21
21 2
nan 32
45 18
19 0
65 35
nan 3
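The Age and Time_of_service columns are highly correlated. I need to impute the missing Age values according to the following conditions:
Time_of_service > 30
age = 60
Time_of_service in (20, 30)
age = 45
Time_of_service in (10, 20)
age = 35
Time_of_service in (0, 10)
age = 25
How can I impute the missing values in Python based on these conditions?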
Use cut to bin Time_of_service, then convert the output to integers and use Series.fillna to replace the missing values in the Age column:
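import numpy as np
import pandas as pd

# Bin edges; each bin includes its right edge: (0, 10], (10, 20], (20, 30], (30, inf]
bins = [0, 10, 20, 30, np.inf]
labels = [25, 35, 45, 60]
# include_lowest=True also places Time_of_service == 0 in the first bin
new = pd.cut(df['Time_of_service'], bins=bins, labels=labels, include_lowest=True)
df['Age'] = df['Age'].fillna(new.astype(int))
print(df)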
Age Time_of_service
0 42.0 4
1 24.0 5
2 45.0 27
3 26.0 4
4 31.0 5
5 54.0 21
6 21.0 2
7 60.0 32
8 45.0 18
9 19.0 0
10 65.0 35
11 25.0 3
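For anyone who wants to reproduce the result end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question as a DataFrame (the construction of df and the variable names estimate and filled are my own assumptions, since the question only shows the printed table) and then applies the same cut plus fillna approach. Note that pd.cut closes bins on the right by default, so a Time_of_service of exactly 30 would get age 45 rather than 60; whether that matches the intended boundaries is not stated in the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question; np.nan marks the missing ages.
df = pd.DataFrame({
    "Age": [42, 24, np.nan, 26, 31, 54, 21, np.nan, 45, 19, 65, np.nan],
    "Time_of_service": [4, 5, 27, 4, 5, 21, 2, 32, 18, 0, 35, 3],
})

# Right-closed bins: [0, 10], (10, 20], (20, 30], (30, inf]
bins = [0, 10, 20, 30, np.inf]
labels = [25, 35, 45, 60]

# Map every Time_of_service value to the age assigned to its bin.
estimate = pd.cut(df["Time_of_service"], bins=bins, labels=labels,
                  include_lowest=True)

# fillna aligns on the index, so only rows with a missing Age are changed.
df["Age"] = df["Age"].fillna(estimate.astype(int))

# Ages that were filled in for the three originally missing rows.
filled = df.loc[[2, 7, 11], "Age"].tolist()
print(filled)  # [45.0, 60.0, 25.0]
```

Existing ages are left untouched; the Age column stays float because it originally contained NaN.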