Pandas: filling missing values with the mode raises an 'index out of bounds' error
Suppose I have the following DataFrame:
import numpy as np
import pandas as pd

Sample = pd.DataFrame({'Gender': ['Male','Male','Male','Male','Female','Female','Male','Male'],
                       'Married': ['No','Yes','Yes','Yes','No','No','Yes','Yes'],
                       'Dependents': ['1','1','1','0','3+','3+','1','1'],
                       'Education': ['Not Graduate','Graduate','Graduate','Graduate','Not Graduate','Not Graduate','Graduate','Graduate'],
                       'ApplicantIncome': [3596,3717,4166,2400,3333,6000,1234,4567],
                       'Credit_History': ['1',np.nan,'0','1',np.nan,'1',np.nan,'0']})
   ApplicantIncome  Credit_History  Dependents     Education  Gender  Married
0             3596               1           1  Not Graduate    Male       No
1             3717             NaN           1      Graduate    Male      Yes
2             4166               0           1      Graduate    Male      Yes
3             2400               1           0      Graduate    Male      Yes
4             3333             NaN          3+  Not Graduate  Female       No
5             6000               1          3+  Not Graduate  Female       No
6             1234             NaN           1      Graduate    Male      Yes
7             4567               0           1      Graduate    Male      Yes
I would like to fill each NaN in Credit_History with the mode of its ['Gender','Married','Dependents','Education'] group.
I wrote the code below:
Sample['Credit_History']=Sample.groupby(['Gender','Married','Dependents','Education']).transform(lambda x:
x.fillna(x.mode()[0]))['Credit_History']
It raised an out-of-bounds error:
IndexError: ('index out of bounds', 'occurred at index ApplicantIncome')
Any idea how to fix the code above? Thanks!
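One plausible reading of the error, offered as an assumption because the pandas version is not stated: `transform` applies the lambda to every non-grouping column, including ApplicantIncome, and `x.mode()[0]` fails whenever `mode()` comes back empty for a group (older pandas releases returned an empty result when no value occurred at least twice). A minimal sketch that selects only Credit_History and guards against an empty mode could look like this; `group_cols` and `fill_with_group_mode` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

Sample = pd.DataFrame({
    'Gender': ['Male', 'Male', 'Male', 'Male', 'Female', 'Female', 'Male', 'Male'],
    'Married': ['No', 'Yes', 'Yes', 'Yes', 'No', 'No', 'Yes', 'Yes'],
    'Dependents': ['1', '1', '1', '0', '3+', '3+', '1', '1'],
    'Education': ['Not Graduate', 'Graduate', 'Graduate', 'Graduate',
                  'Not Graduate', 'Not Graduate', 'Graduate', 'Graduate'],
    'ApplicantIncome': [3596, 3717, 4166, 2400, 3333, 6000, 1234, 4567],
    'Credit_History': ['1', np.nan, '0', '1', np.nan, '1', np.nan, '0'],
})

# Hypothetical helper names, not part of the original post.
group_cols = ['Gender', 'Married', 'Dependents', 'Education']

def fill_with_group_mode(s):
    """Fill NaN in one group's Series with that group's mode, if it has one."""
    modes = s.mode()          # NaN is ignored by default
    if modes.empty:           # nothing to fill with: leave the group unchanged
        return s
    return s.fillna(modes.iloc[0])  # on ties, take the first (smallest) mode

# Select the single column first so the function never sees ApplicantIncome.
Sample['Credit_History'] = (
    Sample.groupby(group_cols)['Credit_History'].transform(fill_with_group_mode)
)
print(Sample['Credit_History'].tolist())
```

Groups whose Credit_History values are all missing stay NaN under this sketch; a second pass with the overall mode could fill those if that is acceptable for the data.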
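You can achieve this with a single line. mode() returns a Series, so take its first element before passing it to fillna, and assign the result back. Note that this uses the overall mode of the column rather than a per-group mode:
Sample['Credit_History'] = Sample['Credit_History'].fillna(Sample['Credit_History'].mode()[0])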
Don't forget to import numpy.
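A detail worth knowing about mode-based fills: `Series.mode()` returns a Series, and `fillna` aligns a Series argument by index label, so passing the mode Series itself fills only the rows whose labels appear in it. The short sketch below contrasts that with passing a scalar; the variable names are illustrative and the data is the Credit_History column from the question.

```python
import numpy as np
import pandas as pd

s = pd.Series(['1', np.nan, '0', '1', np.nan, '1', np.nan, '0'],
              name='Credit_History')

mode_series = s.mode()            # a Series holding the most frequent value(s)

# Passing the Series: fillna aligns on index labels, so only a NaN at a
# label present in mode_series (here just label 0) could be filled.
aligned_fill = s.fillna(mode_series)

# Passing a scalar: every NaN is replaced by the first mode.
scalar_fill = s.fillna(mode_series.iloc[0])

print(aligned_fill.isna().sum())  # missing values left after the aligned fill
print(scalar_fill.isna().sum())   # missing values left after the scalar fill
```

Neither call modifies `s` in place, so the result needs to be assigned back to the column.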