Display missing values of specific column based on another specific column
Here is my problem.
Suppose I have two columns in a DataFrame, as shown below:
Type    | Killed
________|________
Dog       1
Dog       nan
Dog       nan
Cat       4
Cat       nan
Cow       1
Cow       nan
I want to find all the missing values in Killed and count them per Type.
My desired result looks like this:
Type    | Sum(isnull)
Dog       2
Cat       1
Cow       1
Is there any way to display this?
You can use boolean indexing with value_counts:
print (df.loc[df.Killed.isnull(), 'Type'].value_counts().reset_index(name='Sum(isnull)'))
  index  Sum(isnull)
0   Dog            2
1   Cow            1
2   Cat            1
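For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same boolean-indexing plus value_counts idea. The variable name counts and the rename_axis step are my own additions, not part of the original answer; rename_axis is there only so the first column is labelled Type regardless of the pandas version (older releases label it index).

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Type": ["Dog", "Dog", "Dog", "Cat", "Cat", "Cow", "Cow"],
    "Killed": [1, np.nan, np.nan, 4, np.nan, 1, np.nan],
})

# Select the Type of every row whose Killed value is missing,
# then count how often each Type occurs in that selection.
counts = (
    df.loc[df["Killed"].isnull(), "Type"]
    .value_counts()
    .rename_axis("Type")  # assumption: added for a stable column label across versions
    .reset_index(name="Sum(isnull)")
)
print(counts)
```

The order of types that tie on count (Cat and Cow here) may differ between pandas versions.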
Alternatively, aggregate with size, which seems faster:
print (df[df.Killed.isnull()]
         .groupby('Type')['Killed']
         .size()
         .reset_index(name='Sum(isnull)'))
  Type  Sum(isnull)
0  Cat            1
1  Cow            1
2  Dog            2
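Both approaches above filter the rows first, so a type with no missing values drops out of the result entirely. If you would rather see such a type with a count of 0, one possible variation (not from the original answer) is to group the boolean mask itself and sum it. In the sketch below the Pig row is a hypothetical addition to the question's data, included only to show the zero case.

```python
import numpy as np
import pandas as pd

# Sample data from the question, plus one hypothetical row ("Pig")
# that has no missing value, to illustrate the zero-count case.
df = pd.DataFrame({
    "Type": ["Dog", "Dog", "Dog", "Cat", "Cat", "Cow", "Cow", "Pig"],
    "Killed": [1, np.nan, np.nan, 4, np.nan, 1, np.nan, 3],
})

# isnull() gives True/False per row; summing booleans per group
# counts the True values, i.e. the missing entries for each Type.
missing_per_type = (
    df["Killed"].isnull()
    .groupby(df["Type"])
    .sum()
    .reset_index(name="Sum(isnull)")
)
print(missing_per_type)
```

Here every Type present in the frame appears in the output, sorted by group key, and Pig is reported with 0.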
Timings:
df = pd.concat([df]*1000).reset_index(drop=True)
In [30]: %timeit (df.ix[df.Killed.isnull(), 'Type'].value_counts().reset_index(name='Sum(isnull)'))
100 loops, best of 3: 5.36 ms per loop
In [31]: %timeit (df[df.Killed.isnull()].groupby('Type')['Killed'].size().reset_index(name='Sum(isnull)'))
100 loops, best of 3: 2.02 ms per loop