Python - Pandas filter and group by
Compare each row's cluster-2 value with the first Series.mode of its cluster-1 group, keep the rows that match, and append the remaining rows with bin assigned to cluster-1:
print(df)
  file  cluster-1  cluster-2
0    A          1          2
1    D          1          2
2    G          2          4
3    B          3          1
4    E          3          2
5    J          3          1
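import pandas as pd

# True where a row's cluster-2 equals the first mode of its cluster-1 group
m = (df.groupby('cluster-1')['cluster-2']
       .transform(lambda x: x.mode().iat[0])
       .eq(df['cluster-2']))

# Keep matching rows and relabel the rest as 'bin'
# (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
df = (pd.concat([df[m], df[~m].assign(**{'cluster-1': 'bin'})], ignore_index=True)
        .rename(columns={'cluster-1': 'cluster'})
        .drop('cluster-2', axis=1))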
print(df)
  file cluster
0    A       1
1    D       1
2    G       2
3    B       3
4    J       3
5    E     bin
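
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the printed sample. The helper name bin_minority_rows and its parameters are hypothetical, introduced only for illustration. It assumes every group has at least one non-null cluster-2 value, since mode() would otherwise be empty and .iat[0] would raise.

```python
import pandas as pd

# Sample data rebuilt by hand from the printed frame in the answer
df = pd.DataFrame({
    'file': list('ADGBEJ'),
    'cluster-1': [1, 1, 2, 3, 3, 3],
    'cluster-2': [2, 2, 4, 1, 2, 1],
})


def bin_minority_rows(frame, by='cluster-1', value='cluster-2', label='bin'):
    """Hypothetical helper: keep rows whose `value` equals the first mode
    of their `by` group and relabel all other rows with `label`."""
    # First mode of `value` per group, broadcast back to every row
    first_mode = frame.groupby(by)[value].transform(lambda s: s.mode().iat[0])
    mask = frame[value].eq(first_mode)

    kept = frame.loc[mask]
    binned = frame.loc[~mask].assign(**{by: label})

    # Matching rows first, relabelled rows last, with a fresh index
    out = pd.concat([kept, binned], ignore_index=True)
    return out.rename(columns={by: 'cluster'}).drop(columns=value)


result = bin_minority_rows(df)
print(result)
```

Series.mode returns its values sorted, so when a group has a tie, .iat[0] picks the smallest of the tied values. Because the cluster column ends up holding both integers and the string bin, it is stored with object dtype.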