How to filter the rows with multiple entries of MultiIndex level two?
I have a dataframe, df, with a MultiIndex.
df.columns
Index(['all', 'month', 'day', 'year'], dtype='object')
               all month  day  year
match
7     0   10/24/89    10   24    89
8     0     3/7/86     3    7    86
      1         10   NaN  NaN    10
9     0    4/10/71     4   10    71
10    0    5/11/85     5   11    85
      1         96   NaN  NaN    96
      2         26   NaN  NaN    26
11    0         10   NaN  NaN    10
      1    4/09/75     4   09    75
12    0    8/01/98     8   01    98
How can I select the rows with more than 1 entry at the MultiIndex level 2?
For example, here I need the rows where match is 8, 10 and 11.
You can use groupby.transform on the first level of the index with len, which gives every row the size of its group. Then use ge to get True where that length is greater than or equal to the value you want (here 2). The result is the boolean mask you need to select the rows.
print(df[df.groupby(level=0)['month'].transform(len).ge(2)])
                 0  month   day  year
match
8     0     3/7/86    3.0   7.0    86
      1         10    NaN   NaN    10
10    0    5/11/85    5.0  11.0    85
      1         96    NaN   NaN    96
      2         26    NaN   NaN    26
11    0         10    NaN   NaN    10
      1    4/09/75    4.0   9.0    75
Here I select the 'month' column after the groupby operation, but any column in your dataframe would work, because len counts every row in the group, including the NaN ones.
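To make the mask approach easier to try, here is a minimal, self-contained sketch. The frame is rebuilt by hand from the printout in the question, so the dtypes (floats for month and day, strings for all) and the unnamed second index level are assumptions rather than the asker's actual data.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame by hand (dtypes are assumed, not confirmed).
index = pd.MultiIndex.from_tuples(
    [(7, 0), (8, 0), (8, 1), (9, 0), (10, 0),
     (10, 1), (10, 2), (11, 0), (11, 1), (12, 0)],
    names=["match", None],
)
df = pd.DataFrame(
    {
        "all": ["10/24/89", "3/7/86", "10", "4/10/71", "5/11/85",
                "96", "26", "10", "4/09/75", "8/01/98"],
        "month": [10, 3, np.nan, 4, 5, np.nan, np.nan, np.nan, 4, 8],
        "day": [24, 7, np.nan, 10, 11, np.nan, np.nan, np.nan, 9, 1],
        "year": [89, 86, 10, 71, 85, 96, 26, 10, 75, 98],
    },
    index=index,
)

# Size of each first-level group, broadcast back to every row of that group.
group_sizes = df.groupby(level=0)["month"].transform(len)

# True for rows whose group has at least 2 entries.
mask = group_sizes.ge(2)

result = df[mask]
print(result)
```

Keeping the mask in its own variable makes it easy to inspect, or to combine with other conditions using & and |.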
You can also use groupby.filter and get the same result with:
print(df.groupby(level=0).filter(lambda x: len(x)>=2))
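As a quick sanity check, the two approaches can be compared side by side. The sketch below uses a reduced, hand-built version of the question's frame (only the year column, which is an assumption made for brevity), and the min_rows name is purely illustrative.

```python
import pandas as pd

# Reduced stand-in for the question's frame: same index, one column only.
index = pd.MultiIndex.from_tuples(
    [(7, 0), (8, 0), (8, 1), (9, 0), (10, 0),
     (10, 1), (10, 2), (11, 0), (11, 1), (12, 0)],
    names=["match", None],
)
df = pd.DataFrame(
    {"year": [89, 86, 10, 71, 85, 96, 26, 10, 75, 98]}, index=index
)

min_rows = 2  # hypothetical threshold name; keep groups with at least this many rows

# filter: the lambda receives each group's sub-frame and returns a single bool.
by_filter = df.groupby(level=0).filter(lambda g: len(g) >= min_rows)

# transform: build a row-wise mask from the broadcast group sizes.
by_mask = df[df.groupby(level=0)["year"].transform(len).ge(min_rows)]

# Both selections should contain the same rows in the same order.
same_rows = by_filter.index.tolist() == by_mask.index.tolist()
print(by_filter)
print(same_rows)
```

filter reads more directly, while the transform mask avoids calling a Python lambda that returns a frame-level decision and can be reused for other selections, so which one fits better depends on the rest of your code.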