
How to filter the rows with multiple entries at MultiIndex level two?

I have a DataFrame, df, with a MultiIndex.

df.columns
Index(['all', 'month', 'day', 'year'], dtype='object')
          all       month  day  year
    match
7   0     10/24/89  10     24   89
8   0     3/7/86    3      7    86
    1     10        NaN    NaN  10
9   0     4/10/71   4      10   71
10  0     5/11/85   5      11   85
    1     96        NaN    NaN  96
    2     26        NaN    NaN  26
11  0     10        NaN    NaN  10
    1     4/09/75   4      09   75
12  0     8/01/98   8      01   98

How can I select the rows that have more than one entry at the second MultiIndex level?

For example, here I need the rows for 8, 10 and 11.

You can use groupby.transform on the first level of the index with len, which gives every row the size of its group. Then use ge to get True wherever that size is greater than or equal to the value you want (here 2). The result is a boolean mask that you can use to select the rows.

print(df[df.groupby(level=0)['month'].transform(len).ge(2)])
                0  month   day  year
   match                            
8  0       3/7/86    3.0   7.0    86
   1           10    NaN   NaN    10
10 0      5/11/85    5.0  11.0    85
   1           96    NaN   NaN    96
   2           26    NaN   NaN    26
11 0           10    NaN   NaN    10
   1      4/09/75    4.0   9.0    75

Here I use 'month' as the column after the groupby operation, but any column in your DataFrame would work, because len counts every row in the group, including rows holding NaN.
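For anyone who wants to try this without the original data, here is a minimal sketch that rebuilds a frame like the one in the question. The construction is an assumption based on the printed output (an unnamed first index level, a second level named match, numeric month/day/year values); the mask itself is the same expression as above.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame from its printed output.
index = pd.MultiIndex.from_tuples(
    [(7, 0), (8, 0), (8, 1), (9, 0), (10, 0), (10, 1), (10, 2),
     (11, 0), (11, 1), (12, 0)],
    names=[None, "match"],
)
df = pd.DataFrame(
    {
        "all": ["10/24/89", "3/7/86", "10", "4/10/71", "5/11/85",
                "96", "26", "10", "4/09/75", "8/01/98"],
        "month": [10, 3, np.nan, 4, 5, np.nan, np.nan, np.nan, 4, 8],
        "day": [24, 7, np.nan, 10, 11, np.nan, np.nan, np.nan, 9, 1],
        "year": [89, 86, 10, 71, 85, 96, 26, 10, 75, 98],
    },
    index=index,
)

# len is applied per first-level group and broadcast back to each row,
# so NaN rows are counted too.
group_sizes = df.groupby(level=0)["month"].transform(len)

# Keep only rows whose first-level group has at least two entries.
mask = group_sizes.ge(2)
result = df[mask]
print(result)
```

The first-level values left in result should be 8, 10 and 11, matching the rows asked for in the question.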

You can also use groupby.filter and get the same result with:

print(df.groupby(level=0).filter(lambda x: len(x)>=2))
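As a quick check that the two approaches agree, the sketch below runs both on a reduced stand-in for the question's frame. The single year column and the variable names via_filter and via_mask are assumptions made for illustration only.

```python
import pandas as pd

# Reduced, assumed stand-in for the question's frame: same index, one column.
index = pd.MultiIndex.from_tuples(
    [(7, 0), (8, 0), (8, 1), (9, 0), (10, 0), (10, 1), (10, 2),
     (11, 0), (11, 1), (12, 0)],
    names=[None, "match"],
)
df = pd.DataFrame({"year": [89, 86, 10, 71, 85, 96, 26, 10, 75, 98]}, index=index)

# filter keeps or drops whole groups based on a per-group boolean.
via_filter = df.groupby(level=0).filter(lambda g: len(g) >= 2)

# transform builds a row-level mask from the broadcast group size.
via_mask = df[df.groupby(level=0)["year"].transform(len).ge(2)]

# Both selections should contain the same index entries in the same order.
print(via_filter.index.tolist() == via_mask.index.tolist())
```

filter reads a little more directly, while the transform mask can be combined with other boolean conditions before indexing; which one is preferable depends on the rest of your pipeline.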
