
Subselect by columns in MultiIndex Pandas DataFrame

I have a dataframe that looks like this:

               u1  u2  u3  u4  u5  u6
level0 level1                        
foo1   x1       0   1   0   0   0   0
       x2       0   1   1   0   1   1
foo2   x3       0   1   0   1   0   1
       x4       1   0   0   0   1   1
foo3   x5       1   0   1   0   0   0
       x6       0   1   1   1   0   0
foo4   x7       1   0   0   1   0   1
       x8       0   1   1   1   0   0

I want to subselect only those rows for which u3==1. So, as output, I should get something like:

               u1  u2  u3  u4  u5  u6
level0 level1                        
foo1   
       x2       0   1   1   0   1   1
foo2   

foo3   x5       1   0   1   0   0   0
       x6       0   1   1   1   0   0
foo4   
       x8       0   1   1   1   0   0

I have tried doing:

idx = pd.IndexSlice
df.loc[idx[:,:],'u2']==1

which gives:

level0  level1
foo1    x1         True
        x2         True
foo2    x3         True
        x4        False
foo3    x5        False
        x6         True
foo4    x7        False
        x8         True

but I don't know how to use this to index the original dataframe.

Any help appreciated.

You can use the query() method or regular boolean indexing:

In [11]: df.query('u2 == 1')
Out[11]:
               u1  u2  u3  u4  u5  u6
level0 level1
foo1   x1       0   1   0   0   0   0
       x2       0   1   1   0   1   1
foo2   x3       0   1   0   1   0   1
foo3   x6       0   1   1   1   0   0
foo4   x8       0   1   1   1   0   0

In [12]: df.loc[df['u2'] == 1]
Out[12]:
               u1  u2  u3  u4  u5  u6
level0 level1
foo1   x1       0   1   0   0   0   0
       x2       0   1   1   0   1   1
foo2   x3       0   1   0   1   0   1
foo3   x6       0   1   1   1   0   0
foo4   x8       0   1   1   1   0   0
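
The question asks for the rows where u3 == 1, while the snippets above filter on u2; the same pattern applies to either column. A minimal sketch, assuming the frame is rebuilt by hand from the values printed in the question (the construction code and the names mask and selected are illustrative, not from the original post):

```python
import pandas as pd

# Rebuild the example frame from the question. This construction is an
# assumption; the original post only shows the printed frame.
index = pd.MultiIndex.from_arrays(
    [
        ["foo1", "foo1", "foo2", "foo2", "foo3", "foo3", "foo4", "foo4"],
        ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"],
    ],
    names=["level0", "level1"],
)
data = [
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [1, 0, 0, 1, 0, 1],
    [0, 1, 1, 1, 0, 0],
]
df = pd.DataFrame(data, index=index, columns=["u1", "u2", "u3", "u4", "u5", "u6"])

# The comparison yields a boolean Series aligned on the same MultiIndex ...
mask = df["u3"] == 1
# ... so it can be passed straight back to .loc to keep the matching rows.
selected = df.loc[mask]
print(selected)
```

The boolean Series produced in the question's own attempt can be used the same way: passing it to df.loc returns the rows of the original frame where it is True.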

Using the .query() method also lets you filter by index levels:

In [17]: df.query("level0 in ['foo2','foo3'] and u2 == 1")
Out[17]:
               u1  u2  u3  u4  u5  u6
level0 level1
foo2   x3       0   1   0   1   0   1
foo3   x6       0   1   1   1   0   0
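
For comparison, the same filter can probably be written with plain boolean indexing by reading the level values off the index. This is a sketch under the assumption that the frame is rebuilt from the question's printed values; the names in_level, by_mask and by_query are illustrative only:

```python
import pandas as pd

# Rebuild the example frame (an assumption; the post only shows its printout).
index = pd.MultiIndex.from_arrays(
    [
        ["foo1", "foo1", "foo2", "foo2", "foo3", "foo3", "foo4", "foo4"],
        ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"],
    ],
    names=["level0", "level1"],
)
data = [
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [1, 0, 0, 1, 0, 1],
    [0, 1, 1, 1, 0, 0],
]
df = pd.DataFrame(data, index=index, columns=["u1", "u2", "u3", "u4", "u5", "u6"])

# Boolean array marking rows whose level0 label is foo2 or foo3.
in_level = df.index.get_level_values("level0").isin(["foo2", "foo3"])

# Combine the index condition with the column condition.
by_mask = df.loc[(df["u2"] == 1) & in_level]

# The query() form from the answer, for comparison.
by_query = df.query("level0 in ['foo2','foo3'] and u2 == 1")
print(by_mask)
```

query() is more compact when conditions mix index levels and columns; the explicit mask avoids string expressions and is easier to build programmatically.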

UPDATE:

How can I select all the u? for which x1==1 and x3==1?

If you mean u1 and u3, then there are quite a few ways to achieve that:

In [8]: df.query("u1 == 1 and u3 == 1")
Out[8]:
               u1  u2  u3  u4  u5  u6
level0 level1
foo3   x5       1   0   1   0   0   0

In [9]: df.loc[(df['u1'] == 1) & (df['u3'] == 1)]
Out[9]:
               u1  u2  u3  u4  u5  u6
level0 level1
foo3   x5       1   0   1   0   0   0

In [10]: df.loc[df[['u1','u3']].eq(1).all(axis=1)]
Out[10]:
               u1  u2  u3  u4  u5  u6
level0 level1
foo3   x5       1   0   1   0   0   0
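
If the follow-up is instead read literally, as asking which u columns equal 1 in both the x1 and x3 rows, one possible approach is to reduce along the rows rather than the columns. This is only a sketch of that alternative reading, again assuming the frame is rebuilt from the question's printed values; the names sub and cols are illustrative:

```python
import pandas as pd

# Rebuild the example frame (an assumption; the post only shows its printout).
index = pd.MultiIndex.from_arrays(
    [
        ["foo1", "foo1", "foo2", "foo2", "foo3", "foo3", "foo4", "foo4"],
        ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"],
    ],
    names=["level0", "level1"],
)
data = [
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [1, 0, 0, 1, 0, 1],
    [0, 1, 1, 1, 0, 0],
]
df = pd.DataFrame(data, index=index, columns=["u1", "u2", "u3", "u4", "u5", "u6"])

# Keep only the rows labelled x1 and x3 on the second index level.
sub = df[df.index.get_level_values("level1").isin(["x1", "x3"])]

# A column qualifies when it equals 1 in every one of those rows.
cols = sub.columns[sub.eq(1).all(axis=0)]

# Select the qualifying columns from the full frame.
print(df.loc[:, cols])
```

With the data shown in the question, u2 is the only column that is 1 in both the x1 and x3 rows.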
