Dataframe filter with multi-index: return all rows at top index level given value filters

I'm looking for the syntax to return all first-tier data given multiple end-value criteria. I've been reading and finding filtering solutions with .loc or .xs, but I can't quite get the syntax for what I want. I used to work with XPath, and in essence I just want //A[ B [ @x=1 and @y=2]].

I've tried lots of permutations of the syntax I'm familiar with, using forms of df.loc, df.xs, multi-index [], a little with df.index.get_level_values(), etc.

So from a dataframe like this:

     x  y
A B
a b  1  2
  f  4  5
  c  3  4
b d  1  5
  c  1  2
c d  2  3

I want to search for a specific combination of x and y and return all rows at the A index level.

So I want x=1 and y=2, and to get:

     x  y
A B
a b  1  2
  f  4  5
  c  3  4
b d  1  5
  c  1  2

Because at least one row of a given A matches.

An even better, more general solution would be to search for an x value of a particular B and a y value of a particular different B.

(Trying for more clarity:) By this I mean that, instead of only end-level values, I may be interested in combining them with specific B values. Below I have B1 = b and x=3, so I'm mixing matching a value with matching an index value, whereas before I constrained two end values. Again, I envision this in XPath as something like //A[ B [ local-name() == b and @x=3] and B[ local-name() == f and @y=5] ] (I think I got that right).

For example, B1=b: x=3 and B2=f: y=5. Returning:

     x  y
A B
a b  1  2
  f  4  5
  c  3  4

Thanks!

You can query your dataframe in a couple of steps:

A_idx = df.query('x == 1 & y == 2').index.get_level_values('A')
res = df.query('A in @A_idx')

print(res)

#      x  y
# A B      
# a b  1  2
#   f  4  5
#   c  3  4
# b d  1  5
#   c  1  2

Setup

import pandas as pd

df = pd.DataFrame([['a', 'b', 1, 2], ['a', 'f', 4, 5], ['a', 'c', 3, 4],
                   ['b', 'd', 1, 5], ['b', 'c', 1, 2], ['c', 'd', 2, 3]],
                  columns=['A', 'B', 'x', 'y'])

df = df.set_index(['A', 'B'])
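For the more general case in the question (one condition tied to a particular B label and another tied to a different B label), the same two-step idea can be extended by intersecting the A labels that satisfy each condition. The following is only a sketch: `groups_matching_all` is a hypothetical helper name, not a pandas function, and the example uses `x == 1` for B = 'b' because that is the value the sample data holds for that row.

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 'b', 1, 2], ['a', 'f', 4, 5], ['a', 'c', 3, 4],
     ['b', 'd', 1, 5], ['b', 'c', 1, 2], ['c', 'd', 2, 3]],
    columns=['A', 'B', 'x', 'y'],
).set_index(['A', 'B'])


def groups_matching_all(frame, conditions, level='A'):
    """Hypothetical helper: keep every row of the groups in which
    each query condition is satisfied by at least one row."""
    keep = None
    for cond in conditions:
        # Labels of `level` that have at least one row matching this condition
        hits = set(frame.query(cond).index.get_level_values(level))
        keep = hits if keep is None else keep & hits
    mask = frame.index.get_level_values(level).isin(keep or [])
    return frame[mask]


# Index level names such as B can be used directly inside query strings
res = groups_matching_all(df, ["B == 'b' and x == 1", "B == 'f' and y == 5"])
print(res)
```

Each condition may match a different row, so this corresponds to the XPath form with two separate B predicates under the same A.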

Using groupby + transform + any

df[df.eq({'x':1,'y':2}).groupby(level=0).transform('any').any(axis=1)]
     x  y
A B      
a b  1  2
  f  4  5
  c  3  4
b d  1  5
  c  1  2
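Note that the one-liner above flags a group when any row has x equal to 1 or any row has y equal to 2, which happens to give the expected result on the sample data. If both values must occur on the same row, as the question's XPath suggests, one possible variant is to reduce across columns with `all` before grouping. This is a sketch under that assumption, not the answer author's code:

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 'b', 1, 2], ['a', 'f', 4, 5], ['a', 'c', 3, 4],
     ['b', 'd', 1, 5], ['b', 'c', 1, 2], ['c', 'd', 2, 3]],
    columns=['A', 'B', 'x', 'y'],
).set_index(['A', 'B'])

# True only for rows where x == 1 and y == 2 hold together
row_match = df.eq(pd.Series({'x': 1, 'y': 2})).all(axis=1)

# Broadcast "any row in this A group matched" back to every row of the group
group_match = row_match.groupby(level='A').transform('any')

res = df[group_match]
print(res)
```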

You can use groupby on level='A' and filter, after using numpy.where to create a flag column for each of the x and y columns that marks whether the value you are looking for is present.

#using @jpp setup
import numpy as np
df['flagx'] = np.where(df.x == 1,1,0)
df['flagy'] = np.where(df.y == 5,1,0)

Now, if you want both x and y to meet their conditions for any value of B within the same A, you can use any on each flag and combine the two with &:

print (df.groupby(level='A').filter(lambda dfg: dfg.flagx.any() & dfg.flagy.any() )
         .drop(['flagx','flagy'],axis=1))
     x  y
A B      
a b  1  2
  f  4  5
  c  3  4
b d  1  5
  c  1  2

If you want both conditions on x and y to be met on the same row, change the position of the any and the & in the filter:

print (df.groupby(level='A').filter(lambda dfg: (dfg.flagx & dfg.flagy).any() )
         .drop(['flagx','flagy'],axis=1))
     x  y
A B      
b d  1  5
  c  1  2
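If adding and then dropping flag columns feels heavy, the same two behaviours can probably be had from boolean masks alone. The sketch below assumes the same sample data and the same target values (x == 1, y == 5) and leaves df unmodified:

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 'b', 1, 2], ['a', 'f', 4, 5], ['a', 'c', 3, 4],
     ['b', 'd', 1, 5], ['b', 'c', 1, 2], ['c', 'd', 2, 3]],
    columns=['A', 'B', 'x', 'y'],
).set_index(['A', 'B'])

x_ok = df['x'] == 1
y_ok = df['y'] == 5

# Conditions may be met on different rows of the same A group
either_rows = (x_ok.groupby(level='A').transform('any')
               & y_ok.groupby(level='A').transform('any'))
res_any = df[either_rows]

# Conditions must be met together on a single row of the A group
same_row = (x_ok & y_ok).groupby(level='A').transform('any')
res_same = df[same_row]

print(res_any)
print(res_same)
```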
