
Pandas MultiIndex slicing by aggregate

I have a pandas Series (S) that has an index like:

bar  one  a
          b
     two  a
          b
baz  one  a
.
.

I have a conditional function that returns a lower-dimensional index. What I am doing is something like S.groupby(level=(0,1)).median() > 1

This returns a Series with an index like so:

bar  one 
baz  two
foo  one 
. 
. 

How do I slice the original Series with the lower-dimensional index?

I know I can reset the index and select rows using .isin, but I would like to use the MultiIndex if possible.

Thanks in advance!

=== ===

Here is what the actual Series (s) looks like:

BATCH    ITEM  SEQ   X   Y 
D1M2     765   6005  -5   0    5.085
         769   6005  -3  -2    6.174
         767   6005  -4  -1    5.844
         769   6005  -3  -1    5.702
                     -4   2    5.154
         767   6005  -3   2    5.337
                     -2   4    5.683
                      3   0    6.178
         769   6005  -3   2    5.128
         765   6005   1  -4    4.791

I perform the following operation:

sm = s.groupby(level=(0,1,2)).median()
sigma = sm.std()
sms = sm[sm - sm.median() < sigma/2]

Now sms looks like:

BATCH    ITEM  SEQ 
D1M2     765   6005    4.938
         769   6005    5.428

Now I want to select only the rows of s whose index matches the index of sms.

So I want this slice of s (the rows that match the index of sms):

BATCH    ITEM  SEQ   X   Y 
D1M2     765   6005  -5   0    5.085
         769   6005  -3  -2    6.174
                     -3  -1    5.702
                     -4   2    5.154
                     -3   2    5.128
         765   6005   1  -4    4.791

Matching the two indexes directly is possible only if their levels are the same, which is not the case here: s has the levels ['BATCH', 'ITEM', 'SEQ', 'X', 'Y'], while sms has only ['BATCH', 'ITEM', 'SEQ'].

You have to drop the X and Y levels first so that the indexes can be matched:

# Statically
>>> s[s.index.droplevel(['X', 'Y']).isin(sms.index)]

# Dynamically
>>> s[s.index.droplevel(s.index.names.difference(sms.index.names)).isin(sms.index)]

# Output
BATCH  ITEM  SEQ   X   Y 
D1M2   765   6005  -5   0    5.085
       769   6005  -3  -2    6.174
                       -1    5.702
                   -4   2    5.154
                   -3   2    5.128
       765   6005   1  -4    4.791
dtype: float64
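
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It rebuilds s from the values shown in the question and applies the same droplevel/isin mask. The variable names rows, extra, mask and result are only illustrative, and the groupby uses level names instead of positions, which is assumed to be equivalent for this data.

```python
import pandas as pd

# Rebuild the example Series from the question (values copied from the post).
rows = [
    ("D1M2", 765, 6005, -5, 0, 5.085),
    ("D1M2", 769, 6005, -3, -2, 6.174),
    ("D1M2", 767, 6005, -4, -1, 5.844),
    ("D1M2", 769, 6005, -3, -1, 5.702),
    ("D1M2", 769, 6005, -4, 2, 5.154),
    ("D1M2", 767, 6005, -3, 2, 5.337),
    ("D1M2", 767, 6005, -2, 4, 5.683),
    ("D1M2", 767, 6005, 3, 0, 6.178),
    ("D1M2", 769, 6005, -3, 2, 5.128),
    ("D1M2", 765, 6005, 1, -4, 4.791),
]
names = ["BATCH", "ITEM", "SEQ", "X", "Y"]
index = pd.MultiIndex.from_tuples([r[:5] for r in rows], names=names)
s = pd.Series([r[5] for r in rows], index=index)

# Same aggregation and filter as in the question.
sm = s.groupby(level=["BATCH", "ITEM", "SEQ"]).median()
sigma = sm.std()
sms = sm[sm - sm.median() < sigma / 2]

# Levels present in s but not in sms (here: X and Y).
extra = [n for n in s.index.names if n not in sms.index.names]

# Drop those levels, then test membership against the index of sms.
mask = s.index.droplevel(extra).isin(sms.index)
result = s[mask]
print(result)
```

Because this is a boolean mask, the selected rows keep the order they had in s.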
