

Pandas - Check if value from a column exists in any index of a MultiIndex dataframe

I have a MultiIndex DataFrame called df with two index levels (index1, index2). For each row, I want to check whether the value in Column matches any of that row's index levels.

This is what df looks like:

                Column
index1 index2
   a     b         b
         c         e
         d         c
   f     e         e
         g         e
         h         f
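
To make the question reproducible, the frame above could be built roughly as follows. This is a sketch that assumes all labels and values are plain strings; the original post does not show how df was created.

```python
import pandas as pd

# Rebuild the example frame; the string dtype of labels and values is an assumption.
index = pd.MultiIndex.from_tuples(
    [("a", "b"), ("a", "c"), ("a", "d"), ("f", "e"), ("f", "g"), ("f", "h")],
    names=["index1", "index2"],
)
df = pd.DataFrame({"Column": ["b", "e", "c", "e", "e", "f"]}, index=index)
print(df)
```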

Expressed as booleans, this is the mask I want to generate in order to filter the DataFrame:

                Column
index1 index2
   a     b       True
         c       False
         d       False
   f     e       True
         g       False
         h       True (f is in the index1)

The final output should look like this:

                Column
index1 index2
   a     b         b
   f     e         e
         h         f

Is there a good practice for handling this?

You can use get_level_values:

m1 = df.index.get_level_values('index1') == df['Column']
m2 = df.index.get_level_values('index2') == df['Column']
out = df[m1|m2]
print(out)

# Output
              Column
index1 index2       
a      b           b
f      e           e
       h           f
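
To see what the two masks contain, here is a self-contained sketch of the same idea. It assumes the example frame from the question, rebuilt with string labels, and converts both sides with to_numpy() so the comparison is explicitly positional; that conversion is a choice made for this illustration, not part of the answer above.

```python
import pandas as pd

# Example frame from the question (construction is an assumption).
index = pd.MultiIndex.from_tuples(
    [("a", "b"), ("a", "c"), ("a", "d"), ("f", "e"), ("f", "g"), ("f", "h")],
    names=["index1", "index2"],
)
df = pd.DataFrame({"Column": ["b", "e", "c", "e", "e", "f"]}, index=index)

col = df["Column"].to_numpy()

# Compare each index level with the column, row by row.
m1 = df.index.get_level_values("index1").to_numpy() == col
m2 = df.index.get_level_values("index2").to_numpy() == col

# A row is kept when either level matches.
mask = m1 | m2
out = df[mask]
print(mask.tolist())
print(out)
```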

A generic way that works for any number of index levels:

import numpy as np

out = df[np.logical_or.reduce([list(lvl) == df['Column'] for lvl in zip(*df.index)])]
print(out)

# Output
              Column
index1 index2       
a      b           b
f      e           e
       h           f
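
The same generic idea can also be written by looping over the level positions instead of unzipping the index tuples. The helper name column_matches_any_level below is hypothetical and only for illustration; the frame is the example from the question, rebuilt under the assumption of string labels.

```python
import numpy as np
import pandas as pd


def column_matches_any_level(frame, column):
    """Return a boolean array: True where frame[column] equals any index level in that row.

    Hypothetical helper for illustration; not part of pandas.
    """
    values = frame[column].to_numpy()
    masks = [
        frame.index.get_level_values(i).to_numpy() == values
        for i in range(frame.index.nlevels)
    ]
    # OR all per-level masks together.
    return np.logical_or.reduce(masks)


index = pd.MultiIndex.from_tuples(
    [("a", "b"), ("a", "c"), ("a", "d"), ("f", "e"), ("f", "g"), ("f", "h")],
    names=["index1", "index2"],
)
df = pd.DataFrame({"Column": ["b", "e", "c", "e", "e", "f"]}, index=index)

out = df[column_matches_any_level(df, "Column")]
print(out)
```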

Use MultiIndex.to_frame with DataFrame.eq to compare all levels, and DataFrame.any to test whether at least one level matches:

df1 = df[df.index.to_frame().eq(df['Column'], axis=0).any(axis=1)]
print(df1)
              Column
index1 index2       
a      b           b
f      e           e
       h           f
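
Splitting that one-liner into steps may make it easier to follow. The sketch below assumes the example frame from the question with string labels and a unique index, since eq aligns the Series with the frame by index label.

```python
import pandas as pd

# Example frame from the question (construction is an assumption).
index = pd.MultiIndex.from_tuples(
    [("a", "b"), ("a", "c"), ("a", "d"), ("f", "e"), ("f", "g"), ("f", "h")],
    names=["index1", "index2"],
)
df = pd.DataFrame({"Column": ["b", "e", "c", "e", "e", "f"]}, index=index)

# One column per index level, indexed by the same MultiIndex.
levels = df.index.to_frame()

# Compare every level column with df['Column'], matching rows by index label.
matches = levels.eq(df["Column"], axis=0)

# Keep rows where at least one level matched.
mask = matches.any(axis=1)
df1 = df[mask]
print(matches)
print(df1)
```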

Or use a list comprehension with in to test whether the column value exists in the row's index tuple:

df1 = df[[v in k for k, v in df['Column'].items()]]
print(df1)
              Column
index1 index2       
a      b           b
f      e           e
       h           f
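
The comprehension works because iterating over Series.items() on a MultiIndex Series yields each row's index as a tuple together with its value. A short sketch on the example frame (rebuilt here under the assumption of string labels) shows the pairs being tested:

```python
import pandas as pd

# Example frame from the question (construction is an assumption).
index = pd.MultiIndex.from_tuples(
    [("a", "b"), ("a", "c"), ("a", "d"), ("f", "e"), ("f", "g"), ("f", "h")],
    names=["index1", "index2"],
)
df = pd.DataFrame({"Column": ["b", "e", "c", "e", "e", "f"]}, index=index)

# Each item is (index tuple, column value), e.g. (('a', 'b'), 'b').
pairs = list(df["Column"].items())

# 'value in key' checks membership in that row's index tuple.
mask = [value in key for key, value in pairs]
df1 = df[mask]
print(pairs[0])
print(df1)
```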
