Pandas multiindex boolean indexing
Given a multiindexed DataFrame, I would like to return only the rows whose outer index value satisfies a condition for every entry of the inner index level. Here is a small working example:
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2, 2], 'b': [1, 2, 3, 4], 'c': [0, 2, 2, 2]})
df = df.set_index(['a', 'b'])
print(df)
out:
     c
a b
1 1  0
  2  2
2 3  2
  4  2
Now, I would like to return the entries for which c > 1. For instance, I would like to do something like
df[df['c'] > 1]
out:
     c
a b
1 2  2
2 3  2
  4  2
But I want to get
out:
     c
a b
2 3  2
  4  2
Any thoughts on how to do this in the most efficient way?
I ended up using groupby:
df.groupby(level=0).filter(lambda x: all([v > 1 for v in x['c']]))
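Since the question asked about efficiency, here is a sketch of a vectorized variant of the same idea. It builds the row-wise mask first and then uses a groupby transform to mark each row with whether its whole outer group passes, so no Python lambda runs per group. The names mask, keep and result are made up for this illustration, and I have not benchmarked it against the filter version, so treat any speed gain as an assumption to check on your own data.

```python
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({'a': [1, 1, 2, 2], 'b': [1, 2, 3, 4], 'c': [0, 2, 2, 2]})
df = df.set_index(['a', 'b'])

# Row-wise condition: one boolean per row, aligned with df's MultiIndex.
mask = df['c'] > 1

# For each value of the outer level 'a', broadcast back to every row
# whether all rows in that group satisfy the condition.
keep = mask.groupby(level='a').transform('all')

# Keep only rows that belong to a group where every row passed.
result = df[keep]
print(result)
```

This should select the same rows as the filter call above: only the a == 2 group remains, because the a == 1 group contains a row with c == 0.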