Pandas dataframe with MultiIndex: exclude level values
I have a multi-indexed pandas dataframe like the following one.
import numpy as np
import pandas as pd
arrays = [np.array(['bar', 'bar', 'bar', 'bar', 'foo', 'foo', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']),
          np.array(['blo', 'bla', 'bla', 'blo', 'blo', 'blu', 'blo', 'bla'])]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
df.sort_index(inplace=True)
which gives:
                    0         1         2         3
bar one bla  0.478461  1.030308  0.012688  0.137495
        blo  0.476041 -1.679848  1.346798  0.143225
    two bla  1.148882 -2.074197 -2.567959  1.258016
        blo  1.062280  3.846096 -0.346636  1.170822
foo one blo -0.761327  0.262105  0.151554  1.066616
    two blu  1.431951  0.043307 -0.326498  2.402536
qux one blo -0.622017 -0.566930  0.417977 -0.345238
    two bla  0.129273 -0.181396 -0.758381  0.995827
Now I want to select a subset by using a slice object:
idx = pd.IndexSlice
subset = df.loc[idx[['bar'], :, :], :]
This returns:
                    0         1         2         3
bar one bla  0.478461  1.030308  0.012688  0.137495
        blo  0.476041 -1.679848  1.346798  0.143225
    two bla  1.148882 -2.074197 -2.567959  1.258016
        blo  1.062280  3.846096 -0.346636  1.170822
Now I want to exclude all rows that have "blo" as a level value. I know that I could select everything except the 'blo' values, but my real dataframe is very big and I only know the level values that should not appear in the subset.
What's the easiest way to exclude certain level values from the subset?
Thanks in advance!
IIUC, maybe you could mask your subset:
subset = subset.iloc[subset.index.get_level_values(2) != 'blo']
You can do it this way:
In [263]:
subset.loc[subset.index.get_level_values(2) != 'blo']
Out[263]:
                    0         1         2         3
bar one bla -1.039335 -1.124656  0.057114 -0.284754
    two bla  0.007208 -0.403559 -1.317075 -0.340171
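Since the question asks about excluding several level values at once, here is a minimal sketch of how the same mask could be extended with `Index.isin`, along with an alternative based on `DataFrame.drop` and its `level` argument. The list name `unwanted` and the choice of 'blo' and 'blu' are assumptions made for illustration only; they are not from the original posts.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (values are random).
arrays = [
    np.array(['bar', 'bar', 'bar', 'bar', 'foo', 'foo', 'qux', 'qux']),
    np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']),
    np.array(['blo', 'bla', 'bla', 'blo', 'blo', 'blu', 'blo', 'bla']),
]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays).sort_index()

idx = pd.IndexSlice
subset = df.loc[idx[['bar'], :, :], :]

# Hypothetical list of values that must not appear in the third index level.
unwanted = ['blo', 'blu']

# Option 1: boolean mask, True for rows whose level-2 value is not in `unwanted`.
keep = ~subset.index.get_level_values(2).isin(unwanted)
filtered_mask = subset.loc[keep]

# Option 2: drop the labels from that level directly.
# errors='ignore' avoids a KeyError when none of the labels are present.
filtered_drop = subset.drop(unwanted, level=2, errors='ignore')

print(filtered_mask)
print(filtered_drop)
```

Both options should leave only the 'bla' rows of the 'bar' group here. The mask version is handy when the condition gets more complex (for example, combining several levels), while `drop` reads more directly when you only have a list of labels to exclude.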