
Filter MultiIndex DataFrame by level values

I have a DataFrame with a MultiIndex on both the index and the columns. I would like to select the subset of this DataFrame where the index value at level 0 is in a given list (that is, exclude rows whose level 0 value doesn't belong to the list) and the column value at level 1 is in a given list (that is, exclude columns whose level 1 value doesn't belong to the list). How can I do that?

Here is one way to do it:

```python
import numpy as np
import pandas as pd

# Setup
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]], names=["first", "second"]
)

df = pd.DataFrame(np.random.randn(6, 6), index=index[:6], columns=index[:6])
print(df)
```

Output:

```
first              bar                 baz                 foo
second             one       two       one       two       one       two
first second
bar   one    -0.970334  0.434532  1.277209 -1.622681  0.385986  0.894817
      two     2.037842  0.388206 -1.897472  0.291751 -0.860631 -0.743974
baz   one     1.483088 -0.797434  0.421217  0.911051  0.645517  0.643298
      two    -0.154445 -0.769389 -0.287669 -0.602577  1.064863  0.013943
foo   one     0.023713  0.336001  0.821779  0.183035  0.144324 -1.297155
      two    -1.305088 -3.731492  0.060215 -1.280722 -1.498417  2.103376
```

Select the level 0 index values found in `list1` and the level 1 column values found in `list2`:
```python
list1 = ["bar", "foo"]
list2 = ["one"]

# Row selector: level 0 values that appear in list1, all level 1 values.
# Column selector: all level 0 values, level 1 values that appear in list2.
new_df = df.loc[
    (
        [row for row in df.index.get_level_values(0).unique() if row in list1],
        df.index.get_level_values(1).unique(),
    ),
    (
        df.columns.get_level_values(0).unique(),
        [col for col in df.columns.get_level_values(1).unique() if col in list2],
    ),
]
print(new_df)
```

Output:

```
first              bar       baz       foo
second             one       one       one
first second
bar   one    -0.970334  1.277209  0.385986
      two     2.037842 -1.897472 -0.860631
foo   one     0.023713  0.821779  0.144324
      two    -1.305088  0.060215 -1.498417
```

You can learn more about MultiIndex indexing in the pandas documentation.
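
As an alternative sketch (not part of the answer above), the same selection can probably be written more compactly with boolean masks built from `Index.isin`. The deterministic `np.arange` data and the names `row_mask`, `col_mask`, `kept`, and `excluded` are assumptions made for illustration only. Negating a mask with `~` gives the opposite selection, in case what you actually want is to drop the listed values rather than keep them.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]], names=["first", "second"]
)
# Deterministic values instead of random data (an assumption, for reproducibility).
df = pd.DataFrame(np.arange(36).reshape(6, 6), index=index[:6], columns=index[:6])

list1 = ["bar", "foo"]
list2 = ["one"]

# Boolean masks: True where the level value appears in the corresponding list.
row_mask = df.index.get_level_values(0).isin(list1)
col_mask = df.columns.get_level_values(1).isin(list2)

# Keep rows/columns whose level values are in the lists.
kept = df.loc[row_mask, col_mask]

# Opposite selection: keep only rows/columns whose level values are NOT in the lists.
excluded = df.loc[~row_mask, ~col_mask]

print(kept)
print(excluded)
```

The masks are plain boolean arrays, so they can be combined with `&` and `|` if you need to filter on more than one level at once.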
