Filter MultiIndex DataFrame by level values

Question

I have a DataFrame with a MultiIndex on both the index and the columns. I would like to select the subset of this DataFrame where the index value at level 0 is not in a given list (so I want to exclude rows whose level 0 value doesn't belong to the list) and the column value at level 1 is not in a given list (so exclude columns whose level 1 value doesn't belong to the list). How can I do that?

Answer 1

Here is one way to do it:

```python
import numpy as np
import pandas as pd

# Setup
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]], names=["first", "second"]
)

df = pd.DataFrame(np.random.randn(6, 6), index=index[:6], columns=index[:6])

print(df)
```

Output:

```
first              bar                 baz                 foo
second             one       two       one       two       one       two
first second
bar   one    -0.970334  0.434532  1.277209 -1.622681  0.385986  0.894817
      two     2.037842  0.388206 -1.897472  0.291751 -0.860631 -0.743974
baz   one     1.483088 -0.797434  0.421217  0.911051  0.645517  0.643298
      two    -0.154445 -0.769389 -0.287669 -0.602577  1.064863  0.013943
foo   one     0.023713  0.336001  0.821779  0.183035  0.144324 -1.297155
      two    -1.305088 -3.731492  0.060215 -1.280722 -1.498417  2.103376
```

Select the rows whose level 0 index value is found in `list1` and the columns whose level 1 value is found in `list2`:

```python
list1 = ["bar", "foo"]
list2 = ["one"]

new_df = df.loc[
    (
        # Level 0 row labels that appear in list1, with every level 1 row label
        [row for row in df.index.get_level_values(0).unique() if row in list1],
        df.index.get_level_values(1).unique(),
    ),
    (
        # Every level 0 column label, with level 1 column labels that appear in list2
        df.columns.get_level_values(0).unique(),
        [col for col in df.columns.get_level_values(1).unique() if col in list2],
    ),
]

print(new_df)
```

Output:

```
first              bar       baz       foo
second             one       one       one
first second
bar   one    -0.970334  1.277209  0.385986
      two     2.037842 -1.897472 -0.860631
foo   one     0.023713  0.821779  0.144324
      two    -1.305088  0.060215 -1.498417
```

You can learn more about MultiIndex indexing in the pandas documentation.
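
Since the question can be read as either keeping or excluding the listed values, here is a minimal sketch of an alternative that handles both cases with boolean masks built from `Index.isin`. It is not part of the original answer. The deterministic data from `np.arange` and the variable names `labels`, `row_mask`, `col_mask`, `kept` and `excluded` are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data: 3 outer labels x 2 inner labels on both axes.
labels = pd.MultiIndex.from_product(
    [["bar", "baz", "foo"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.arange(36).reshape(6, 6), index=labels, columns=labels)

list1 = ["bar", "foo"]  # values to match on index level 0
list2 = ["one"]         # values to match on column level 1

# Boolean masks: True where the level value appears in the list.
row_mask = df.index.get_level_values(0).isin(list1)
col_mask = df.columns.get_level_values(1).isin(list2)

kept = df.loc[row_mask, col_mask]        # keep only matching rows and columns
excluded = df.loc[~row_mask, ~col_mask]  # drop matching rows and columns instead

# The "keep" case can also be written with pd.IndexSlice.
idx = pd.IndexSlice
kept_slice = df.loc[idx[list1, :], idx[:, list2]]
```

Negating a mask with `~` is what switches between "in the list" and "not in the list", so the same two masks cover both readings of the question.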
