I have a DataFrame with a MultiIndex on both the index and the columns. I would like to select the subset of this DataFrame where the level 0 index value is in a given list (that is, exclude rows whose level 0 value doesn't belong to the list) and the level 1 column value is in a given list (that is, exclude columns whose level 1 value doesn't belong to that list). How can I do that?
Here is one way to do it:
```python
import numpy as np
import pandas as pd

# Setup
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.random.randn(6, 6), index=index[:6], columns=index[:6])
print(df)
```

Output:

```
first              bar                 baz                 foo
second             one       two       one       two       one       two
first second
bar   one    -0.970334  0.434532  1.277209 -1.622681  0.385986  0.894817
      two     2.037842  0.388206 -1.897472  0.291751 -0.860631 -0.743974
baz   one     1.483088 -0.797434  0.421217  0.911051  0.645517  0.643298
      two    -0.154445 -0.769389 -0.287669 -0.602577  1.064863  0.013943
foo   one     0.023713  0.336001  0.821779  0.183035  0.144324 -1.297155
      two    -1.305088 -3.731492  0.060215 -1.280722 -1.498417  2.103376
```
Select the rows whose level 0 index value is found in `list1` and the columns whose level 1 value is found in `list2`:

```python
list1 = ["bar", "foo"]
list2 = ["one"]
new_df = df.loc[
    (
        [row for row in df.index.get_level_values(0).unique() if row in list1],
        df.index.get_level_values(1).unique(),
    ),
    (
        df.columns.get_level_values(0).unique(),
        [col for col in df.columns.get_level_values(1).unique() if col in list2],
    ),
]
print(new_df)
```

Output:

```
first              bar       baz       foo
second             one       one       one
first second
bar   one    -0.970334  1.277209  0.385986
      two     2.037842 -1.897472 -0.860631
foo   one     0.023713  0.821779  0.144324
      two    -1.305088  0.060215 -1.498417
```
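As an alternative sketch (not part of the original answer), the same selection can probably be written more compactly with boolean masks built from `Index.get_level_values` and `Index.isin`. The names `row_mask`, `col_mask`, `kept` and `dropped` are made up for illustration, and the setup is repeated so that the snippet runs on its own. Negating the masks with `~` gives the opposite reading of the question, where values found in the lists are dropped instead of kept.

```python
import numpy as np
import pandas as pd

# Same setup as above, repeated so this snippet is self-contained
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.random.randn(6, 6), index=index[:6], columns=index[:6])

list1 = ["bar", "foo"]
list2 = ["one"]

# Boolean arrays: True where the level value is in the list
row_mask = df.index.get_level_values(0).isin(list1)
col_mask = df.columns.get_level_values(1).isin(list2)

# Keep only rows/columns whose level value is in the lists
kept = df.loc[row_mask, col_mask]

# Opposite reading: drop rows/columns whose level value is in the lists
dropped = df.loc[~row_mask, ~col_mask]

print(kept)
print(dropped)
```

Because the masks are plain boolean arrays, they can be combined with `&` and `|` if more than one level needs to be filtered at the same time.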
You can learn more about MultiIndexing in the pandas documentation.