简体   繁体   中英

In Python's Pandas, how do I filter a DataFrame on a Field in a MultiIndex

I have this DataFrame that has a MultiIndex with the fiels Hi , Md , Lo

import pandas as pd

df = pd.DataFrame(
    [
        ["A", "X", "C", "7", "3"],
        ["A", "B", "C", "7", "2"],
        ["A", "B", "C", "7", "1"],
        ["A", "X", "C", "7", "3"],
        ["1", "2", "3", "7", "3"],
        ["4", "5", "6", "7", "3"],
    ],
    columns=["Hi", "Md", "Lo", "O", "I"],
).set_index(["Hi", "Md", "Lo"])

It looks like this

          O  I
Hi Md Lo      
A  X  C   7  3
   B  C   7  2
      C   7  1
   X  C   7  3
1  2  3   7  3
4  5  6   7  3

How can I set a filter to get only rows where the Md has the value B

I tried via field name

df[df['Md'] == 'B']

This raises a KeyError, because the "field" is not column, but part of the MultiIndex

df[df.index.map(lambda x: x[1] == 'B')]

works, but is very hacky.

Is there a shorter way to write this filter, maybe using the field name Md or the index position 1 ?

You can use DataFrame.query

df.query("Md == 'B'")

Output

          O  I
Hi Md Lo      
A  B  C   7  2
      C   7  1

Alternative (Might not work with different data)

df.filter(like='B', axis=0)

Here is a way by passing a tuple in to df.loc

df.loc[(slice(None),'B',slice(None))]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM