Boolean indexing in Pandas DataFrame with MultiIndex columns

I have a DataFrame with MultiIndex columns:

import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_arrays([['n1', 'n1', 'n2', 'n2'], ['p', 'm', 'p', 'm']])
values = [
    [1,      2,  3,      4],
    [np.nan, 6,  7,      8],
    [np.nan, 10, np.nan, 12],
]
df = pd.DataFrame(values, columns=columns)
    n1       n2    
     p   m    p   m
0  1.0   2  3.0   4
1  NaN   6  7.0   8
2  NaN  10  NaN  12

Now I want to set m to NaN wherever p is NaN in the same row and the same top-level group. Here's the result I'm looking for:

    n1        n2     
     p    m    p    m
0  1.0  2.0  3.0  4.0
1  NaN  NaN  7.0  8.0
2  NaN  NaN  NaN  NaN

I know how to find out where p is NaN, for example using:

mask = df.xs('p', level=1, axis=1).isnull()
      n1     n2
0  False  False
1   True  False
2   True   True

However, I don't know how to use this mask to set the corresponding m values in df to NaN.

You can use pd.IndexSlice to select the p columns on level 1 and build a boolean array indicating which of their values are not NaN. Replacing False with NaN in that array leaves a multiplier that is either True or NaN. Multiplying the m columns by it keeps each m value where p is present (True acts as 1) and turns it into NaN where p is missing:

x = df.loc[:, pd.IndexSlice[:,'p']].notna().replace({False:float('nan')}).values
df.loc[:, pd.IndexSlice[:,'m']] *= x

    n1        n2     
     p    m    p    m
0  1.0    2  3.0    4
1  NaN  NaN  7.0    8
2  NaN  NaN  NaN  NaN
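The same result can also be reached without the multiplication step, by reusing the mask from the question. The following is only a minimal sketch, not the method of either answer: it assumes that selecting the m columns with xs gives the same column labels as the mask (n1, n2), and it writes the masked columns back one at a time. The names m_values and m_masked are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_arrays([['n1', 'n1', 'n2', 'n2'], ['p', 'm', 'p', 'm']])
values = [
    [1,      2,  3,      4],
    [np.nan, 6,  7,      8],
    [np.nan, 10, np.nan, 12],
]
df = pd.DataFrame(values, columns=columns)

# Boolean mask from the question: True where p is NaN; its columns are 'n1' and 'n2'.
mask = df.xs('p', level=1, axis=1).isnull()

# The m values, selected the same way so their column labels line up with the mask.
m_values = df.xs('m', level=1, axis=1)

# DataFrame.mask replaces entries with NaN wherever the mask is True.
m_masked = m_values.mask(mask)

# Write each masked column back under its original (name, 'm') key.
for name in m_masked.columns:
    df[(name, 'm')] = m_masked[name]

print(df)
```

Because each column is assigned as a whole, the m columns end up as float64 rather than object, and the original column order is kept.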

You can transpose the DataFrame, then stack and unstack it, so that p and m become ordinary columns that are easy to select and modify. Afterwards, stack, unstack and transpose again to get back to the original layout:

df = df.T.stack(dropna=False).unstack(level=1)
df.loc[df['p'].isna(), 'm'] = np.nan

df = df.stack(dropna=False).unstack(1).T

After the first line, df is:

         m    p
n1 0   2.0  1.0
   1   6.0  NaN
   2  10.0  NaN
n2 0   4.0  3.0
   1   8.0  7.0
   2  12.0  NaN

And after the last line (note that m now comes before p within each group):

    n1        n2     
     m    p    m    p
0  2.0  1.0  4.0  3.0
1  NaN  NaN  8.0  7.0
2  NaN  NaN  NaN  NaN
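If the original p, m column order matters, the round-tripped frame can be reordered afterwards. The sketch below is an illustration only: it rebuilds the frame by hand from the output printed above (rather than rerunning the stack/unstack steps) and then uses reindex with the original MultiIndex. The names original_columns, result and restored are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Column order of the original DataFrame from the question.
original_columns = pd.MultiIndex.from_arrays(
    [['n1', 'n1', 'n2', 'n2'], ['p', 'm', 'p', 'm']]
)

# The frame as printed above, with m before p in each group (values copied by hand).
result = pd.DataFrame(
    [
        [2.0,    1.0,    4.0,    3.0],
        [np.nan, np.nan, 8.0,    7.0],
        [np.nan, np.nan, np.nan, np.nan],
    ],
    columns=pd.MultiIndex.from_arrays(
        [['n1', 'n1', 'n2', 'n2'], ['m', 'p', 'm', 'p']]
    ),
)

# Reorder the columns to match the original layout; the values are unchanged.
restored = result.reindex(columns=original_columns)
print(restored)
```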
