简体   繁体   中英

How to convert a dataframe into a pandas IndexSlice to index another MultiIndex dataframe

I have the following 2 dataframes df1 and df2:

import pandas as pd

m_idx = pd.MultiIndex.from_product([range(3), range(3, 6), range(6, 8)])
m_idx.names = ['a', 'b', 'c']
df1 = pd.DataFrame(None, index=m_idx, columns=['x1', 'x2', 'x3'])
df1.loc[:, 'x1'] = m_idx.get_level_values('a') + m_idx.get_level_values('b') + m_idx.get_level_values('c')
df1.loc[:, 'x2'] = df1.loc[:, 'x1'] * 2
df1.loc[:, 'x3'] = df1.loc[:, 'x1'] * 3

df2 = pd.DataFrame({'a': [0, 2], 'c': [6, 6]})

df1:

       x1  x2  x3
a b c            
0 3 6   9  18  27
    7  10  20  30
  4 6  10  20  30
    7  11  22  33
  5 6  11  22  33
    7  12  24  36
1 3 6  10  20  30
    7  11  22  33
  4 6  11  22  33
    7  12  24  36
  5 6  12  24  36
    7  13  26  39
2 3 6  11  22  33
    7  12  24  36
  4 6  12  24  36
    7  13  26  39
  5 6  13  26  39
    7  14  28  42

df2:

   a  c
0  0  6
1  2  6

How can I convert df2 into something I can use to look up the index of df1 where the column names of df2 are the levels and in each row you have the combination of keys you are looking to get out of the df1 index.

Or in other words how can I convert df2 into something that does the equivalent of

df1.loc[pd.IndexSlice[[0, 2], :, [6, 6]], :]

which would return:

       x1  x2  x3
a b c            
0 3 6   9  18  27
  4 6  10  20  30
  5 6  11  22  33
2 3 6  11  22  33
  4 6  12  24  36
  5 6  13  26  39

This is a very simplified and small scale version of what I am actually trying to solve. So really looking to create the pd.IndexSlice on the fly.

I see a separate question that suggested this and I have done something similar in my code BUT it takes too long for my purposes.

df_list = [df1.loc[(v[0], slice(None), v[1]), :] for r, v in df2.iterrows()]
df_sliced = pd.concat(df_list)

So am hoping that using pd.IndexSlice or another alternative instead could be much quicker.

MANY THANKS!

Convert the index of df1 to dataframe then use isin + all on the matching levels to replicate the behaviour of index slice

d = df2.to_dict('list')
df1[df1.index.to_frame()[d].isin(d).all(1)]

       x1  x2  x3
a b c            
0 3 6   9  18  27
  4 6  10  20  30
  5 6  11  22  33
2 3 6  11  22  33
  4 6  12  24  36
  5 6  13  26  39

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM