简体   繁体   中英

Pandas Multiindex subtract based on only two index level matchings

Say I have a Pandas multi-index data frame with 3 indices:

import pandas as pd
import numpy as np
arrays = [['UK', 'UK', 'US', 'FR'], ['Firm1', 'Firm1', 'Firm2', 'Firm1'], ['Andy', 'Peter', 'Peter', 'Andy']]
idx = pd.MultiIndex.from_arrays(arrays, names = ('Country', 'Firm', 'Responsible'))
df_3idx = pd.DataFrame(np.random.randn(4,3), index = idx)
df_3idx
                                  0         1         2
Country Firm  Responsible                              
UK      Firm1 Andy         0.237655  2.049636  0.480805
              Peter        1.135344  0.745616 -0.577377
US      Firm2 Peter        0.034786 -0.278936  0.877142
FR      Firm1 Andy         0.048224  1.763329 -1.597279

I have furthermore another pd.dataframe consisting of unique combinations of multi-index-level 1 and 2 from the above data:

arrays = [['UK', 'US', 'FR'], ['Firm1', 'Firm2', 'Firm1']]
idx = pd.MultiIndex.from_arrays(arrays, names = ('Country', 'Firm'))
df_2idx = pd.DataFrame(np.random.randn(3,1), index = idx)
df_2idx
                      0
Country Firm           
UK      Firm1 -0.103828
US      Firm2  0.096192
FR      Firm1 -0.686631

I want to subtract the values from df_3idx by the corresponding value in df_2idx , so, for instance, I want to subtract from every value of the first two rows the value -0.103828, as index 1 and 2 from both dataframes match.

Does anybody know how to do this? I figured I could simply unstack the first dataframe and then subtract, but I am getting an error message.

df_3idx.unstack('Responsible').sub(df_2idx, axis=0)

ValueError: cannot join with no overlapping index names

Unstacking might anyway not be a preferable solution as my data is very big and unstacking might take a lot of time.

I would appreciate any help. Many thanks in advance!

related question but not focused on MultiIndex

However, the answer doesn't really care. The sub method will align on the matching index levels.

pd.DataFrame.sub with parameter axis=0

df_3idx.sub(df_2idx[0], axis=0)

                                  0         1         2
Country Firm  Responsible                              
FR      Firm1 Andy         0.027800  3.316148  0.804833
UK      Firm1 Andy        -2.009797 -1.830799 -0.417737
              Peter       -1.174544  0.644006 -1.150073
US      Firm2 Peter       -2.211121 -3.825443 -4.391965

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM