Pandas MultiIndex subtract based on only two index level matchings
Say I have a Pandas multi-index data frame with three index levels:
import pandas as pd
import numpy as np
arrays = [['UK', 'UK', 'US', 'FR'], ['Firm1', 'Firm1', 'Firm2', 'Firm1'], ['Andy', 'Peter', 'Peter', 'Andy']]
idx = pd.MultiIndex.from_arrays(arrays, names = ('Country', 'Firm', 'Responsible'))
df_3idx = pd.DataFrame(np.random.randn(4,3), index = idx)
df_3idx
                                  0         1         2
Country Firm  Responsible
UK      Firm1 Andy         0.237655  2.049636  0.480805
              Peter        1.135344  0.745616 -0.577377
US      Firm2 Peter        0.034786 -0.278936  0.877142
FR      Firm1 Andy         0.048224  1.763329 -1.597279
Furthermore, I have another DataFrame whose index consists of the unique combinations of the first two index levels (Country and Firm) of the data above:
arrays = [['UK', 'US', 'FR'], ['Firm1', 'Firm2', 'Firm1']]
idx = pd.MultiIndex.from_arrays(arrays, names = ('Country', 'Firm'))
df_2idx = pd.DataFrame(np.random.randn(3,1), index = idx)
df_2idx
                      0
Country Firm
UK      Firm1 -0.103828
US      Firm2  0.096192
FR      Firm1 -0.686631
I want to subtract from the values in df_3idx the corresponding value in df_2idx. For instance, I want to subtract -0.103828 from every value in the first two rows, because the first two index levels (Country and Firm) match in both dataframes.
Does anybody know how to do this? I figured I could simply unstack the first dataframe and then subtract, but I am getting an error message:
df_3idx.unstack('Responsible').sub(df_2idx, axis=0)
ValueError: cannot join with no overlapping index names
Unstacking might not be a preferable solution anyway, as my data is very big and unstacking might take a lot of time.
I would appreciate any help. Many thanks in advance!
There is a related question, though it is not focused on MultiIndex. The answer carries over regardless: the sub method will align on the matching index levels.
Use pd.DataFrame.sub with the parameter axis=0:
df_3idx.sub(df_2idx[0], axis=0)
                                  0         1         2
Country Firm  Responsible
FR      Firm1 Andy         0.027800  3.316148  0.804833
UK      Firm1 Andy        -2.009797 -1.830799 -0.417737
              Peter       -1.174544  0.644006 -1.150073
US      Firm2 Peter       -2.211121 -3.825443 -4.391965
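To make the alignment easier to verify by hand, here is a minimal sketch that uses the same index layout as the question but fixed values instead of np.random.randn (the numbers are an assumption chosen purely for illustration). It also cross-checks the result by looking up each row's (Country, Firm) pair explicitly, which is roughly what sub does when it aligns on the shared levels.

```python
import numpy as np
import pandas as pd

# Same index layout as in the question, but with fixed values
# (an assumption made here so the result can be checked by hand).
idx3 = pd.MultiIndex.from_arrays(
    [['UK', 'UK', 'US', 'FR'],
     ['Firm1', 'Firm1', 'Firm2', 'Firm1'],
     ['Andy', 'Peter', 'Peter', 'Andy']],
    names=('Country', 'Firm', 'Responsible'))
df_3idx = pd.DataFrame(np.arange(12, dtype=float).reshape(4, 3), index=idx3)

idx2 = pd.MultiIndex.from_arrays(
    [['UK', 'US', 'FR'], ['Firm1', 'Firm2', 'Firm1']],
    names=('Country', 'Firm'))
df_2idx = pd.DataFrame([10.0, 20.0, 30.0], index=idx2)

# Pass the column as a Series so that only the row index is aligned.
result = df_3idx.sub(df_2idx[0], axis=0)
# Alignment may reorder the rows; restore the original order.
result = result.reindex(df_3idx.index)

# Cross-check: look up each row's (Country, Firm) pair explicitly,
# then subtract the looked-up values positionally.
keys = df_3idx.index.droplevel('Responsible')
manual = df_3idx.sub(df_2idx[0].reindex(keys).to_numpy(), axis=0)

print(result)
print(np.allclose(result.to_numpy(), manual.to_numpy()))
```

Note that df_2idx[0] is a Series. The unstack attempt in the question passed df_2idx as a whole DataFrame, so pandas presumably tried to align the columns as well, which would explain the "no overlapping index names" error.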