Multiply all columns of a multi-indexed DataFrame by appropriate values in a Series
I feel like this one should be obvious, but I'm a bit stuck.
I have a DataFrame (df) with a 3-level MultiIndex on the rows. One of the levels of the MultiIndex is ccy and represents the currency in which the data in that row is denominated. Each row has 3 columns of data.
I would like to convert all of the data to be denominated in a reference currency (say USD). To do this, I have a Series (forex) that contains foreign exchange rates for the relevant currencies.
So the goal is simple: multiply all the data in each row of df by the value of forex that corresponds to the ccy entry of that row's index.
The mechanical setup looks like this:
import pandas as pd
import numpy as np
import itertools

np.random.seed(0)
tuples = list(itertools.product(
    list('abd'),
    ['one', 'two', 'three'],
    ['USD', 'EUR', 'GBP']
))
np.random.shuffle(tuples)
idx = pd.MultiIndex.from_tuples(tuples[:-10], names=['letter', 'number', 'ccy'])
df = pd.DataFrame(np.random.randn(len(idx), 3), index=idx,
                  columns=['val_1', 'val_2', 'val_3'])
forex = pd.Series({'USD': 1.0,
                   'EUR': 1.3,
                   'GBP': 1.7})
I can get what I need by running:
df.apply(lambda col: col.mul(forex, level='ccy'), axis=0)
But it seems weird to me that I would need to use pd.DataFrame.apply in such a simple case. I would have expected the following syntax (or something very much like it) to work:
df.mul(forex, level='ccy', axis=0)
but that gives me:
ValueError: cannot reindex from a duplicate axis
Clearly the apply method isn't a disaster. It just seems weird that I couldn't figure out the syntax for doing this directly across all the columns with mul. Is there a more direct way to handle this? If not, is there an intuitive reason the mul syntax shouldn't be enhanced to work this way?
This now works in master/0.14. See the issue: https://github.com/pydata/pandas/pull/6682
In [11]: df.mul(forex,level='ccy',axis=0)
Out[11]:
                      val_1     val_2     val_3
letter number ccy
a      one    GBP -2.172854  2.443530 -0.132098
d      three  USD  1.089630  0.096543  1.418667
b      two    GBP  1.986064  1.610216  1.845328
       three  GBP  4.049782 -0.690240  0.452957
a      two    GBP -2.304713 -0.193974 -1.435192
b      one    GBP  1.199589 -0.677936 -1.406234
d      two    GBP -0.706766 -0.891671  1.382272
b      two    EUR -0.298026  2.810233 -1.244011
d      one    EUR  0.087504  0.268448 -0.593946
              GBP -1.801959  1.045427  2.430423
b      three  EUR -0.275538 -0.104438  0.527017
a      one    EUR  0.154189  1.630738  1.844833
b      one    EUR -0.967013 -3.272668 -1.959225
d      three  GBP  1.953429 -2.029083  1.939772
              EUR  1.962279  1.388108 -0.892566
a      three  GBP  0.025285 -0.638632 -0.064980
              USD  0.367974 -0.044724 -0.302375

[17 rows x 3 columns]
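For readers who want to check this on their own installation, here is a minimal sketch that rebuilds the question's setup and compares the direct mul call with the apply form from the question. The third variant, which looks up one rate per row with get_level_values and map, is not part of the original answer; it is an assumption added here as a fallback that does not rely on the level argument. The names direct, via_apply, rates and via_lookup are illustrative only, and the random values will not necessarily match the output shown above.

```python
import itertools

import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (values are random).
np.random.seed(0)
tuples = list(itertools.product(
    list('abd'),
    ['one', 'two', 'three'],
    ['USD', 'EUR', 'GBP'],
))
np.random.shuffle(tuples)
index = pd.MultiIndex.from_tuples(tuples[:-10], names=['letter', 'number', 'ccy'])
df = pd.DataFrame(np.random.randn(len(index), 3), index=index,
                  columns=['val_1', 'val_2', 'val_3'])
forex = pd.Series({'USD': 1.0, 'EUR': 1.3, 'GBP': 1.7})

# Direct form: broadcast forex across the 'ccy' level of the row index.
direct = df.mul(forex, level='ccy', axis=0)

# Column-by-column form from the question, kept only as a cross-check.
via_apply = df.apply(lambda col: col.mul(forex, level='ccy'), axis=0)

# Assumed fallback: fetch one rate per row, then multiply by position.
rates = df.index.get_level_values('ccy').map(forex).to_numpy()
via_lookup = df.mul(rates, axis=0)

# All three should agree once the rows are put in the same order.
print(np.allclose(direct.reindex(df.index), via_lookup))
print(np.allclose(via_apply.reindex(df.index), via_lookup))
```

The lookup variant multiplies by position rather than by label, so it only makes sense when rates is built from the same index as df, as it is here.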
Here is another way to do it (also requires master/0.14):
In [127]: df = df.sortlevel()
In [128]: df
Out[128]:
                      val_1     val_2     val_3
letter number ccy
a      one    EUR  0.118607  1.254414  1.419102
              GBP -1.278149  1.437371 -0.077705
       three  GBP  0.014873 -0.375666 -0.038224
              USD  0.367974 -0.044724 -0.302375
       two    GBP -1.355714 -0.114103 -0.844231
b      one    EUR -0.743856 -2.517437 -1.507096
              GBP  0.705641 -0.398786 -0.827197
       three  EUR -0.211952 -0.080337  0.405398
              GBP  2.382224 -0.406024  0.266445
       two    EUR -0.229251  2.161717 -0.956931
              GBP  1.168273  0.947186  1.085487
d      one    EUR  0.067311  0.206499 -0.456881
              GBP -1.059976  0.614957  1.429661
       three  EUR  1.509445  1.067775 -0.686589
              GBP  1.149076 -1.193578  1.141042
              USD  1.089630  0.096543  1.418667
       two    GBP -0.415745 -0.524512  0.813101

[17 rows x 3 columns]
idx = pd.IndexSlice
In [129]: pd.concat([ df.loc[idx[:,:,x],:]*v for x,v in forex.iteritems() ])
Out[129]:
                      val_1     val_2     val_3
letter number ccy
a      one    EUR  0.154189  1.630738  1.844833
b      one    EUR -0.967013 -3.272668 -1.959225
       three  EUR -0.275538 -0.104438  0.527017
       two    EUR -0.298026  2.810233 -1.244011
d      one    EUR  0.087504  0.268448 -0.593946
       three  EUR  1.962279  1.388108 -0.892566
a      one    GBP -2.172854  2.443530 -0.132098
       three  GBP  0.025285 -0.638632 -0.064980
       two    GBP -2.304713 -0.193974 -1.435192
b      one    GBP  1.199589 -0.677936 -1.406234
       three  GBP  4.049782 -0.690240  0.452957
       two    GBP  1.986064  1.610216  1.845328
d      one    GBP -1.801959  1.045427  2.430423
       three  GBP  1.953429 -2.029083  1.939772
       two    GBP -0.706766 -0.891671  1.382272
a      three  USD  0.367974 -0.044724 -0.302375
d      three  USD  1.089630  0.096543  1.418667

[17 rows x 3 columns]
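The transcript above uses sortlevel and iteritems, which were removed in later pandas releases. A rough sketch of the same slicing idea with the current spellings (sort_index and items) might look like the following. The small frame here is toy data made up for illustration, not the data from the question, and the name converted is hypothetical.

```python
import numpy as np
import pandas as pd

# Toy frame (illustrative only): every letter/number/ccy combination is present.
index = pd.MultiIndex.from_product(
    [['a', 'b'], ['one', 'two'], ['USD', 'EUR', 'GBP']],
    names=['letter', 'number', 'ccy'],
)
df = pd.DataFrame(np.arange(len(index) * 2, dtype=float).reshape(-1, 2),
                  index=index, columns=['val_1', 'val_2'])
forex = pd.Series({'USD': 1.0, 'EUR': 1.3, 'GBP': 1.7})

# Slicing a MultiIndex with IndexSlice needs a sorted index;
# sort_index() plays the role of the older sortlevel().
df = df.sort_index()
idx = pd.IndexSlice

# One slice per currency, each scaled by its rate, then stacked back together.
# items() plays the role of the older iteritems().
converted = pd.concat(
    [df.loc[idx[:, :, ccy], :] * rate for ccy, rate in forex.items()]
)
print(converted)
```

As in the original answer, the rows come back grouped by currency rather than in the original order, so a final sort_index() may be wanted. This sketch assumes every currency in forex actually occurs in df.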
Here's another way, via merging (the rate ends up in a value column next to the data):
In [36]: f = forex.to_frame('value')
In [37]: f.index.name = 'ccy'
In [38]: pd.merge(df.reset_index(),f.reset_index(),on='ccy')
Out[38]:
    letter number  ccy     val_1     val_2     val_3  value
 0       a    one  GBP -1.278149  1.437371 -0.077705    1.7
 1       b    two  GBP  1.168273  0.947186  1.085487    1.7
 2       b  three  GBP  2.382224 -0.406024  0.266445    1.7
 3       a    two  GBP -1.355714 -0.114103 -0.844231    1.7
 4       b    one  GBP  0.705641 -0.398786 -0.827197    1.7
 5       d    two  GBP -0.415745 -0.524512  0.813101    1.7
 6       d    one  GBP -1.059976  0.614957  1.429661    1.7
 7       d  three  GBP  1.149076 -1.193578  1.141042    1.7
 8       a  three  GBP  0.014873 -0.375666 -0.038224    1.7
 9       d  three  USD  1.089630  0.096543  1.418667    1.0
10       a  three  USD  0.367974 -0.044724 -0.302375    1.0
11       b    two  EUR -0.229251  2.161717 -0.956931    1.3
12       d    one  EUR  0.067311  0.206499 -0.456881    1.3
13       b  three  EUR -0.211952 -0.080337  0.405398    1.3
14       a    one  EUR  0.118607  1.254414  1.419102    1.3
15       b    one  EUR -0.743856 -2.517437 -1.507096    1.3
16       d  three  EUR  1.509445  1.067775 -0.686589    1.3

[17 rows x 7 columns]
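The merge above only places the rate next to each row; the multiplication itself still has to be done. One possible way to finish the job is sketched below. The steps after the merge are an assumption and not part of the original answer, the frame is toy data made up for illustration, and the names value_cols and converted are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy frame (illustrative only), with the same index layout as the question.
index = pd.MultiIndex.from_product(
    [['a', 'b'], ['one', 'two'], ['USD', 'EUR', 'GBP']],
    names=['letter', 'number', 'ccy'],
)
df = pd.DataFrame(np.arange(len(index) * 2, dtype=float).reshape(-1, 2),
                  index=index, columns=['val_1', 'val_2'])
forex = pd.Series({'USD': 1.0, 'EUR': 1.3, 'GBP': 1.7})

# Same merge as in the answer: attach the rate as a 'value' column.
f = forex.to_frame('value')
f.index.name = 'ccy'
merged = pd.merge(df.reset_index(), f.reset_index(), on='ccy')

# Assumed follow-up: scale the data columns row by row, drop the helper
# column, and restore the original three-level index.
value_cols = ['val_1', 'val_2']
merged[value_cols] = merged[value_cols].mul(merged['value'], axis=0)
converted = merged.drop(columns='value').set_index(['letter', 'number', 'ccy'])
print(converted)
```

The default inner merge silently drops rows whose currency has no entry in forex, which may or may not be what is wanted.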