Multiplying DataFrame rows with a NumPy array
I have a DataFrame that looks like this:
          Date   Last  portfolioID  FinancialInstrument
 1  2018-03-28  64.67            1                  Oil
 2  2018-03-29  64.91            1                  Oil
 3  2018-04-02  62.85            1                  Oil
 4  2018-04-03  63.57            1                  Oil
 5  2018-04-04  63.56            1                  Oil
 6  2018-04-05  63.73            1                  Oil
 7  2018-04-06  61.93            1                  Oil
 8  2018-03-23  65.74            3                  Oil
 9  2018-03-26  65.49            3                  Oil
10  2018-03-27  64.67            3                  Oil
11  2018-03-28  64.67            3                  Oil
12  2018-03-29  64.91            3                  Oil
13  2018-04-02  62.85            3                  Oil
14  2018-04-03  63.57            3                  Oil
15  2018-04-04  63.56            3                  Oil
16  2018-04-05  63.73            3                  Oil
17  2018-04-06  61.93            3                  Oil
18  2018-04-02  62.85            5                  Oil
19  2018-04-03  63.57            5                  Oil
20  2018-04-04  63.56            5                  Oil
21  2018-04-05  63.73            5                  Oil
22  2018-04-06  61.93            5                  Oil
and a NumPy array that looks like this:
[ 152.69506795 76.05719501 127.28719173]
I am grouping the DataFrame by portfolioID, where the first group corresponds to the first value in the NumPy array, the second group to the second value, and so on. Is there a way to multiply the Last column of the DataFrame by its corresponding NumPy array value?
This is what I have, but I get an error stating "Length must be equal." Here shares is the NumPy array:
for pid, group in data.groupby('portfolioID'):
    lastCol = group.Last
    clumN = lastCol.multiply(shares, axis=0)
You can use pandas.Series.factorize to get, for each row, the index of its group in your value array, and then use these indices to build an array of per-row multipliers.
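import numpy as np
val_arr = np.array([152.69506795, 76.05719501, 127.28719173])
# factorize()[0] holds one integer code per row, numbered by order of first appearance
df.Last * val_arr[df.portfolioID.factorize()[0]]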
# 1 9874.790044
# 2 9911.436861
# 3 9596.885021
# 4 9706.825470
# 5 9705.298519
# 6 9731.256680
# 7 9456.405558
# 8 5000.000000
# 9 4980.985701
# 10 4918.618801
# 11 4918.618801
# 12 4936.872528
# 13 4780.194706
# 14 4834.955887
# 15 4834.195315
# 16 4847.125038
# 17 4710.222087
# 18 8000.000000
# 19 8091.646778
# 20 8090.373906
# 21 8112.012729
# 22 7882.895784
# Name: Last, dtype: float64
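For anyone who wants to run this end to end, here is a minimal self-contained sketch of the factorize approach. The five-row DataFrame is a subset of the question's data, and the column name value is chosen here for illustration only. It assumes the multipliers are listed in the order in which the portfolio IDs first appear, because that is how factorize numbers its codes.

```python
import numpy as np
import pandas as pd

# Small subset of the question's data (assumption: same column names).
df = pd.DataFrame({
    "Last": [64.67, 64.91, 65.74, 65.49, 62.85],
    "portfolioID": [1, 1, 3, 3, 5],
})
shares = np.array([152.69506795, 76.05719501, 127.28719173])

# codes has one integer per row; uniques holds the distinct IDs
# in order of first appearance.
codes, uniques = df["portfolioID"].factorize()
# codes -> [0 0 1 1 2]

# Fancy indexing expands the 3 multipliers to one per row,
# so the lengths match and the product is element-wise.
df["value"] = df["Last"] * shares[codes]
print(df)
```

If the IDs are not guaranteed to appear in the same order as the array, factorize(sort=True) numbers the codes by sorted ID instead.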
Count the occurrences of each group in the df with count, then expand the second array, arr, with np.repeat so that it holds one value per row. This relies on the rows being sorted by portfolioID, as they are in the example.
import numpy as np
arr = np.array([152.69506795, 76.05719501, 127.28719173])
# repeat each multiplier once per row of its group
df.Last * np.repeat(arr, df.groupby("portfolioID")["Last"].count())
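A runnable sketch of the np.repeat idea follows, again on a five-row subset of the question's data. Two things here are substitutions rather than part of the answer: size() is used instead of count() so that NaN values in Last do not shrink the counts, and sort=False keeps the groups in order of first appearance. The map-based column at the end is an extra cross-check, and the names value_repeat, value_map and id_to_mult are hypothetical.

```python
import numpy as np
import pandas as pd

# Small subset of the question's data (assumption: same column names).
df = pd.DataFrame({
    "Last": [64.67, 64.91, 65.74, 65.49, 62.85],
    "portfolioID": [1, 1, 3, 3, 5],
})
arr = np.array([152.69506795, 76.05719501, 127.28719173])

# Number of rows per group, in order of first appearance.
counts = df.groupby("portfolioID", sort=False).size().to_numpy()

# One multiplier per row. This is positional, so it is only valid
# when each group's rows are contiguous in the DataFrame.
df["value_repeat"] = df["Last"] * np.repeat(arr, counts)

# Order-independent cross-check: map each ID to its multiplier.
id_to_mult = dict(zip(df["portfolioID"].unique(), arr))
df["value_map"] = df["Last"] * df["portfolioID"].map(id_to_mult)
print(df)
```

The two columns agree whenever the rows of each group sit together; if the rows are interleaved, only the map-based version stays correct.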