Multiplying DataFrame rows with a NumPy array

I have a DataFrame that looks like this:

         Date   Last  portfolioID FinancialInstrument
1   2018-03-28  64.67            1                 Oil
2   2018-03-29  64.91            1                 Oil
3   2018-04-02  62.85            1                 Oil
4   2018-04-03  63.57            1                 Oil
5   2018-04-04  63.56            1                 Oil
6   2018-04-05  63.73            1                 Oil
7   2018-04-06  61.93            1                 Oil
8   2018-03-23  65.74            3                 Oil
9   2018-03-26  65.49            3                 Oil
10  2018-03-27  64.67            3                 Oil
11  2018-03-28  64.67            3                 Oil
12  2018-03-29  64.91            3                 Oil
13  2018-04-02  62.85            3                 Oil
14  2018-04-03  63.57            3                 Oil
15  2018-04-04  63.56            3                 Oil
16  2018-04-05  63.73            3                 Oil
17  2018-04-06  61.93            3                 Oil
18  2018-04-02  62.85            5                 Oil
19  2018-04-03  63.57            5                 Oil
20  2018-04-04  63.56            5                 Oil
21  2018-04-05  63.73            5                 Oil
22  2018-04-06  61.93            5                 Oil

and a NumPy array that looks like this:

[ 152.69506795   76.05719501  127.28719173]

I am grouping the DataFrame by portfolioID, where the first group corresponds to the first value in the NumPy array, the second group to the second value, and so on. Is there a way to multiply the Last column of each group by its corresponding NumPy array value?

This is what I have, but I get an error stating "Length must be equal." Here shares is the NumPy array:

for pid, group in data.groupby('portfolioID'):
    lastCol = group.Last
    clumN = lastCol.multiply(shares, axis=0)
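
The error seems to arise because each group's Last column (7, 10, and 5 rows here) is multiplied by the whole 3-element shares array instead of by a single value. A minimal sketch of a loop that picks one scalar per group, assuming the groups come back in the same order as the values in shares (groupby sorts the keys by default). The small frame below is a subset of the rows above, and the names pieces and scaled are made up for illustration:

```python
import numpy as np
import pandas as pd

# Small illustrative frame (a subset of the rows shown above).
data = pd.DataFrame({
    "Last": [64.67, 64.91, 65.74, 65.49, 62.85],
    "portfolioID": [1, 1, 3, 3, 5],
})
shares = np.array([152.69506795, 76.05719501, 127.28719173])

pieces = []
# enumerate() yields the position of each group, which indexes into shares.
for i, (pid, group) in enumerate(data.groupby("portfolioID")):
    # Multiplying by a scalar avoids the length mismatch.
    pieces.append(group["Last"] * shares[i])

# Hypothetical name for the recombined result; the original index is kept.
scaled = pd.concat(pieces)
```

This keeps the shape of the original loop, though a vectorized approach is usually shorter and faster.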

You can use pandas.Series.factorize to turn each portfolioID into an integer index into your value array, then use those indices to build an array of per-row multipliers:

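import numpy as np

val_arr = np.array([152.69506795, 76.05719501, 127.28719173])

# factorize()[0] maps each portfolioID to 0, 1, 2, ... in order of first appearance
df.Last * val_arr[df.portfolioID.factorize()[0]]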

# 1     9874.790044
# 2     9911.436861
# 3     9596.885021
# 4     9706.825470
# 5     9705.298519
# 6     9731.256680
# 7     9456.405558
# 8     5000.000000
# 9     4980.985701
# 10    4918.618801
# 11    4918.618801
# 12    4936.872528
# 13    4780.194706
# 14    4834.955887
# 15    4834.195315
# 16    4847.125038
# 17    4710.222087
# 18    8000.000000
# 19    8091.646778
# 20    8090.373906
# 21    8112.012729
# 22    7882.895784
# Name: Last, dtype: float64
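
To make the intermediate step visible, here is a small self-contained sketch of the factorize approach on a subset of the rows above. The names codes, uniques, per_row and result are chosen for illustration only:

```python
import numpy as np
import pandas as pd

# Subset of the question's data, enough to cover all three portfolios.
df = pd.DataFrame({
    "Last": [64.67, 64.91, 65.74, 65.49, 62.85],
    "portfolioID": [1, 1, 3, 3, 5],
})
val_arr = np.array([152.69506795, 76.05719501, 127.28719173])

# codes holds one integer per row; uniques holds the distinct IDs.
codes, uniques = df["portfolioID"].factorize()
# codes is [0, 0, 1, 1, 2] and uniques contains 1, 3, 5

# Fancy indexing expands the 3 values into one multiplier per row.
per_row = val_arr[codes]
result = df["Last"] * per_row
```

Because factorize numbers the IDs in order of first appearance, this does not need the rows of one portfolio to be contiguous, but it does assume val_arr is listed in that same first-appearance order.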

Count the rows in each group of the df with count, then expand the second array, arr, with np.repeat so that each value is repeated once per row of its group. This relies on the rows already being ordered by portfolioID, as they are here.

import numpy as np

arr = np.array([152.69506795, 76.05719501, 127.28719173])
df.Last * np.repeat(arr, df.groupby("portfolioID")["Last"].count())
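
A short sketch of the np.repeat approach on a subset of the rows above, showing the group counts and the expanded array. The names counts, expanded and result are illustrative, and the sketch assumes the rows are sorted by portfolioID:

```python
import numpy as np
import pandas as pd

# Subset of the question's data, already sorted by portfolioID.
df = pd.DataFrame({
    "Last": [64.67, 64.91, 65.74, 65.49, 62.85],
    "portfolioID": [1, 1, 3, 3, 5],
})
arr = np.array([152.69506795, 76.05719501, 127.28719173])

# Rows per group, in sorted key order: 2, 2, 1
counts = df.groupby("portfolioID")["Last"].count()

# Repeat each value once per row of its group, giving 5 multipliers.
expanded = np.repeat(arr, counts.to_numpy())

# The multiplication is positional, so row order must match group order.
result = df["Last"] * expanded
```

If the rows are not grouped contiguously in sorted order, the positional multiplication would pair values with the wrong rows, so sorting by portfolioID first (or mapping each row to its value individually) would be safer.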
