
Pandas: rowwise multiplication of two dataframes

I have two dataframes: A contains allocation fractions and B contains hourly volumes. To get the right volume for each bus for a given hour, I need to multiply A by each row of dataframe B. For a given hour x, this is a simple multiplication: A * B.loc[x].

A =       col_a  col_b  col_c   col_d   col_e               
     0    0.0    0.0    0.0     0.0     1.0
     1    0.0    0.0    1.0     0.0     0.0
     2    0.0    1.0    0.0     0.5     0.0
     3    0.5    0.0    0.0     0.5     0.0
     4    0.5    0.0    0.0     0.0     0.0
B =     col_a   col_b   col_c   col_d   col_e
    0   12881   598     154     180     0.0 
    1   12881   680     154     180     0.0
    2   11617   806     154     180     0.0
    3   12991   100     154     180     0.0
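For reference, the single-hour case can be reproduced with the sample data above. This is a minimal sketch: the frames are rebuilt by hand from the printed tables, and the names `cols`, `hour`, and `single_hour` are illustrative, not part of the question.

```python
import pandas as pd

cols = ["col_a", "col_b", "col_c", "col_d", "col_e"]

# Allocation fractions: one row per bus.
A = pd.DataFrame(
    [[0.0, 0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.0, 0.0]],
    columns=cols,
)

# Hourly volumes: one row per hour.
B = pd.DataFrame(
    [[12881, 598, 154, 180, 0.0],
     [12881, 680, 154, 180, 0.0],
     [11617, 806, 154, 180, 0.0],
     [12991, 100, 154, 180, 0.0]],
    columns=cols,
)

hour = 0
# B.loc[hour] is a Series indexed by column name, so pandas aligns it
# with A's columns and broadcasts it across every bus row.
single_hour = A * B.loc[hour]
print(single_hour)
```

Because the alignment is by column label, the column order of A and B does not need to match for this step.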

However, I want to do this multiplication for every hour at once and create one large MultiIndex dataframe C.

C =          col_a   col_b  col_c  col_d  col_e
    hr  bus                     
    0   0    0.0     0.0    0.0    0.0   0.0
        1    0.0     0.0    154.0  0.0   0.0
        2    0.0     598.0  0.0    90.0  0.0
        3    6440.5  0.0    0.0    90.0  0.0
        4    6440.5  0.0    0.0    0.0   0.0
    1   0    0.0     0.0    0.0    0.0   0.0
        1    0.0     0.0    154.0  0.0   0.0
        2    0.0     680.0  0.0    90.0  0.0

I managed to create this dataframe with a list comprehension and by overwriting the index of the resulting dataframe. I would not consider this very good practice, and I wonder whether there is a better approach that does not require overwriting the index.

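import pandas

# Multiply A by each hourly row of B, then stack the per-hour results.
dfs = [A.mul(B.loc[i]) for i in B.index]
C = pandas.concat(dfs)

# Overwrite the flat, repeated bus index with an (hr, bus) MultiIndex.
C.index = pandas.MultiIndex.from_product([B.index, A.index], names=['hr', 'bus'])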
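One way to keep the loop but avoid overwriting the index afterwards is to let `concat` build the MultiIndex itself through its `keys` and `names` arguments. The following is a sketch of that variation, assuming the sample frames from the question (rebuilt here by hand); it is not taken from the original post.

```python
import pandas as pd

cols = ["col_a", "col_b", "col_c", "col_d", "col_e"]
A = pd.DataFrame(
    [[0.0, 0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.0, 0.0]],
    columns=cols,
)
B = pd.DataFrame(
    [[12881, 598, 154, 180, 0.0],
     [12881, 680, 154, 180, 0.0],
     [11617, 806, 154, 180, 0.0],
     [12991, 100, 154, 180, 0.0]],
    columns=cols,
)

# keys= labels each per-hour piece with its hour, which becomes the outer
# index level; names= names the outer (hr) and inner (bus) levels.
C = pd.concat(
    [A.mul(B.loc[i]) for i in B.index],
    keys=B.index,
    names=["hr", "bus"],
)
print(C)
```

This still loops over the hours in Python, so it addresses the index overwrite rather than performance.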

First, "replicate" the rows of B so that each hour appears once per bus, and give the result an (hr, bus) MultiIndex:

import numpy as np
import pandas as pd
BB = pd.DataFrame(np.repeat(B.values, A.index.size, axis=0), columns=B.columns,
    index=pd.MultiIndex.from_product((B.index, A.index), names=['hr', 'bus']))

Then compute the result. Passing level=1 matches the index of A against the bus level of BB, so A is broadcast across every hour:

result = A.mul(BB, level=1)

The result is:

         col_a  col_b  col_c  col_d  col_e
hr bus                                    
0  0       0.0    0.0    0.0    0.0    0.0
   1       0.0    0.0  154.0    0.0    0.0
   2       0.0  598.0    0.0   90.0    0.0
   3    6440.5    0.0    0.0   90.0    0.0
   4    6440.5    0.0    0.0    0.0    0.0
1  0       0.0    0.0    0.0    0.0    0.0
   1       0.0    0.0  154.0    0.0    0.0
   2       0.0  680.0    0.0   90.0    0.0
   3    6440.5    0.0    0.0   90.0    0.0
   4    6440.5    0.0    0.0    0.0    0.0
2  0       0.0    0.0    0.0    0.0    0.0
   1       0.0    0.0  154.0    0.0    0.0
   2       0.0  806.0    0.0   90.0    0.0
   3    5808.5    0.0    0.0   90.0    0.0
   4    5808.5    0.0    0.0    0.0    0.0
3  0       0.0    0.0    0.0    0.0    0.0
   1       0.0    0.0  154.0    0.0    0.0
   2       0.0  100.0    0.0   90.0    0.0
   3    6495.5    0.0    0.0   90.0    0.0
   4    6495.5    0.0    0.0    0.0    0.0
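The same table can also be produced with plain NumPy broadcasting instead of replicating B. This is an alternative sketch, not the method used above; it assumes A and B have their columns in the same order (the raw arrays are multiplied by position, not by label), and the name `product` is illustrative.

```python
import numpy as np
import pandas as pd

cols = ["col_a", "col_b", "col_c", "col_d", "col_e"]
A = pd.DataFrame(
    [[0.0, 0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.0, 0.0]],
    columns=cols,
)
B = pd.DataFrame(
    [[12881, 598, 154, 180, 0.0],
     [12881, 680, 154, 180, 0.0],
     [11617, 806, 154, 180, 0.0],
     [12991, 100, 154, 180, 0.0]],
    columns=cols,
)

# (hours, 1, cols) * (1, buses, cols) broadcasts to (hours, buses, cols).
product = B.to_numpy()[:, None, :] * A.to_numpy()[None, :, :]

# Flatten the first two axes; the row order (hour outer, bus inner)
# matches the order produced by MultiIndex.from_product.
result = pd.DataFrame(
    product.reshape(-1, A.shape[1]),
    index=pd.MultiIndex.from_product([B.index, A.index], names=["hr", "bus"]),
    columns=A.columns,
)
print(result)
```

If the column order of the two frames might differ, reindexing one of them first (for example `B[A.columns]`) would be needed before taking the raw arrays.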
