Broadcasting multiplication of two pandas DataFrames

I have two DataFrames, for example:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(6).reshape(3, 2), columns=['A1', 'B1'])
df2 = pd.DataFrame(np.arange(1, 7).reshape(3, 2), columns=['A2', 'B2'])

   A1  B1
0   0   1
1   2   3
2   4   5

   A2  B2
0   1   2
1   3   4
2   5   6

I need to multiply every column of df1 by every column of df2, element-wise along the rows, to get a DataFrame with the following result:

  A1*A2  A1*B2  B1*A2  B1*B2
0     0      0      1      2
1     6      8      9     12
2    20     24     25     30

In the real task, df1 and df2 each have 1000 columns and 90 000 rows.

I don't want to use a nested "for" loop over the columns of these DataFrames.

Is there a built-in function or some easy way to calculate it?

You can use df.multiply() to multiply a DataFrame by a Series row-wise, and then concat the resulting DataFrames like this:

df3 = pd.concat([df1[["A1", "B1"]].multiply(df2["A2"], axis="index"),
                 df1[["A1", "B1"]].multiply(df2["B2"], axis="index")], axis=1)

df3.columns = ["A1*A2", "B1*A2", "A1*B2", "B1*B2"]

You get:

   A1*A2  B1*A2  A1*B2  B1*B2
0      0      1      0      2
1      6      9      8     12
2     20     25     24     30
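
With more than a couple of columns, the same idea can be written without naming each column by hand. The following is only a minimal sketch, assuming df1 and df2 as defined in the question. add_suffix is used here just to build the column labels, and the comprehension still loops over the columns of df2 at the Python level:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["A1", "B1"])
df2 = pd.DataFrame(np.arange(1, 7).reshape(3, 2), columns=["A2", "B2"])

# Multiply all of df1 by one column of df2 at a time (aligned on the row index),
# and label each resulting block "<df1 column>*<df2 column>".
blocks = [df1.multiply(df2[col], axis="index").add_suffix(f"*{col}")
          for col in df2.columns]
df3 = pd.concat(blocks, axis=1)
```

The columns come out in the same order as above: A1*A2, B1*A2, A1*B2, B1*B2.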

Use NumPy broadcasting for better performance:

import itertools

df = pd.DataFrame((df1.values[..., None] * df2.values[:, None]).reshape(df1.shape[0],-1))
df.columns = ["*".join(i) for i in itertools.product(*[df1.columns, df2.columns])]


The purpose of df1.values[..., None] is to add an extra trailing axis, so that the (3, 2) array df1.values becomes an array of shape (3, 2, 1).

Furthermore, df2.values[:, None] inserts a new axis in the middle, so that its shape becomes (3, 1, 2) instead of the initial (3, 2). When the two arrays are multiplied, they broadcast against each other to give a result of shape (3, 2, 2).
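
The shapes involved can be checked directly. This is a small illustrative sketch in which the arrays a and b stand in for df1.values and df2.values from the question:

```python
import numpy as np

a = np.arange(6).reshape(3, 2)      # stands in for df1.values
b = np.arange(1, 7).reshape(3, 2)   # stands in for df2.values

a3 = a[..., None]   # new trailing axis: shape (3, 2, 1)
b3 = b[:, None]     # new middle axis:   shape (3, 1, 2)

# Axes of length 1 are stretched to match the other operand,
# so prod[r, i, j] == a[r, i] * b[r, j].
prod = a3 * b3
print(a3.shape, b3.shape, prod.shape)
```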

Finally, reshape the product so that it has the same number of rows as the original df1 (or df2, since both have the same shape in the question).
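
Putting these steps together, one possible way to package them is shown below. This is a sketch only: pairwise_column_products is a hypothetical helper name, not a pandas function, and it assumes both frames have the same number of rows and pairs them by position, ignoring the index of the second frame.

```python
import itertools

import numpy as np
import pandas as pd


def pairwise_column_products(left, right):
    """Hypothetical helper: multiply every column of `left` by every column of `right`.

    Assumes both frames have the same number of rows; rows are paired by position.
    """
    a = left.to_numpy()                    # shape (n_rows, n_left)
    b = right.to_numpy()                   # shape (n_rows, n_right)
    prod = a[:, :, None] * b[:, None, :]   # shape (n_rows, n_left, n_right)
    flat = prod.reshape(len(left), -1)     # shape (n_rows, n_left * n_right)
    # itertools.product iterates in the same order as the reshape above:
    # the right-hand column name varies fastest.
    names = [f"{x}*{y}" for x, y in itertools.product(left.columns, right.columns)]
    return pd.DataFrame(flat, index=left.index, columns=names)


df1 = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["A1", "B1"])
df2 = pd.DataFrame(np.arange(1, 7).reshape(3, 2), columns=["A2", "B2"])
result = pairwise_column_products(df1, df2)
print(result)
```

At the sizes mentioned in the question, the full result would have 1000 × 1000 = 1 000 000 columns and 90 000 rows, so it may be necessary to call such a helper on a block of df1 columns at a time rather than on the whole frame.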
