Pandas: Elementwise multiplication of two dataframes
I know how to do element-by-element multiplication between two Pandas dataframes. However, things get more complicated when the dimensions of the two dataframes are not compatible. For instance, below df * df2 is straightforward, but df * df3 is a problem:
import pandas as pd

df = pd.DataFrame({'col1' : [1.0] * 5,
                   'col2' : [2.0] * 5,
                   'col3' : [3.0] * 5 }, index = range(1,6),)
df2 = pd.DataFrame({'col1' : [10.0] * 5,
                    'col2' : [100.0] * 5,
                    'col3' : [1000.0] * 5 }, index = range(1,6),)
df3 = pd.DataFrame({'col1' : [0.1] * 5}, index = range(1,6),)
df.mul(df2, 1) # element by element multiplication, no problems
df.mul(df3, 1) # shapes differ (df is 5x3, df3 is 5x1), so col2 and col3 come out as NaN
col1 col2 col3
1 0.1 NaN NaN
2 0.1 NaN NaN
3 0.1 NaN NaN
4 0.1 NaN NaN
5 0.1 NaN NaN
In the above situation, how can I multiply every column of df by df3.col1?
My attempt: I tried to replicate df3.col1 len(df.columns.values) times to get a dataframe with the same dimensions as df:
df3 = pd.DataFrame([df3.col1 for n in range(len(df.columns.values)) ])
df3
1 2 3 4 5
col1 0.1 0.1 0.1 0.1 0.1
col1 0.1 0.1 0.1 0.1 0.1
col1 0.1 0.1 0.1 0.1 0.1
But this creates a dataframe of dimensions 3 * 5, whereas I am after 5 * 3. I know I can take the transpose with df3.T to get what I need, but I don't think this is the fastest way.
In [161]: pd.DataFrame(df.values*df2.values, columns=df.columns, index=df.index)
Out[161]:
col1 col2 col3
1 10 200 3000
2 10 200 3000
3 10 200 3000
4 10 200 3000
5 10 200 3000
A simpler way to do this is just to multiply the dataframe whose column names you want to keep by the values (i.e. the NumPy array) of the other, like so:
In [63]: df * df2.values
Out[63]:
col1 col2 col3
1 10 200 3000
2 10 200 3000
3 10 200 3000
4 10 200 3000
5 10 200 3000
This way you do not have to write all that new dataframe boilerplate.
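The same array-based idea can also be applied to the mismatched case from the question, because NumPy broadcasts a (5, 1) array across the three columns of a (5, 3) array. A minimal sketch, assuming both frames have their rows in the same order (rows are matched by position here, not by index label); `to_numpy()` stands in for `.values`, and `product` and `result` are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1.0] * 5,
                   'col2': [2.0] * 5,
                   'col3': [3.0] * 5}, index=range(1, 6))
df3 = pd.DataFrame({'col1': [0.1] * 5}, index=range(1, 6))

# df3[['col1']] keeps two dimensions, so its array has shape (5, 1)
# and NumPy broadcasts it across the (5, 3) array taken from df.
product = df.to_numpy() * df3[['col1']].to_numpy()

# Wrap the raw array again so the labels of df are preserved.
result = pd.DataFrame(product, columns=df.columns, index=df.index)
print(result)
```

Because index labels are ignored in this version, it is only safe when the two frames are known to be row-aligned already.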
This works for me:
mul = df.mul(df3.col1, axis=0)
Or, when you want to subtract (divide) instead:
sub = df.sub(df3.col1, axis=0)
div = df.div(df3.col1, axis=0)
This also works with a NaN in df (e.g. if you apply this to the df: df.loc[1, 'col2'] = np.nan).
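As a self-contained check of the NaN claim, here is a small sketch using the frames from the question. It assumes the multiplier is the `col1` column of `df3`, and it sets the missing value with a single `.loc` call as one way to make sure the assignment reaches `df` itself:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1.0] * 5,
                   'col2': [2.0] * 5,
                   'col3': [3.0] * 5}, index=range(1, 6))
df3 = pd.DataFrame({'col1': [0.1] * 5}, index=range(1, 6))

# Put a missing value in the first row (label 1) of col2.
df.loc[1, 'col2'] = np.nan

# axis=0 matches the Series against the row index of df,
# so every column is multiplied by the same per-row factor.
mul = df.mul(df3['col1'], axis=0)
print(mul)
```

The NaN simply propagates into the matching cell of the result, and every other cell is multiplied as usual.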
To use Pandas' broadcasting behavior, you can use multiply.
df.multiply(df3['col1'], axis=0)
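One thing worth knowing about this approach is that the Series is matched to the rows of df by index label rather than by position. The sketch below illustrates this with a hypothetical `factors` Series (not part of the original question) that is deliberately stored in reverse row order:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1.0] * 5,
                   'col2': [2.0] * 5,
                   'col3': [3.0] * 5}, index=range(1, 6))

# Hypothetical per-row factors, listed in reverse order on purpose.
factors = pd.Series([0.5, 0.4, 0.3, 0.2, 0.1], index=[5, 4, 3, 2, 1])

# With axis=0 the factor labelled 1 is applied to the row labelled 1,
# regardless of where it sits inside the Series.
aligned = df.multiply(factors, axis=0)
print(aligned)
```

If positional matching is what you actually want, passing a plain array (for example `factors.to_numpy()`) instead of the Series should bypass the label alignment.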
Another way is to create a list of columns and join them:
cols = [pd.DataFrame(df[col] * df3.col1, columns=[col]) for col in df]
mul = cols[0].join(cols[1:])