
How can I multiply two dataframes with different column labels in pandas?

I'm trying to multiply (add/divide/etc.) two dataframes that have different column labels.

I'm sure this is possible, but what's the best way to do it? I've tried using rename to change the columns on one df first, but (1) I'd rather not do that and (2) my real data has a multiindex on the columns (where only one layer of the multiindex is differently labeled), and rename seems tricky for that case...

So, to generalize my question: how can I compute df1 * df2 using a mapping that defines which columns to multiply together?

import pandas as pd

df1 = pd.DataFrame([[1, 2, 3]] * 3, index=['1', '2', '3'], columns=['a', 'b', 'c'])
df2 = pd.DataFrame([[4, 5, 6]] * 3, index=['1', '2', '3'], columns=['d', 'e', 'f'])
mapping = {'a': 'e', 'b': 'd', 'c': 'f'}  # df1 column -> df2 column

df1 * df2 = ?

I was also troubled by this problem. It seems that pandas aligns on column names, so element-wise multiplication only works as expected when both DataFrames have the same column names.
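To make the alignment behavior concrete, here is a minimal sketch using frames shaped like the ones in the question (the data layout, three identical rows, is an assumption taken from the printed output further down the page). Because no column label is shared, the plain product contains only NaN:

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3]] * 3, index=['1', '2', '3'], columns=['a', 'b', 'c'])
df2 = pd.DataFrame([[4, 5, 6]] * 3, index=['1', '2', '3'], columns=['d', 'e', 'f'])

# Arithmetic between DataFrames aligns on row AND column labels.
# df1 and df2 share no column label, so the result holds the union
# of the columns and every cell is NaN.
product = df1 * df2
print(product.shape)               # (3, 6)
print(product.isna().all().all())  # True
```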

I searched a lot, and the only example I found, in the "setting with enlargement" part of the docs, adds a single column to a DataFrame.

For your question,

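import numpy as np

# pd.np was removed in pandas 2.0; passing a plain array skips label alignment
rs = np.multiply(ds2, ds1.to_numpy())

rs will have the same column names as ds2.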

Suppose we want to multiply several columns by several other columns in the same DataFrame and append the results to the original DataFrame.

For example, if the columns of ds1 and ds2 all live in the same DataFrame ds, we can write:

# columns are paired by position: a*d, b*e, c*f
ds[['r1', 'r2', 'r3']] = ds[['a', 'b', 'c']].to_numpy() * ds[['d', 'e', 'f']].to_numpy()

I hope this helps.

Updated solution now that pd.np is deprecated: df1.multiply(np.array(df2))

This keeps the column names of df1 and multiplies them by the columns of df2 in positional order.
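Since this approach pairs columns by position, the mapping from the question can be honored by reordering df2's columns first. A minimal sketch, assuming the frames from the question and a hypothetical intermediate named df2_ordered:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3]] * 3, index=['1', '2', '3'], columns=['a', 'b', 'c'])
df2 = pd.DataFrame([[4, 5, 6]] * 3, index=['1', '2', '3'], columns=['d', 'e', 'f'])
mapping = {'a': 'e', 'b': 'd', 'c': 'f'}  # df1 column -> df2 column

# Select df2's columns in the order that pairs them with df1's columns.
df2_ordered = df2[[mapping[col] for col in df1.columns]]

# Multiplying by a plain array is positional, so df1's labels are kept.
result = df1.multiply(np.array(df2_ordered))
print(result)
#    a  b   c
# 1  5  8  18
# 2  5  8  18
# 3  5  8  18
```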

I just stumbled onto the same problem. It seems like pandas wants both the column and row index to be aligned to do the element-wise multiplication, so you can just rename with your mapping during the multiplication:

>>> import pandas as pd
>>> df1 = pd.DataFrame([[1, 2, 3]] * 3, index=['1', '2', '3'], columns=['a', 'b', 'c'])
>>> df2 = pd.DataFrame([[4, 5, 6]] * 3, index=['1', '2', '3'], columns=['d', 'e', 'f'])
>>> df1
   a  b  c
1  1  2  3
2  1  2  3
3  1  2  3
>>> df2
   d  e  f
1  4  5  6
2  4  5  6
3  4  5  6
>>> mapping = {'a' : 'e', 'b' : 'd', 'c' : 'f'}
>>> df1.rename(columns=mapping) * df2
   d  e   f
1  8  5  18
2  8  5  18
3  8  5  18

If you want the 'natural' order of columns, you can create a mapping on the fly like:

>>> df1 * df2.rename(columns=dict(zip(df2.columns, df1.columns)))

For example, to compute the Frobenius inner product of the two matrices, you could do:

>>> (df1 * df2.rename(columns=dict(zip(df2.columns, df1.columns)))).sum().sum()
96
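The question also mentions MultiIndex columns where only one level is labeled differently. DataFrame.rename accepts a level argument, so the same rename-during-multiplication idea should carry over. A minimal sketch, assuming a two-level column index with a made-up outer label 'x':

```python
import pandas as pd

# Hypothetical two-level columns; only the inner level differs between frames.
cols1 = pd.MultiIndex.from_product([['x'], ['a', 'b', 'c']])
cols2 = pd.MultiIndex.from_product([['x'], ['d', 'e', 'f']])
df1 = pd.DataFrame([[1, 2, 3]] * 2, columns=cols1)
df2 = pd.DataFrame([[4, 5, 6]] * 2, columns=cols2)
mapping = {'a': 'e', 'b': 'd', 'c': 'f'}

# level=1 restricts the renaming to the inner column level.
renamed = df1.rename(columns=mapping, level=1)

# After renaming, the labels match and ordinary alignment pairs the columns.
result = renamed * df2
print(result[('x', 'd')].tolist())  # [8, 8]
```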

This is a pretty old question and, as nnsk said, pd.np has been deprecated (it was removed in pandas 2.0).

A nice-looking solution is df1 * df2.values. This produces the element-wise product of the two DataFrames, pairing columns by position, and keeps the column names of df1.
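One caveat with multiplying by raw values: row labels are ignored too, so the rows must already be in the same order. A small sketch with made-up data showing how reindexing on df1's index first avoids a silent mismatch:

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['1', '2'], columns=['a', 'b', 'c'])
# Same row labels as df1, but stored in the opposite order.
df2 = pd.DataFrame([[10, 20, 30], [100, 200, 300]], index=['2', '1'], columns=['d', 'e', 'f'])

# Positional product: row '1' of df1 is paired with row '2' of df2.
naive = df1 * df2.to_numpy()

# Put df2's rows in df1's order before dropping the labels.
aligned = df1 * df2.reindex(df1.index).to_numpy()

print(naive.loc['1'].tolist())    # [10, 40, 90]
print(aligned.loc['1'].tolist())  # [100, 400, 900]
```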

Assuming the index is already aligned, you probably just want to put the columns of both DataFrames in the matching order and divide the .values of one by the other.

Suppose mapping = {'a': 'e', 'b': 'd', 'c': 'f'}:

v1 = df1.reindex(columns=['a', 'b', 'c']).values
v2 = df2.reindex(columns=['e', 'd', 'f']).values
# v1 is a NumPy array and has no .index, so take the index from df1
rs = pd.DataFrame(v1 / v2, index=df1.index, columns=['a', 'b', 'c'])
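The same idea can be wrapped in a small helper so that the column order comes straight from the mapping and any binary operator can be used. This is only a sketch: apply_by_mapping is a hypothetical name, not a pandas function, and it assumes the rows of both frames are already in the same order.

```python
import operator

import pandas as pd


def apply_by_mapping(left, right, mapping, op=operator.mul):
    """Apply op element-wise, pairing left[col] with right[mapping[col]]."""
    left_cols = list(mapping)
    right_cols = [mapping[col] for col in left_cols]
    # Work on raw arrays so that column labels do not trigger alignment.
    values = op(left[left_cols].to_numpy(), right[right_cols].to_numpy())
    return pd.DataFrame(values, index=left.index, columns=left_cols)


df1 = pd.DataFrame([[1, 2, 3]] * 3, index=['1', '2', '3'], columns=['a', 'b', 'c'])
df2 = pd.DataFrame([[4, 5, 6]] * 3, index=['1', '2', '3'], columns=['d', 'e', 'f'])
mapping = {'a': 'e', 'b': 'd', 'c': 'f'}

product = apply_by_mapping(df1, df2, mapping)                     # a*e, b*d, c*f
quotient = apply_by_mapping(df1, df2, mapping, operator.truediv)  # a/e, b/d, c/f
```

Passing the operator as an argument covers the add/divide/etc. variants from the question without repeating the reordering logic.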

Another solution, assuming the index and columns of both DataFrames are already in matching positions:

df_mul = pd.DataFrame(df1.values * df2.values, columns=df1.columns, index=df1.index)

The technical posts on this site are licensed under CC BY-SA 4.0.