
Multiplication on two completely unequal pandas DataFrames

I am wondering how to do this multiplication on two unequal pandas DataFrames. I tried df1.multiply(df2["sales"], axis="index"), but it didn't succeed. Any suggestions?

import pandas as pd

abc = {
    'Monday': {'apple': 0.62, 'orange': 0.37},
    'Tuesday': {'apple': 0.83, 'orange': 0.17},
    'Wednesday': {'apple': 0.40, 'orange': 0.60}
}

df1 = pd.DataFrame.from_dict(abc)


         Monday  Tuesday  Wednesday
apple  0.620000 0.830000   0.400000
orange 0.370000 0.170000   0.600000

efd = {
    'sales': {
        'Japan': -1,
        'US': -2,
        'UK': -3,
        'EU': -4,
        'SA': -5,
        'AUS': -6
    }
}

df2 = pd.DataFrame.from_dict(efd)

       sales
AUS       -6
EU        -4
Japan     -1
SA        -5
UK        -3
US        -2

Expected output:

                Monday   Tuesday  Wednesday
apple_Japan  -0.620000 -0.830000  -0.400000
apple_US     -1.240000 -1.660000  -0.800000
apple_UK     -1.860000 -2.490000  -1.200000
apple_EU     -2.480000 -3.320000  -1.600000
apple_SA     -3.100000 -4.150000  -2.000000
apple_AUS    -3.720000 -4.980000  -2.400000
orange_Japan -0.370000 -0.170000  -0.600000
orange_US    -0.740000 -0.340000  -1.200000
orange_UK    -1.110000 -0.510000  -1.800000
orange_EU    -1.480000 -0.680000  -2.400000
orange_SA    -1.850000 -0.850000  -3.000000
orange_AUS   -2.220000 -1.020000  -3.600000
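
For context on why the attempt in the question does not work: DataFrame.multiply with axis="index" aligns the Series with the DataFrame by index label, not by position. Since df1 is indexed by product and df2 by country, no labels match, and the result is the union of both indexes filled with NaN. A minimal sketch, rebuilding the two frames from the data above (the name `attempt` is only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({
    'Monday': {'apple': 0.62, 'orange': 0.37},
    'Tuesday': {'apple': 0.83, 'orange': 0.17},
    'Wednesday': {'apple': 0.40, 'orange': 0.60},
})
df2 = pd.DataFrame({'sales': {'Japan': -1, 'US': -2, 'UK': -3,
                              'EU': -4, 'SA': -5, 'AUS': -6}})

# Label alignment: 'apple'/'orange' never match 'Japan', 'US', ...,
# so every cell of the result is NaN.
attempt = df1.multiply(df2["sales"], axis="index")

print(attempt.shape)             # 2 product rows + 6 country rows
print(attempt.isna().all().all())
```

What is needed instead is every product row paired with every country, which is a cross product rather than an aligned multiplication.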

As @furas suggests, this is not a normal multiplication; however, the following approach will work:

from collections import defaultdict

import pandas as pd


def combineFrames(products: pd.DataFrame, sales: pd.DataFrame) -> pd.DataFrame:
    """Multiply every product row by every sales value (a row-wise cross product)."""
    rslt = defaultdict(list)
    cols = list(products.columns)
    for c in sales.index:
        # Scalar multiplier for this country (first column of the sales frame).
        factor = sales.loc[c].iloc[0]
        for p in products.index:
            rslt['Prod'].append(f"{p}_{c}")
            # Scale this product's row by the country's sales value.
            vals = products.loc[p].to_numpy() * factor
            for idx, colhd in enumerate(cols):
                rslt[colhd].append(vals[idx])
    df = pd.DataFrame(rslt)
    df.set_index('Prod', drop=True, inplace=True)
    return df

Executing combineFrames(df1, df2) will produce:

              Monday  Tuesday  Wednesday
Prod
apple_AUS      -3.72    -4.98       -2.4
orange_AUS     -2.22    -1.02       -3.6
apple_EU       -2.48    -3.32       -1.6
orange_EU      -1.48    -0.68       -2.4
apple_Japan    -0.62    -0.83       -0.4
orange_Japan   -0.37    -0.17       -0.6
apple_SA       -3.10    -4.15       -2.0
orange_SA      -1.85    -0.85       -3.0
apple_UK       -1.86    -2.49       -1.2
orange_UK      -1.11    -0.51       -1.8
apple_US       -1.24    -1.66       -0.8
orange_US      -0.74    -0.34       -1.2

This produces the desired output, where each product's daily values are multiplied by each country's sales value.
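
The same pairing can also be sketched without explicit loops by using NumPy broadcasting. This is an alternative technique, not the code from the answer above, and the helper name `cross_multiply` is hypothetical. It assumes the sales values are passed as a one-dimensional Series:

```python
import pandas as pd

df1 = pd.DataFrame({
    'Monday': {'apple': 0.62, 'orange': 0.37},
    'Tuesday': {'apple': 0.83, 'orange': 0.17},
    'Wednesday': {'apple': 0.40, 'orange': 0.60},
})
df2 = pd.DataFrame({'sales': {'Japan': -1, 'US': -2, 'UK': -3,
                              'EU': -4, 'SA': -5, 'AUS': -6}})


def cross_multiply(products: pd.DataFrame, sales: pd.Series) -> pd.DataFrame:
    # (n_products, 1, n_days) * (1, n_countries, 1)
    #   -> (n_products, n_countries, n_days)
    cube = products.to_numpy()[:, None, :] * sales.to_numpy()[None, :, None]
    # Labels are built product-major, matching the row order after reshape.
    labels = [f"{p}_{c}" for p in products.index for c in sales.index]
    return pd.DataFrame(cube.reshape(len(labels), -1),
                        index=labels, columns=products.columns)


out = cross_multiply(df1, df2['sales'])
print(out)
```

Rows come out grouped by product (all apple rows, then all orange rows), whereas combineFrames groups them by country; the values are the same.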

Use MultiIndex.from_product to build every (product, country) pair, then DataFrame.reindex both DataFrames onto that MultiIndex so that they can be multiplied with DataFrame.mul:

mux = pd.MultiIndex.from_product([df1.index, df2.index])

df = df1.reindex(mux, level=0).mul(df2.reindex(mux, level=1)['sales'], axis=0)
df.index = [f'{a}_{b}' for a, b in df.index]
print(df)

              Monday  Tuesday  Wednesday
apple_AUS      -3.72    -4.98       -2.4
apple_EU       -2.48    -3.32       -1.6
apple_Japan    -0.62    -0.83       -0.4
apple_SA       -3.10    -4.15       -2.0
apple_UK       -1.86    -2.49       -1.2
apple_US       -1.24    -1.66       -0.8
orange_AUS     -2.22    -1.02       -3.6
orange_EU      -1.48    -0.68       -2.4
orange_Japan   -0.37    -0.17       -0.6
orange_SA      -1.85    -0.85       -3.0
orange_UK      -1.11    -0.51       -1.8
orange_US      -0.74    -0.34       -1.2
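
To see why the reindex step makes the multiplication line up, it can help to look at the two intermediate objects. The sketch below rebuilds the frames from the question; the names `left` and `right` are only illustrative. Reindexing with level=0 repeats each product row once per country, and reindexing with level=1 repeats each sales value once per product, so both sides end up with the same 12-row MultiIndex:

```python
import pandas as pd

df1 = pd.DataFrame({
    'Monday': {'apple': 0.62, 'orange': 0.37},
    'Tuesday': {'apple': 0.83, 'orange': 0.17},
    'Wednesday': {'apple': 0.40, 'orange': 0.60},
})
df2 = pd.DataFrame({'sales': {'Japan': -1, 'US': -2, 'UK': -3,
                              'EU': -4, 'SA': -5, 'AUS': -6}})

mux = pd.MultiIndex.from_product([df1.index, df2.index])

# Each product row is broadcast across all countries (matched on level 0).
left = df1.reindex(mux, level=0)
# Each sales value is broadcast across all products (matched on level 1).
right = df2.reindex(mux, level=1)['sales']

print(left)
print(right)
```

Because `left` and `right` share the same index, mul(..., axis=0) multiplies them row by row with no NaN from misaligned labels.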

