
How to multiply two dataframes of different shapes

I have two dataframes.

The first dataframe, df1, looks like this:

    variable            value
0   plastic             5774
2   glass               42
4   ferrous metal       642
6   non-ferrous metal   14000
8   paper               4000

Here is the head of the second dataframe, df2:

waste_type           total_waste_recycled_tonne   year   energy_saved
non-ferrous metal    160400.0                     2015    NaN
glass                14600.0                      2015    NaN
ferrous metal        15200                        2015    NaN
plastic              766800                       2015    NaN

I want to fill the energy_saved column in df2 by multiplying total_waste_recycled_tonne in df2 by the value in df1 for the matching waste type.

For example:

For plastic: 5774 will be multiplied by the total_waste_recycled_tonne of every row in df2 whose waste_type is plastic.

i.e.: energy_saved = 5774 * 766800

Here is what I tried:

df2["energy_saved"] = df1[df1["variable"]=="plastic"]["value"].values[0] * df2["total_waste_recycled_tonne"][df2["waste_type"]=="plastic"]

However, the problem is that when I repeat this for the other waste types, the rows I had already filled change back to NaN. Is there a better approach to handle this?
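The NaN values come from index alignment: assigning a filtered Series to df2["energy_saved"] replaces the whole column, and rows missing from the right-hand side become NaN. A minimal sketch of a per-type loop that only writes to the matching rows, assuming a cut-down two-row version of the question's data:

```python
import pandas as pd

# Cut-down sample data (assumption: only two waste types, for brevity).
df1 = pd.DataFrame({"variable": ["plastic", "glass"], "value": [5774, 42]})
df2 = pd.DataFrame({
    "waste_type": ["glass", "plastic"],
    "total_waste_recycled_tonne": [14600.0, 766800.0],
    "energy_saved": [float("nan"), float("nan")],
})

for waste in ["glass", "plastic"]:
    # Conversion factor for this waste type from df1.
    factor = df1.loc[df1["variable"] == waste, "value"].iloc[0]
    # Boolean mask selecting only the rows of this waste type.
    mask = df2["waste_type"] == waste
    # .loc with the mask writes to those rows and leaves the others untouched.
    df2.loc[mask, "energy_saved"] = factor * df2.loc[mask, "total_waste_recycled_tonne"]

print(df2)
```

This works but loops over the types one at a time; the vectorised approaches in the answers avoid the loop.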

Use map:

df2['energy_saved'] = (df2['waste_type'].map(df1.set_index('variable')['value'])
                          .mul(df2['total_waste_recycled_tonne'])
                      )
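For reference, here is a self-contained sketch of the map approach. The data is copied from the question (df2 is limited to the four rows shown in its head), and the name factors is just an illustrative variable:

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame(
    {"variable": ["plastic", "glass", "ferrous metal", "non-ferrous metal", "paper"],
     "value": [5774, 42, 642, 14000, 4000]},
    index=[0, 2, 4, 6, 8],
)
df2 = pd.DataFrame({
    "waste_type": ["non-ferrous metal", "glass", "ferrous metal", "plastic"],
    "total_waste_recycled_tonne": [160400.0, 14600.0, 15200.0, 766800.0],
    "year": [2015] * 4,
    "energy_saved": [float("nan")] * 4,
})

# Lookup Series: waste type -> conversion factor.
factors = df1.set_index("variable")["value"]

# map() looks up each row's factor by waste_type; mul() multiplies row by row.
df2["energy_saved"] = df2["waste_type"].map(factors).mul(df2["total_waste_recycled_tonne"])
print(df2)
```

Because map returns a Series with the same index as df2, the multiplication lines up row by row without any merge.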

Try via merge() and pass how='right':

df=df1[['variable','value']].merge(df2[['waste_type','total_waste_recycled_tonne']],left_on='variable',right_on='waste_type',how='right')

Finally:

df2["energy_saved"]=df['value'].mul(df['total_waste_recycled_tonne'])

Output of df2:

    waste_type          total_waste_recycled_tonne  year    energy_saved
0   non-ferrous metal   160400.0                    2015    2.245600e+09
1   glass               14600.0                     2015    6.132000e+05
2   ferrous metal       15200.0                     2015    9.758400e+06
3   plastic             766800.0                    2015    4.427503e+09
4   plastic             762700.0                    2015    4.403830e+09
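One caveat with the merge approach: merging on columns gives the result a fresh 0..n-1 index, so assigning back to df2 relies on df2 also having a default index. A hedged sketch of a variant that assigns by position instead, assuming df2 carries a made-up non-default index (10 to 13) and that each waste type appears only once in df1:

```python
import pandas as pd

df1 = pd.DataFrame({
    "variable": ["plastic", "glass", "ferrous metal", "non-ferrous metal", "paper"],
    "value": [5774, 42, 642, 14000, 4000],
})
df2 = pd.DataFrame(
    {"waste_type": ["non-ferrous metal", "glass", "ferrous metal", "plastic"],
     "total_waste_recycled_tonne": [160400.0, 14600.0, 15200.0, 766800.0],
     "year": [2015] * 4},
    index=[10, 11, 12, 13],  # hypothetical non-default index
)

# Left merge from df2's side; validate raises if df1 has duplicate keys,
# which guarantees one output row per df2 row, in df2's order.
merged = df2.merge(df1, left_on="waste_type", right_on="variable",
                   how="left", validate="many_to_one")

# merged has a new RangeIndex, so assign by position via to_numpy().
df2["energy_saved"] = (merged["value"] * merged["total_waste_recycled_tonne"]).to_numpy()
print(df2)
```

Assigning a NumPy array bypasses index alignment, which is why the row order guarantee from validate="many_to_one" matters here.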

A set_index + reindex option:

df2['energy_saved'] = (
        df1.set_index('variable').reindex(df2['waste_type'])['value'] *
        df2.set_index('waste_type')['total_waste_recycled_tonne']
).values

df2:

          waste_type  total_waste_recycled_tonne  year  energy_saved
0  non-ferrous metal                    160400.0  2015  2.245600e+09
1              glass                     14600.0  2015  6.132000e+05
2      ferrous metal                     15200.0  2015  9.758400e+06
3            plastic                    766800.0  2015  4.427503e+09
4            plastic                    762700.0  2015  4.403830e+09
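All of the approaches above leave NaN in energy_saved for any waste_type that has no match in df1. A small sketch of how such rows could be spotted, using a hypothetical "wood" row that is not in the question's data:

```python
import pandas as pd

df1 = pd.DataFrame({"variable": ["plastic", "glass"], "value": [5774, 42]})
df2 = pd.DataFrame({
    # "wood" is a made-up waste type with no factor in df1.
    "waste_type": ["plastic", "glass", "wood"],
    "total_waste_recycled_tonne": [766800.0, 14600.0, 1000.0],
})

factors = df1.set_index("variable")["value"]
df2["energy_saved"] = df2["waste_type"].map(factors) * df2["total_waste_recycled_tonne"]

# Waste types that produced NaN because df1 has no factor for them.
unmatched = df2.loc[df2["energy_saved"].isna(), "waste_type"].unique().tolist()
print(unmatched)
```

Whether to leave those rows as NaN, drop them, or fill them with a default is a judgment call that depends on the data.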
