How to multiply two dataframes of different shapes
I have two dataframes. The first dataframe, df1, looks like this:
            variable  value
0            plastic   5774
2              glass     42
4      ferrous metal    642
6  non-ferrous metal  14000
8              paper   4000
Here is the head of the second dataframe, df2:
       waste_type  total_waste_recycled_tonne  year  energy_saved
non-ferrous metal                    160400.0  2015           NaN
            glass                     14600.0  2015           NaN
    ferrous metal                       15200  2015           NaN
          plastic                      766800  2015           NaN
I want to fill the energy_saved column in df2 by multiplying total_waste_recycled_tonne in df2 by the value in df1 whose variable matches the row's waste_type.
For example, for plastic: 5774 should be multiplied by total_waste_recycled_tonne in every df2 row whose waste_type is plastic, i.e. energy_saved = 5774 * 766800.
Here is what I tried:
df2["energy_saved"] = df1[df1["variable"]=="plastic"]["value"].values[0] * df2["total_waste_recycled_tonne"][df2["waste_type"]=="plastic"]
However, when I repeat this for the other waste types, the rows I had already filled change back to NaN. Is there a better approach to handle this?
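A likely explanation for the NaN values, sketched on a reduced copy of the data (the three-row df2 and the helper names factor, mask, glass_factor and glass_mask are made up for illustration): the right-hand side only contains the plastic rows, and assigning it to the whole column aligns on the index, so every other row is overwritten with NaN. Restricting the assignment to the matching rows with .loc keeps earlier results.

```python
import pandas as pd

# Reduced, illustrative copies of the two frames from the question.
df1 = pd.DataFrame({"variable": ["plastic", "glass"], "value": [5774, 42]})
df2 = pd.DataFrame({
    "waste_type": ["glass", "plastic", "plastic"],
    "total_waste_recycled_tonne": [14600.0, 766800.0, 762700.0],
})

factor = df1.loc[df1["variable"] == "plastic", "value"].iloc[0]
mask = df2["waste_type"] == "plastic"

# Whole-column assignment: the right-hand side has only the plastic rows,
# so index alignment leaves NaN in every row that is not plastic.
df2["energy_saved"] = factor * df2.loc[mask, "total_waste_recycled_tonne"]
nan_rows_after_whole_column = int(df2["energy_saved"].isna().sum())

# Row-restricted assignment: only the glass rows are written,
# so the plastic results computed above are left untouched.
glass_factor = df1.loc[df1["variable"] == "glass", "value"].iloc[0]
glass_mask = df2["waste_type"] == "glass"
df2.loc[glass_mask, "energy_saved"] = (
    glass_factor * df2.loc[glass_mask, "total_waste_recycled_tonne"]
)

print(df2)
```

This still needs one statement per waste type, which is why a lookup-based approach is usually more convenient.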
Use map:
df2['energy_saved'] = (df2['waste_type'].map(df1.set_index('variable')['value'])
                       .mul(df2['total_waste_recycled_tonne']))
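For reference, here is a self-contained sketch of the map approach. The frames are rebuilt by hand from the values shown in the question and in the printed output, and the intermediate names lookup and factor are only illustrative. Any waste_type missing from df1 would map to NaN and therefore give NaN in energy_saved.

```python
import pandas as pd

# Frames rebuilt by hand from the question (assumed to match the real data).
df1 = pd.DataFrame(
    {
        "variable": ["plastic", "glass", "ferrous metal", "non-ferrous metal", "paper"],
        "value": [5774, 42, 642, 14000, 4000],
    },
    index=[0, 2, 4, 6, 8],
)
df2 = pd.DataFrame({
    "waste_type": ["non-ferrous metal", "glass", "ferrous metal", "plastic", "plastic"],
    "total_waste_recycled_tonne": [160400.0, 14600.0, 15200.0, 766800.0, 762700.0],
    "year": [2015] * 5,
})

# Series indexed by material name: acts as a lookup table for map().
lookup = df1.set_index("variable")["value"]

# One factor per df2 row, in df2's own row order and index.
factor = df2["waste_type"].map(lookup)

df2["energy_saved"] = factor.mul(df2["total_waste_recycled_tonne"])
print(df2)
```

Because map returns a Series on df2's own index, the result lines up row by row without any reordering.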
Try merge() and pass how='right':
df=df1[['variable','value']].merge(df2[['waste_type','total_waste_recycled_tonne']],left_on='variable',right_on='waste_type',how='right')
Finally:
df2["energy_saved"]=df['value'].mul(df['total_waste_recycled_tonne'])
Output of df2:
          waste_type  total_waste_recycled_tonne  year  energy_saved
0  non-ferrous metal                    160400.0  2015  2.245600e+09
1              glass                     14600.0  2015  6.132000e+05
2      ferrous metal                     15200.0  2015  9.758400e+06
3            plastic                    766800.0  2015  4.427503e+09
4            plastic                    762700.0  2015  4.403830e+09
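One caveat with the merge route: the merged frame gets a fresh 0..n-1 index, so assigning the product back relies on df2 having a default index too. Below is a variant sketch, not the code from this answer, that merges from df2's side with how='left' and assigns by position. The non-default index on df2 and the name merged are assumptions made to show the point; validate='many_to_one' is a standard merge option that raises if df1 lists a material more than once.

```python
import pandas as pd

df1 = pd.DataFrame({
    "variable": ["plastic", "glass", "ferrous metal", "non-ferrous metal", "paper"],
    "value": [5774, 42, 642, 14000, 4000],
})
# Deliberately non-default index, to show the assignment does not rely on it.
df2 = pd.DataFrame(
    {
        "waste_type": ["non-ferrous metal", "glass", "ferrous metal", "plastic", "plastic"],
        "total_waste_recycled_tonne": [160400.0, 14600.0, 15200.0, 766800.0, 762700.0],
    },
    index=[10, 11, 12, 13, 14],
)

# Left merge keeps one output row per df2 row, in df2's order.
merged = df2.merge(
    df1[["variable", "value"]],
    left_on="waste_type",
    right_on="variable",
    how="left",
    validate="many_to_one",  # fail loudly if df1 has duplicate materials
)

# to_numpy() drops merged's 0..n-1 index, so values are assigned by position.
df2["energy_saved"] = (merged["value"] * merged["total_waste_recycled_tonne"]).to_numpy()
print(df2)
```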
A set_index + reindex option:
df2['energy_saved'] = (
    df1.set_index('variable').reindex(df2['waste_type'])['value'] *
    df2.set_index('waste_type')['total_waste_recycled_tonne']
).values
df2:
          waste_type  total_waste_recycled_tonne  year  energy_saved
0  non-ferrous metal                    160400.0  2015  2.245600e+09
1              glass                     14600.0  2015  6.132000e+05
2      ferrous metal                     15200.0  2015  9.758400e+06
3            plastic                    766800.0  2015  4.427503e+09
4            plastic                    762700.0  2015  4.403830e+09