Create column from multiple dataframes
I need to create some new columns based on the value of a dataframe field and a lookup dataframe with some rates.
Given df1 as:
   zone     hh  hhind
0    14  112.0    3.4
1    15    5.0    4.4
2    16    0.0    1.0
and a look_up dataframe as:
    ind   per1   per2   per3   per4
0   1.0  1.000  0.000  0.000  0.000
24  3.4  0.145  0.233  0.165  0.457
34  4.4  0.060  0.114  0.075  0.751
How can I create df1.hh1 by multiplying df1.hh by look_up.per1, matching rows on df1.hhind and look_up.ind? The expected result is:
   zone     hh  hhind     hh1
0    14  112.0    3.4  16.240
1    15    5.0    4.4   0.300
2    16    0.0    1.0   0.000
At the moment I'm getting the result by merging the tables and then doing the arithmetic:
r = pd.merge(df1, look_up, left_on="hhind", right_on="ind")
r["hh1"] = r.hh * r.per1
I'd like to know if there is a more straightforward way to accomplish this without merging the tables.
You could first set hhind and ind as the index of the df1 and look_up dataframes, respectively. Then multiply the corresponding elements of hh and per1 element-wise. Finally, map these results onto the hhind column and assign them to a new column, as shown:
mapper = df1.set_index('hhind')['hh'].mul(look_up.set_index('ind')['per1'])
df1.assign(hh1=df1['hhind'].map(mapper))
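For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the two dataframes from the question and runs the same two lines. The variable name result is my own addition. Note that assign() returns a new DataFrame rather than modifying df1 in place, so the output has to be kept.

```python
import pandas as pd

# Sample data copied from the tables in the question.
df1 = pd.DataFrame({
    "zone": [14, 15, 16],
    "hh": [112.0, 5.0, 0.0],
    "hhind": [3.4, 4.4, 1.0],
})
look_up = pd.DataFrame(
    {
        "ind": [1.0, 3.4, 4.4],
        "per1": [1.000, 0.145, 0.060],
        "per2": [0.000, 0.233, 0.114],
        "per3": [0.000, 0.165, 0.075],
        "per4": [0.000, 0.457, 0.751],
    },
    index=[0, 24, 34],
)

# Index both sides by the key so that mul() aligns hh with per1.
mapper = df1.set_index("hhind")["hh"].mul(look_up.set_index("ind")["per1"])

# assign() returns a new DataFrame; "result" is a name chosen for this sketch.
result = df1.assign(hh1=df1["hhind"].map(mapper))
print(result.round(3))
```

With the question's data, the hh1 column should match the expected values in the question up to floating-point rounding.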
Another solution:
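# .iloc[0] pulls the matching rate out as a scalar; it raises IndexError if hhind has no match in look_up
df1['hh1'] = df1['hhind'].map(lambda x: look_up.loc[look_up['ind'] == x, 'per1'].iloc[0]) * df1['hh']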
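A variant of the same idea that avoids filtering look_up once per row is to pass a Series keyed by ind straight to map(). This is only a sketch under a few assumptions: ind is unique in look_up, the name rates is my own, and the extra zone 17 row is hypothetical data added to show that repeated hhind values in df1 are handled.

```python
import pandas as pd

df1 = pd.DataFrame({
    "zone": [14, 15, 16, 17],        # zone 17 is a hypothetical extra row
    "hh": [112.0, 5.0, 0.0, 10.0],
    "hhind": [3.4, 4.4, 1.0, 3.4],   # 3.4 appears twice on purpose
})
look_up = pd.DataFrame({
    "ind": [1.0, 3.4, 4.4],
    "per1": [1.000, 0.145, 0.060],
})

# Series of rates keyed by ind; this assumes ind is unique in look_up.
rates = look_up.set_index("ind")["per1"]

# Look up the rate for each row, then multiply by hh.
# hhind values with no match in look_up would give NaN instead of an error.
df1["hh1"] = df1["hh"] * df1["hhind"].map(rates)
print(df1.round(3))
```

Since both keys are floats, matching relies on exact equality, so this is only as reliable as the values in hhind and ind are consistent with each other.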