Pandas: Updating multiple columns in a dataframe based on values from another dataframe
I have two dataframes of different dimensions. I need to update msg_count in df1 from df2, but only for rows where the [UserID, Month] values of df1 and df2 match.
My data is as follows:
df1:
UserID Month A B C D E F msg_count
knaas 1/1/2017 0 0 0 0 0 0 0
knaas 2/1/2017 0 0 0 0 0 0 0
knaas 3/1/2017 0 0 0 0 0 0 0
knaas 4/1/2017 0 0 0 2 0 0 0
knaas 5/1/2017 0 0 0 0 0 0 0
knaas 6/1/2017 0 0 0 0 0 0 0
knaas 7/1/2017 0 0 0 0 0 0 0
knaas 8/1/2017 0 0 0 0 0 0 0
knaas 9/1/2017 0 0 0 0 0 0 0
knaas 10/1/2017 0 0 0 0 0 0 0
knaas 11/1/2017 0 0 0 0 0 0 0
knaas 12/1/2017 0 0 0 0 0 0 0
ArtCort0324 1/1/2017 0 0 0 0 0 0 0
ArtCort0324 2/1/2017 0 2 0 2 0 0 0
ArtCort0324 3/1/2017 0 0 0 0 0 0 0
ArtCort0324 4/1/2017 0 1 1 0 0 0 0
ArtCort0324 5/1/2017 0 0 0 3 0 0 0
ArtCort0324 6/1/2017 0 0 0 0 0 0 9
df2:
UserID Month msg_count
ArtCort0324 1/1/2017 0
ArtCort0324 2/1/2017 0
ArtCort0324 3/1/2017 0
ArtCort0324 4/1/2017 0
ArtCort0324 5/1/2017 0
ArtCort0324 6/1/2017 9
ArtCort0324 7/1/2017 0
ArtCort0324 8/1/2017 0
ArtCort0324 9/1/2017 0
ArtCort0324 10/1/2017 0
ArtCort0324 11/1/2017 0
ArtCort0324 12/1/2017 0
I have tried the following code snippets, but they didn't work as expected:
res = df2.set_index(['UserID', 'Month'])\
    .combine_first(df1.set_index(['UserID', 'Month']))\
    .reset_index()

# `gitter` is not defined in the post; presumably it is the second dataframe
updated_new = df1.merge(gitter, how='left', on=['UserID', 'Month'],
                        suffixes=('', '_new'))
updated_new['msg_count'] = np.where(pd.notnull(updated_new['msg_count_new']),
                                    updated_new['msg_count_new'],
                                    updated_new['msg_count'])
I need the output as below:
UserID Month A B C D E F msg_count
knaas 1/1/2017 0 0 0 0 0 0 0
knaas 2/1/2017 0 0 0 0 0 0 0
knaas 3/1/2017 0 0 0 0 0 0 0
knaas 4/1/2017 0 0 0 2 0 0 0
knaas 5/1/2017 0 0 0 0 0 0 0
knaas 6/1/2017 0 0 0 0 0 0 0
knaas 7/1/2017 0 0 0 0 0 0 0
knaas 8/1/2017 0 0 0 0 0 0 0
knaas 9/1/2017 0 0 0 0 0 0 0
knaas 10/1/2017 0 0 0 0 0 0 0
knaas 11/1/2017 0 0 0 0 0 0 0
knaas 12/1/2017 0 0 0 0 0 0 0
ArtCort0324 1/1/2017 0 0 0 0 0 0 0
ArtCort0324 2/1/2017 1 0 0 0 0 0 0
ArtCort0324 3/1/2017 0 0 0 0 0 0 50
ArtCort0324 4/1/2017 0 0 0 0 0 0 0
I have added a column msg_count to df1 with a default value of 0. I need to update msg_count in df1 with the value of msg_count from df2, only if UserID and Month are equal in both dataframes.
It sounds like you want a merge:
df_merge = pd.merge(left=df1, right=df2, on=['UserID', 'Month'], how='left')
You may want to set how to 'inner', 'outer', etc., depending on which rows you want to keep.
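Since both dataframes have a msg_count column, a plain merge leaves you with two suffixed copies of it, so one more step is needed to fold df2's value back into df1. Here is a minimal sketch of that, not a definitive solution. The two small frames are illustrative stand-ins for the question's data, the `_new` suffix and the names `merged` and `result` are my own choices, and it assumes each (UserID, Month) pair appears at most once in df2 (otherwise the left merge would duplicate rows of df1).

```python
import pandas as pd

# Small stand-ins for the question's df1 and df2 (values are illustrative).
df1 = pd.DataFrame({
    "UserID": ["knaas", "knaas", "ArtCort0324", "ArtCort0324"],
    "Month": ["5/1/2017", "6/1/2017", "5/1/2017", "6/1/2017"],
    "D": [0, 0, 3, 0],
    "msg_count": [0, 0, 0, 0],
})
df2 = pd.DataFrame({
    "UserID": ["ArtCort0324", "ArtCort0324", "ArtCort0324"],
    "Month": ["5/1/2017", "6/1/2017", "7/1/2017"],
    "msg_count": [0, 9, 0],
})

# A left merge keeps every row of df1. df2's msg_count arrives as
# msg_count_new and is NaN wherever no (UserID, Month) pair matches.
merged = df1.merge(df2, on=["UserID", "Month"], how="left",
                   suffixes=("", "_new"))

# Take df2's value where a match exists, otherwise keep df1's original value.
merged["msg_count"] = (
    merged["msg_count_new"].fillna(merged["msg_count"]).astype(int)
)
result = merged.drop(columns="msg_count_new")

print(result)
```

With how='left', the df2 row for 7/1/2017 is dropped because df1 has no matching row, and the knaas rows keep their original msg_count because df2 has nothing for them.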