
Compare columns from different Pandas dataframes, and replace their values <Pandas, Python>

I have two similar data frames (named dfA and dfB), each with an ID column and a code column. ID is the patient ID, and code is the disease code. dfA is larger, and all IDs in dfB are also in dfA. However, the disease codes in dfA are somewhat old and need to be updated with the codes in dfB.

My task is to compare every ID in dfA with dfB and, where there is a matching ID, replace the disease code in dfA with the one from dfB, so that the codes are properly updated. The final result is dfA with the updated codes.

In regular Python, it would look something like this:

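for i in dfA.index:
    # look up the code in dfB whose id matches this row of dfA
    match = dfB.loc[dfB.id == dfA.at[i, 'id'], 'code']
    if not match.empty:
        # replace dfA.code with dfB.code
        dfA.at[i, 'code'] = match.iloc[0]
print(dfA)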

I know Pandas handles loops differently, and that this can be done without a loop. I tried something like this, but it does not work:

dfA.where(dfA.id == dfB.id), 'code'][dfB.code]

Could you shed some light on this? Thank you.

Using map and fillna:

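# map each dfA.id to its code in dfB (NaN where the id is not in dfB)
s = dfA.id.map(dfB.set_index('id').code)
# keep the old dfA code wherever dfB had no matching id
dfA.code = s.fillna(dfA.code)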
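
To see the two lines above in action, here is a minimal self-contained sketch. The sample ids and codes are made up for illustration and are not from the question. It also assumes each id appears at most once in dfB, since the lookup built by set_index is meant to give one code per id.

```python
import pandas as pd

# Hypothetical sample data: dfA holds the older codes, dfB the updated ones.
dfA = pd.DataFrame({"id": [1, 2, 3, 4], "code": ["A10", "B20", "C30", "D40"]})
dfB = pd.DataFrame({"id": [2, 4], "code": ["B21", "D41"]})

# Build an id -> new code lookup and map it onto dfA's ids.
# Ids that are missing from dfB come back as NaN.
lookup = dfB.set_index("id")["code"]
updated = dfA["id"].map(lookup)

# Fall back to the existing dfA code wherever there was no match.
dfA["code"] = updated.fillna(dfA["code"])

print(dfA)
```

Rows with ids 2 and 4 pick up the codes from dfB, while ids 1 and 3 keep their original codes because fillna restores them.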
