Replace dataframe column values with values from another dataframe column
Let me explain my problem a bit more. I have a dataframe with ID, name and surname; let's call it df_src. For example:
ID Name Surname
177015H LAURE Thomas
198786X ANGEARD Audrey
136235G EYSSERIC Laurent
198786X ANGEARD Audrey
In this dataframe, several rows are duplicated, because one person can manage several different people.
On the other hand, my second dataframe contains each of the previous rows without the duplicates, plus pseudonymized data; let's call it df_tem. For example:
ID Name Surname FakeID FakeName FakeSurname
177015H LAURE Thomas 127345H ELOR Lori
198786X ANGEARD Audrey 112846X RELARD Pierre
136235G EYSSERIC Laurent 108456G SERIC Marc
... ... ... .... ... ...
What I want to accomplish is to replace every row of df_src that matches a row of df_tem with the corresponding fake values. For example, replace every occurrence of 177015H LAURE Thomas with 127345H ELOR Lori, and so on.
I tried to use
df_src.replace(to_replace=df_src['column'], value=df_tem['column'], inplace=True)
only to get None in return. I have been at it for several hours without finding a way to do this with pandas.
Do you have any idea? Any help will be appreciated.
I would merge both dataframes on the real columns, drop those columns, and then rename the fake columns:
df = df_src.merge(df_tem, on=["ID", "Name", "Surname"], how="left"
).drop(columns=["ID", "Name", "Surname"]
).rename(columns={"FakeID": "ID", "FakeName": "Name", "FakeSurname": "Surname"})
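To make the merge approach easy to try, here is a minimal sketch that rebuilds the two dataframes from the sample rows in the question and applies the same merge, drop and rename steps. The `keys` variable is only a helper name introduced for this example, and the sketch assumes every (ID, Name, Surname) combination appears at most once in df_tem.

```python
import pandas as pd

# Sample rows copied from the question.
df_src = pd.DataFrame({
    "ID": ["177015H", "198786X", "136235G", "198786X"],
    "Name": ["LAURE", "ANGEARD", "EYSSERIC", "ANGEARD"],
    "Surname": ["Thomas", "Audrey", "Laurent", "Audrey"],
})
df_tem = pd.DataFrame({
    "ID": ["177015H", "198786X", "136235G"],
    "Name": ["LAURE", "ANGEARD", "EYSSERIC"],
    "Surname": ["Thomas", "Audrey", "Laurent"],
    "FakeID": ["127345H", "112846X", "108456G"],
    "FakeName": ["ELOR", "RELARD", "SERIC"],
    "FakeSurname": ["Lori", "Pierre", "Marc"],
})

# Hypothetical helper name: the real columns used to match rows.
keys = ["ID", "Name", "Surname"]

df = (
    # A left merge keeps every row of df_src, duplicates included.
    df_src.merge(df_tem, on=keys, how="left")
    # Drop the real values, then give the fake columns the original names.
    .drop(columns=keys)
    .rename(columns={"FakeID": "ID", "FakeName": "Name", "FakeSurname": "Surname"})
)
print(df)
```

Because the merge is a left join, any row of df_src with no match in df_tem would end up with NaN in all three columns, so it may be worth checking for missing values afterwards. If df_tem contained the same key more than once, the merge would also multiply the matching rows.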