How to only add rows from one dataframe to another where values don't match in certain columns

I have two dataframes, df1 and df2 (shown below), and I want to build df3 from them. Basically, if a row appears in both dataframes and its "Complete" column equals 'C', that row should be removed from df1; otherwise the df1 rows are kept and the remaining rows from df2 are added. Hopefully this makes sense. There may be a simple way to do this and I'm just making it sound more complicated than it actually is.

df1:

Complete    Name      Birth
C           Steve     13/07/2000
C           Mike      13/06/2000
C           Sarah     20/05/1936
C           Lewis     14/08/1955
NaN         Martin    15/04/1990
NaN         Lewis     15/04/1990


df2:

Complete    Name      Birth
NaN         Steve     13/07/2000
NaN         Mike      13/06/2000
NaN         Sarah     20/05/1936
NaN         Lewis     14/08/1955
NaN         Martin    15/04/1990
NaN         Lewis     15/04/1990
NaN         Dave      13/04/1935
NaN         Mark      14/07/1932
NaN         Steve     15/06/1970

I therefore want df1 to become:

Complete    Name      Birth
NaN         Martin    15/04/1990
NaN         Lewis     15/04/1990
NaN         Dave      13/04/1935
NaN         Mark      14/07/1932
NaN         Steve     15/06/1970

# Merge df1 with the key columns of df2. Two tricks are used here:
# 1. reset_index() ... set_index("index") carries df1's original index through the merge instead of resetting it.
# 2. indicator=True adds a "_merge" column showing whether each row was found in the left frame only or in both.
# The merge uses only "Name" and "Birth": "Complete" is 'C' in df1 but NaN in df2, so keeping it in the keys would stop completed rows from ever matching.
df = df1.reset_index().merge(df2[["Name", "Birth"]].drop_duplicates(), on=["Name", "Birth"], how="left", indicator=True).set_index("index")
# Build a boolean mask (a Series of True/False values) marking completed rows that also appear in df2.
mask = (df["_merge"] == "both") & (df["Complete"] == "C")
# Keep only the rows where the mask is False; "~" inverts the boolean values, so the masked rows are excluded.
df = df[~mask]
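
To also append the rows of df2 that are missing from df1, one possible sketch compares (Name, Birth) pairs directly with `MultiIndex.isin` rather than a merge. The frames are rebuilt from the tables in the question. The names `keys`, `pairs1`, `pairs2`, `drop_from_df1`, `new_from_df2` and `df3` are only illustrative, and the sketch assumes that "Birth" is stored as identical strings in both frames and that a row counts as a duplicate when both "Name" and "Birth" match.

```python
import numpy as np
import pandas as pd

# Sample data copied from the tables in the question.
df1 = pd.DataFrame({
    "Complete": ["C", "C", "C", "C", np.nan, np.nan],
    "Name": ["Steve", "Mike", "Sarah", "Lewis", "Martin", "Lewis"],
    "Birth": ["13/07/2000", "13/06/2000", "20/05/1936",
              "14/08/1955", "15/04/1990", "15/04/1990"],
})
df2 = pd.DataFrame({
    # object dtype so this all-NaN column concatenates cleanly with df1's.
    "Complete": pd.Series([np.nan] * 9, dtype=object),
    "Name": ["Steve", "Mike", "Sarah", "Lewis", "Martin",
             "Lewis", "Dave", "Mark", "Steve"],
    "Birth": ["13/07/2000", "13/06/2000", "20/05/1936", "14/08/1955",
              "15/04/1990", "15/04/1990", "13/04/1935", "14/07/1932",
              "15/06/1970"],
})

# Assumption: these two columns together identify a duplicate row.
keys = ["Name", "Birth"]
pairs1 = pd.MultiIndex.from_frame(df1[keys])
pairs2 = pd.MultiIndex.from_frame(df2[keys])

# Rows of df1 that are complete AND also present in df2 get dropped.
drop_from_df1 = df1["Complete"].eq("C").to_numpy() & pairs1.isin(pairs2)

# Rows of df2 whose (Name, Birth) pair does not occur in df1 get appended.
new_from_df2 = ~pairs2.isin(pairs1)

df3 = pd.concat([df1[~drop_from_df1], df2[new_from_df2]], ignore_index=True)
print(df3)
```

With the sample data this leaves Martin and Lewis (15/04/1990) from df1, followed by Dave, Mark and Steve (15/06/1970) from df2, which matches the requested result. If the real data can contain completed rows in df1 that have no counterpart in df2, those rows are kept by this sketch.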


