
How to find rows with the same values in 2 columns across 2 dataframes but different values in the other columns (pandas)

I have 2 dataframes with sample values as below:

df1 :
col1 cold2 cold3 cold4
a     bb    cc    d
b     aa    ee    e


df2 :
col1 cold2 cold3 col4
a    ee    ff    d
e    gg    hh    k

I want to find all rows across the 2 dataframes that have the same values in col1 + col4 but different values in col2 or col3.

The output should look like this:

df3:
col1 cold2 cold3 cold4
a     bb    cc    d
a     ee    ff    d

Thanks for your help.

Here is a solution using duplicated and drop_duplicates. You first have to concatenate the two dataframes, and for that the column names must be the same in both.

If your column names actually match in df1 and df2, do:

new_df = (pd.concat([df1,df2])[pd.concat([df1,df2])
                             .duplicated(subset=['col1','cold4'], keep=False)]
           .drop_duplicates(subset=['cold2', 'cold3']))

Which returns:

>>> new_df

  col1 cold2 cold3 cold4
0    a    bb    cc     d
0    a    ee    ff     d
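To reproduce this result, here is a minimal self-contained sketch. It assumes both frames really use the names col1, cold2, cold3, cold4 (the sample df2 in the question has col4 as its last column, so that name is adjusted here), and the intermediate names combined and same_keys are only illustrative.

```python
import pandas as pd

# Sample data from the question; df2's last column is named cold4 here (assumption)
df1 = pd.DataFrame({"col1": ["a", "b"], "cold2": ["bb", "aa"],
                    "cold3": ["cc", "ee"], "cold4": ["d", "e"]})
df2 = pd.DataFrame({"col1": ["a", "e"], "cold2": ["ee", "gg"],
                    "cold3": ["ff", "hh"], "cold4": ["d", "k"]})

# Concatenate once and reuse the result instead of calling pd.concat twice
combined = pd.concat([df1, df2])

# keep=False marks every row whose (col1, cold4) pair occurs more than once
same_keys = combined.duplicated(subset=["col1", "cold4"], keep=False)

# Among those rows, keep one row per distinct (cold2, cold3) pair
new_df = combined[same_keys].drop_duplicates(subset=["cold2", "cold3"])
print(new_df)
```

Note that duplicated works on the concatenated frame, so a (col1, cold4) pair repeated inside a single dataframe would be flagged as well. If that can happen in your data, a merge-based comparison may be the safer choice.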

If you need to rename your columns in df2 to match the column names of df1 without modifying the original dataframes, you can simply add this step:

concat_dfs = pd.concat([df1, df2.rename(columns={i2:i1 for i1,i2
                                         in zip(df1.columns,df2.columns)})])

new_df = (concat_dfs[concat_dfs.duplicated(subset=['col1', 'cold4'], keep=False)]
           .drop_duplicates(subset=['cold2', 'cold3']))
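The renaming step can be checked on the sample data with a small sketch like the following. It assumes the columns of df1 and df2 correspond by position, which is what the zip relies on; mapping and df2_aligned are illustrative names.

```python
import pandas as pd

# Sample data with the column names exactly as shown in the question
df1 = pd.DataFrame({"col1": ["a", "b"], "cold2": ["bb", "aa"],
                    "cold3": ["cc", "ee"], "cold4": ["d", "e"]})
df2 = pd.DataFrame({"col1": ["a", "e"], "cold2": ["ee", "gg"],
                    "cold3": ["ff", "hh"], "col4": ["d", "k"]})

# Map each df2 column name to the df1 name at the same position
mapping = dict(zip(df2.columns, df1.columns))

# rename returns a new dataframe, so df2 itself is left unchanged
df2_aligned = df2.rename(columns=mapping)

concat_dfs = pd.concat([df1, df2_aligned])
new_df = (concat_dfs[concat_dfs.duplicated(subset=["col1", "cold4"], keep=False)]
          .drop_duplicates(subset=["cold2", "cold3"]))
print(new_df)
```

Because the mapping is positional, it only makes sense when both frames list their columns in the same order.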

I think you can use:

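# get all rows that match on the key columns; df2's non-key columns get the '_' suffix
df = df1.merge(df2, on=['col1','col4'], suffixes=('','_'))
# keep only rows where col2 or col3 differs between the two dataframes
df = df[(df['col2'] != df['col2_']) | (df['col3'] != df['col3_'])]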

# select the columns that came from df1 (note: this overwrites the original df1)
df1 = df[df1.columns]
# select the suffixed columns that came from df2, plus the key columns
df2 = df[df.columns.difference(df2.columns).union(['col1','col4'])]

# stack both parts, renaming the suffixed columns so they align with df1
df = pd.concat([df1, 
                df2.rename(columns=dict(zip(df2.columns, df1.columns)))],
                ignore_index=True)
print(df)
  col1 col2 col3 col4
0    a   bb   cc    d
1    a   ee   ff    d
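For reference, here is a self-contained sketch of the merge-based approach on the sample data. It assumes both frames use the names col1, col2, col3, col4, compares each column with its suffixed counterpart (col2_ and col3_) that merge creates, and uses the illustrative names merged, differs, left, right and result so the input frames are not overwritten.

```python
import pandas as pd

# Sample data from the question, with col1..col4 as column names (assumption)
df1 = pd.DataFrame({"col1": ["a", "b"], "col2": ["bb", "aa"],
                    "col3": ["cc", "ee"], "col4": ["d", "e"]})
df2 = pd.DataFrame({"col1": ["a", "e"], "col2": ["ee", "gg"],
                    "col3": ["ff", "hh"], "col4": ["d", "k"]})

# Inner merge keeps only rows whose (col1, col4) pair exists in both frames;
# df2's col2/col3 come back as col2_/col3_
merged = df1.merge(df2, on=["col1", "col4"], suffixes=("", "_"))

# Keep rows where col2 or col3 differs between the two frames
differs = (merged["col2"] != merged["col2_"]) | (merged["col3"] != merged["col3_"])
merged = merged[differs]

# Split the wide result back into the df1 side and the df2 side
left = merged[["col1", "col2", "col3", "col4"]]
right = (merged[["col1", "col2_", "col3_", "col4"]]
         .rename(columns={"col2_": "col2", "col3_": "col3"}))

# Stack both sides into one frame with a fresh index
result = pd.concat([left, right], ignore_index=True)
print(result)
```

Unlike the duplicated-based approach, this only pairs rows across the two dataframes, so repeated keys inside a single frame do not count as matches on their own.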
