简体   繁体   中英

Find common elements in two dataframes

I have two dataframes df1 and df2 having the same columns. I would like to find the elements within a column that are in common. For example

df1 :

Col1 Col2 Nam1 Nam2 Net
AD    AS   AS   ADS AB
BF    SA   WQ   AFW AF
RW    KJ   IQ   QIE LK

df2 :

Col1 Col2 Nam1 Nam2 Net
RW    WQ   HF   HGJ AB
BF    AS   DD   VCC LJ
RW    KJ   IQ   ADS JH
DS    QW   LJ   NB  LK

I would like to have the following (on Net)

Col1 Col2 Nam1 Nam2 Net
AD    AS   AS   ADS AB
RW    WQ   HF   HGJ AB
RW    KJ   IQ   QIE LK
DS    QW   LJ   NB  LK

I have tried as follows:

df=pd.merge(df1, df2, on='Net', how='inner')

but it duplicates all the columns' name (_x and _y) and also it seems not extracting only the elements in common.

IIUC, you just want those rows which share the same net value? You can start with,

vals =  set(df1['Net']).intersection(df2['Net']) 
print (vals)
# {'AB', 'LK'}

Now, filter out those values and concatenate:

pd.concat([
    df1.query('Net in @vals'),   
    df2.query('Net in @vals')], ignore_index=True)

  Col1 Col2 Nam1 Nam2 Net
0   AD   AS   AS  ADS  AB
1   RW   KJ   IQ  QIE  LK
2   RW   WQ   HF  HGJ  AB
3   DS   QW   LJ   NB  LK

One explicit way to do it would be:

  1. Obtain set of common items
common_items = set(df1['Net']) & set(df2['Net'])
  1. Get relevant rows from dataframes
df1_common = df1[df1['Net'].isin(common_items)]

Col1 Col2 Nam1 Nam2 Net
AD    AS   AS   ADS AB
RW    KJ   IQ   QIE LK
df2_common = df2[df2['Net'].isin(common_items)] 

Col1 Col2 Nam1 Nam2 Net
RW    WQ   HF   HGJ AB
DS    QW   LJ   NB  LK
  1. Concatenate the two dataframes
pd.concat([df1_common, df2_common])
    
Col1 Col2 Nam1 Nam2 Net
AD    AS   AS   ADS AB
RW    WQ   HF   HGJ AB
RW    KJ   IQ   QIE LK
DS    QW   LJ   NB  LK

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM