
Pandas: drop rows in one dataframe that share a common column value with rows in another dataframe

I have a data frame called x1:

 FID  g1    g2    g3
  0   19    20    13
  1   16    11    14
  2   15    20    11

and a data frame called x2:

 FID  g1   
  1   16  

I want to alter x1 so that it excludes the row whose FID appears in x2:

 FID  g1    g2    g3
  0   19    20    13
  2   15    20    11

I've tried:

x1 = pd.concat([x1,x2]).drop_duplicates(keep=False)

but I think this only works if the data frames have matching schemas, since drop_duplicates only removes rows that are identical across every column. Can I instead just keep the rows in x1 whose FID values don't appear in x2?

You can use pd.Series.isin to create a boolean series indicating which values in the FID column of x1 appear in the FID column of x2.

Then use pd.DataFrame.loc with the ~ operator to invert that boolean series and select the rows of x1 whose FID values don't appear in the FID column of x2:

# True where a row's FID also appears in x2
cond = x1.FID.isin(x2.FID)
# invert the mask to keep only the rows whose FID is absent from x2
x1.loc[~cond]
# output:
   FID  g1  g2  g3
0    0  19  20  13
2    2  15  20  11
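If the frames are larger, an anti-join via merge expresses the same filter; a minimal sketch, assuming only the FID column should decide which rows match:

# left-join on FID and keep only the rows of x1 that found no match in x2
merged = x1.merge(x2[['FID']], on='FID', how='left', indicator=True)
merged.loc[merged['_merge'] == 'left_only', x1.columns]

This returns the same two rows as the isin approach.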
x1.drop(x2['FID'])

output

    FID  g1  g2  g3
 0    0  19  20  13
 2    2  15  20  11
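Note that x1.drop(x2['FID']) drops by index label, so it only gives this result because x1's default integer index happens to coincide with its FID values. A sketch of a more explicit variant, assuming the FID values are unique:

# move FID into the index, drop the labels listed in x2, then restore the column
x1.set_index('FID').drop(x2['FID']).reset_index()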
