Subtract the intersection of two data frames

Let's say I have two dataframes:

The first is a large one (2400+ values):

101  102  103  104   [index value]
"A"  "B"  "C"  "D"   [another string] 
"1"  "1"  "1"  "1"   [another string] 
"2"  "2"  "2"  "2"   [another string] 

and a second dataframe of disqualified values that I would like to remove from the first dataframe. It may contain some values that are not present in the first dataframe:

101 104 205  [index value]
"A" "D" "Q"  [another string] 
"1" "1" "2"  [another string] 
"2" "2" "1"  [another string] 

How would I take the intersection of these two (the entries that match) and remove it from the first dataframe? In this example I would want to end up with:

102  103   [index value]
"B"  "C"   [another string] 
"1"  "1"   [another string] 
"2"  "2"   [another string] 

Assuming that you have a dataframe df with a column index_column containing this index value, and a disqualified dataframe df_dsq with a column of the same name:

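# Collect the disqualified index values as a plain list
dsq = df_dsq['index_column'].to_list()
# Keep only the rows of df whose index value is NOT disqualified
df_clean = df.loc[~df['index_column'].isin(dsq), :].copy()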
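
For illustration, here is a minimal runnable sketch of that approach on the example data from the question. The column names a, b and c are assumptions made up for this example (the question does not name them), and the layout with one row per index value is also assumed. The last part shows a variant for the case where the index value is the actual DataFrame index rather than a column.

```python
import pandas as pd

# Example data from the question; column names "a", "b", "c" are hypothetical.
df = pd.DataFrame({
    "index_column": [101, 102, 103, 104],
    "a": ["A", "B", "C", "D"],
    "b": ["1", "1", "1", "1"],
    "c": ["2", "2", "2", "2"],
})
df_dsq = pd.DataFrame({
    "index_column": [101, 104, 205],  # 205 does not occur in df
    "a": ["A", "D", "Q"],
    "b": ["1", "1", "2"],
    "c": ["2", "2", "1"],
})

# Keep the rows whose key is NOT among the disqualified keys.
dsq = df_dsq["index_column"].to_list()
df_clean = df.loc[~df["index_column"].isin(dsq), :].copy()

# Variant: the key is the DataFrame index instead of a regular column.
df_by_index = df.set_index("index_column")
dsq_by_index = df_dsq.set_index("index_column")
df_clean_by_index = df_by_index.loc[
    ~df_by_index.index.isin(dsq_by_index.index)
].copy()

print(df_clean)
```

Disqualified values that never occur in df (205 here) are simply ignored, because isin only flags rows that are actually present.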
