Subtract the union of two data frames

Question

Let's say I have two dataframes:

The first is a large list (2400+ values):

101  102  103  104   [index value]
"A"  "B"  "C"  "D"   [another string] 
"1"  "1"  "1"  "1"   [another string] 
"2"  "2"  "2"  "2"   [another string]

The second is a dataframe of disqualified values that I would like to remove from the first; it may contain some values that do not appear in the first dataframe:

101 104 205  [index value]
"A" "D" "Q"  [another string] 
"1" "1" "2"  [another string] 
"2" "2" "1"  [another string]

How would I take the union of these two (that is, the values that match, which is strictly their intersection) and remove them from the first dataframe? In this example I would want to end up with:

102  103   [index value]
"B"  "C"   [another string] 
"1"  "1"   [another string] 
"2"  "2"   [another string]

Answer 1

Assuming df has a column named index_column that holds these index values, and the disqualified dataframe (df_dsq) has a column of the same name:

# Collect the disqualified IDs, then keep only the rows of df whose ID is not among them.
dsq = df_dsq['index_column'].to_list()
df_clean = df.loc[~df['index_column'].isin(dsq), :].copy()
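
For reference, here is a minimal self-contained sketch of the same idea using the sample values from the question. The column names (index_column, letter, x, y) are assumptions made for illustration, since the question does not name them. It also shows a variant for the case where the identifiers live in the DataFrame index rather than in a regular column. Disqualified IDs that are absent from the first dataframe (205 here) are simply ignored by isin.

```python
import pandas as pd

# Hypothetical sample data mirroring the question; the column names
# "index_column", "letter", "x" and "y" are assumed for illustration.
df = pd.DataFrame({
    "index_column": [101, 102, 103, 104],
    "letter": ["A", "B", "C", "D"],
    "x": ["1", "1", "1", "1"],
    "y": ["2", "2", "2", "2"],
})
df_dsq = pd.DataFrame({
    "index_column": [101, 104, 205],
    "letter": ["A", "D", "Q"],
    "x": ["1", "1", "2"],
    "y": ["2", "2", "1"],
})

# Case 1: the identifier is an ordinary column.
# mask is True for rows of df whose ID also appears in df_dsq.
mask = df["index_column"].isin(df_dsq["index_column"])
df_clean = df.loc[~mask].copy()

# Case 2: the identifier is the DataFrame index.
df_i = df.set_index("index_column")
dsq_i = df_dsq.set_index("index_column")
df_clean_i = df_i.loc[~df_i.index.isin(dsq_i.index)].copy()

print(df_clean["index_column"].tolist())  # [102, 103]
print(df_clean_i.index.tolist())          # [102, 103]
```

The trailing .copy() makes the result an independent DataFrame, so later assignments to df_clean should not trigger a SettingWithCopyWarning.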

1 answer

solution1
0 ACCEPTED 2017-03-16 14:25:14
