Find the difference (set difference) between two dataframes in Python

Question

I have two dataframes: df1 and df2. I want to eliminate all occurrences of df2 rows in df1. Basically, this is the set difference operator but for dataframes.

My question is very similar to this question, with one major difference: df1 and df2 may have no rows in common at all. In that case, concatenating the two dataframes and then dropping the duplicates doesn't eliminate the df2 rows from df1. In fact, it adds them.

The question is also similar to this one, except that I want the operation to work on rows.

Example:

Case 1:
df1:
A,B,C,D
E,F,G,H

df2:
E,F,G,H

Then, df1-df2:
A,B,C,D

Case 2:
df1:
A,B,C,D

df2:
E,F,G,H

Then, df1 - df2:
A,B,C,D

Simply put, I am looking for a way to compute df1 - df2 (remove the rows of df2 from df1 if they are present). How should this be done?

Answer 1

Try:

df1[~df1.isin(df2)]

A,B,C,D
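
One caveat worth checking before relying on the snippet above: when DataFrame.isin is given another DataFrame, it compares values at matching index and column labels rather than whole rows, so a df2 row stored under a different index than its twin in df1 is not detected. A minimal sketch of a row-wise alternative is a left merge with indicator=True. The helper name row_difference and the column names c1..c4 are assumptions made for illustration, and both frames are assumed to share the same columns.

```python
import pandas as pd


def row_difference(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `left` that do not appear in `right` (hypothetical helper)."""
    # Deduplicate `right` so the merge cannot multiply rows of `left`.
    merged = left.merge(right.drop_duplicates(), how="left", indicator=True)
    # The "_merge" column is "left_only" for rows that found no match in `right`.
    only_left = merged["_merge"] == "left_only"
    return merged.loc[only_left, list(left.columns)].reset_index(drop=True)


cols = ["c1", "c2", "c3", "c4"]  # assumed column names

# Case 1 from the question: df2's row is present in df1.
df1 = pd.DataFrame([list("ABCD"), list("EFGH")], columns=cols)
df2 = pd.DataFrame([list("EFGH")], columns=cols)
case1 = row_difference(df1, df2)

# Case 2 from the question: no rows in common.
df3 = pd.DataFrame([list("ABCD")], columns=cols)
case2 = row_difference(df3, df2)

print(case1)
print(case2)
```

Both cases should leave only the A,B,C,D row. Note that duplicate rows in df1 that are absent from df2 are kept, which differs from a strict mathematical set difference.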

Answer 2

Set difference will work here; np.setdiff1d returns the unique values in ar1 that are not in ar2.

np.setdiff1d(df1, df2)

Or, to get the result in the form of a DataFrame:

pd.DataFrame([np.setdiff1d(df1, df2)])
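
It may help to see that np.setdiff1d works on the flattened values of both inputs, not on whole rows. The sketch below uses a made-up df2 (E,F,G,X) that shares some values with df1 but is not one of its rows, and contrasts the element-wise result with a row-wise filter built from plain tuples. The variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([list("ABCD"), list("EFGH")])
# Hypothetical df2: shares E, F, G with df1, but E,F,G,X is not a row of df1.
df2 = pd.DataFrame([list("EFGX")])

# Element-wise: both frames are flattened, so E, F and G are removed
# even though no complete row of df2 occurs in df1.
elementwise = np.setdiff1d(df1, df2)

# Row-wise: keep each df1 row whose tuple of values is not a row of df2.
rows_to_remove = set(map(tuple, df2.to_numpy()))
keep = [tuple(row) not in rows_to_remove for row in df1.to_numpy()]
rowwise = df1[keep]

print(elementwise)
print(rowwise)
```

With the example data from the question the two approaches happen to agree, because the rows there either match completely or share no values at all.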

solution1
4 ACCEPTED 2019-01-25 19:44:42

solution2
4 2019-01-25 19:59:30