I have two dataframes, and df2 has more columns than df1.
If a row in df1 does not appear in df2, I want to select it into df3.
df1
id colA colB
0 1 4 1
1 2 5 2
2 3 2 4
3 4 4 2
4 5 2 4
df2
id colA colB colC
0 1 4 1 0
1 2 5 2 0
2 5 2 4 0
I want to select the rows of df1 that are not in df2:
df3
id colA colB
0 3 2 4
1 4 4 2
Assuming you are comparing on the 'id' column (if not, please clarify), you can use Series.isin with boolean indexing.
>>> df3 = df1[~df1['id'].isin(df2['id'])]
>>> df3
id colA colB
2 3 2 4
3 4 4 2
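If the comparison should instead cover every column the two frames share, rather than 'id' alone, one possible approach is a left merge with indicator=True followed by a filter on the '_merge' column. This is only a sketch: it rebuilds the question's sample data and assumes every column of df1 also exists in df2; the `keys` and `merged` names are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (df2 carries the extra colC).
df1 = pd.DataFrame({'id': [1, 2, 3, 4, 5],
                    'colA': [4, 5, 2, 4, 2],
                    'colB': [1, 2, 4, 2, 4]})
df2 = pd.DataFrame({'id': [1, 2, 5],
                    'colA': [4, 5, 2],
                    'colB': [1, 2, 4],
                    'colC': [0, 0, 0]})

# Assumption: compare on all columns of df1, which must also exist in df2.
keys = list(df1.columns)

# A left merge with indicator=True adds a '_merge' column that marks each
# row of df1 as 'both' (found in df2) or 'left_only' (not found).
merged = df1.merge(df2[keys].drop_duplicates(), on=keys,
                   how='left', indicator=True)

# Keep only the rows missing from df2 and drop the helper column.
df3 = merged.loc[merged['_merge'] == 'left_only', keys].reset_index(drop=True)
print(df3)
```

With the sample data this selects the rows with id 3 and 4, the same rows as the isin approach, because the 'id' values alone already decide the match here.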
Use drop_duplicates:
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2, 3, 4, 5],
                    'colA': [4, 5, 2, 4, 2],
                    'colB': [1, 2, 4, 2, 4]})
df2 = pd.DataFrame({'id': [1, 2, 5],
                    'colA': [4, 5, 2],
                    'colB': [1, 2, 4]})

# Stack both frames, then drop every id that occurs more than once.
df3 = pd.concat([df1, df2]).drop_duplicates(subset='id', keep=False)
print(df3)
Output:
id colA colB
2 3 2 4
3 4 4 2
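One thing to watch with the concat approach: keep=False retains every id that occurs exactly once across both frames, so a row that exists only in df2 would also survive. The sketch below uses hypothetical data (an extra id 6 in df2, which is not in the question) to show the difference from the isin filter.

```python
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2, 3], 'colA': [4, 5, 2]})
# Hypothetical df2 containing an id (6) that does not appear in df1.
df2 = pd.DataFrame({'id': [1, 2, 6], 'colA': [4, 5, 9]})

# Symmetric difference: keeps ids that occur in exactly one of the frames.
sym = pd.concat([df1, df2]).drop_duplicates(subset='id', keep=False)

# Rows of df1 whose id is absent from df2.
only_df1 = df1[~df1['id'].isin(df2['id'])]

print(sym['id'].tolist())       # ids 3 and 6
print(only_df1['id'].tolist())  # id 3 only
```

In the question's data every id in df2 also appears in df1, so both approaches give the same result there.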
df3 = df1.loc[~df1['id'].isin(df2['id'])]
Output:
id colA colB
2 3 2 4
3 4 4 2