How do I remove all rows whose dates also appear in another dataframe? I want to keep only the rows whose dates are unique across the two dataframes, with all columns. Also, I cannot use a merge.
import pandas as pd
from datetime import timedelta
df1 = pd.DataFrame({
'date': ['2001-02-01','2001-02-02','2001-02-03', '2001-02-04'],
'value': [101, 201, 310, 410]})
df2 = pd.DataFrame({
'date': ['2001-02-03','2001-02-04','2001-02-05', '2001-02-06'],
'value': [121, 231, 610, 990]})
df1['date'] = pd.to_datetime(df1['date'])
df2['date'] = pd.to_datetime(df2['date'])
Returns:
date value
0 2001-02-01 101
1 2001-02-02 201
2 2001-02-03 310
3 2001-02-04 410
---
date value
0 2001-02-03 121
1 2001-02-04 231
2 2001-02-05 610
3 2001-02-06 990
Desired dataframe:
print(df3)
date value
0 2001-02-01 101
1 2001-02-02 201
2 2001-02-05 610
3 2001-02-06 990
I tried df1[~df1.date.notin(df2.date)], but this throws an error: AttributeError: 'Series' object has no attribute 'notin'
I also tried df1[~df1.date.isin(df2.date) == False], and this returns:
date value
2 2001-02-03 310
3 2001-02-04 410
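Concatenate the two dataframes, then drop every date that occurs more than once. With keep=False, all copies of a duplicated date are removed, so this also drops any date that is repeated within a single dataframe: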
df3 = pd.concat([df1, df2]).drop_duplicates(subset='date', keep=False)
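If a date can repeat inside one dataframe and those rows should survive, an alternative is to filter each dataframe against the other with isin and then concatenate. The following is only a sketch under that assumption; the names only_df1 and only_df2 are made up for illustration, and the sample data is the one from the question.

```python
import pandas as pd

df1 = pd.DataFrame({
    'date': pd.to_datetime(['2001-02-01', '2001-02-02', '2001-02-03', '2001-02-04']),
    'value': [101, 201, 310, 410]})
df2 = pd.DataFrame({
    'date': pd.to_datetime(['2001-02-03', '2001-02-04', '2001-02-05', '2001-02-06']),
    'value': [121, 231, 610, 990]})

# Keep the rows of each frame whose date does not occur in the other frame.
# ~ negates the boolean mask returned by isin (there is no Series.notin).
only_df1 = df1[~df1['date'].isin(df2['date'])]
only_df2 = df2[~df2['date'].isin(df1['date'])]

# Stack the two filtered frames and renumber the index from 0.
df3 = pd.concat([only_df1, only_df2], ignore_index=True)
print(df3)
```

This avoids merge as well. Unlike drop_duplicates with keep=False, it only compares dates across the two dataframes, so a date repeated within df2 alone would be kept.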