Hi, I have a DataFrame like the one below:
ID date
1 01.01.2017
1 01.01.2017
1 01.04.2017
2 01.01.2017
2 01.01.2017
2 01.02.2017
I want to keep only the IDs for which the difference between the maximum and minimum date is exactly 3 days. Since only ID 1 meets the condition, the final DataFrame should look like this:
ID date
1 01.01.2017
1 01.01.2017
1 01.04.2017
Thank you.
You can create a mask and then use it as a filter:
import pandas as pd
# create sample data-frame
data = [[1, '01.01.2017'], [1, '01.01.2017'], [1, '01.04.2017'],
[2, '01.01.2017'], [2, '01.01.2017'], [2, '01.02.2017']]
df = pd.DataFrame(data=data, columns=['id', 'date'])
# parse as month.day.year, so '01.04.2017' is 4 January 2017
df['date'] = pd.to_datetime(df['date'], format='%m.%d.%Y')
# create mask
mask = df.groupby('id')['date'].transform(lambda x: (x.max() - x.min()).days == 3)
# filter
result = df[mask]
print(result)
Output:
id date
0 1 2017-01-01
1 1 2017-01-01
2 1 2017-01-04
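The same mask can also be built without a Python-level lambda, by transforming each group to its maximum and minimum and subtracting. This is only a sketch of that variant: it assumes the date strings are month.day.year (which is what makes the span for id 1 equal to 3 days), and the names `dates` and `span` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "date": ["01.01.2017", "01.01.2017", "01.04.2017",
             "01.01.2017", "01.01.2017", "01.02.2017"],
})
# Assumption: strings are month.day.year, so '01.04.2017' is 4 January 2017
df["date"] = pd.to_datetime(df["date"], format="%m.%d.%Y")

# Broadcast each group's max and min back onto its rows, then subtract
dates = df.groupby("id")["date"]
span = dates.transform("max") - dates.transform("min")

# Keep the rows whose group spans exactly 3 days
result = df[span == pd.Timedelta(days=3)]
print(result)
```

Comparing against `pd.Timedelta(days=3)` keeps the check exact; a span of 3 days and 1 hour would not match, whereas `.days == 3` would accept it.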
You can use GroupBy.filter with a custom lambda function to check whether the difference between the maximum and minimum date is 3 days:
import datetime
# 3-day span to compare against; 'date' must already be datetime64
d = datetime.timedelta(days=3)
df.groupby('ID').date.filter(lambda x: (x.max() - x.min()) == d)
ID
1 2017-01-01
1 2017-01-01
1 2017-01-04
Name: date, dtype: datetime64[ns]
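Filtering on the `date` column returns only that Series, while the question asks for the whole DataFrame. One way to get that, sketched here under the assumption that `ID` is an ordinary column and the dates are month.day.year, is to call `filter` on the grouped DataFrame instead; `three_days` and `kept` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2],
    # Assumption: strings are month.day.year
    "date": pd.to_datetime(
        ["01.01.2017", "01.01.2017", "01.04.2017",
         "01.01.2017", "01.01.2017", "01.02.2017"],
        format="%m.%d.%Y",
    ),
})

three_days = pd.Timedelta(days=3)

# The lambda receives each group's sub-DataFrame and must return one boolean;
# groups for which it returns True are kept with all their columns.
kept = df.groupby("ID").filter(
    lambda g: g["date"].max() - g["date"].min() == three_days
)
print(kept)
```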