The code below groups the dataframe by a key.
df = pd.DataFrame(data, columns=['id', 'date', 'cnt'])
df['date']= pd.to_datetime(df['date'])
for c_id, group in df.groupby('id'):
print(c_id)
print(group)
This produces a result like this:
id date cnt
1 2019-01-02 1
1 2019-01-03 2
1 2019-01-04 3
1 2019-01-05 1
1 2019-01-06 2
1 2019-01-07 1
id date cnt
2 2019-01-01 478964
2 2019-01-02 749249
2 2019-01-03 1144842
2 2019-01-04 1540846
2 2019-01-05 1444918
2 2019-01-06 1624770
2 2019-01-07 2227589
id date cnt
3 2019-01-01 41776
3 2019-01-02 82322
3 2019-01-03 93467
3 2019-01-04 56674
3 2019-01-05 47606
3 2019-01-06 41448
3 2019-01-07 145827
id date cnt
4 2019-01-01 41776
4 2019-01-02 82322
4 2019-01-06 93467
4 2019-01-07 56674
From this result, I want to find the maximum consecutive number of days for each id. So id 1 would be 6, id 2 would be 7, id 3 would be 7, and id 4 would be 2.
Use:
m = (df.assign(date=pd.to_datetime(df['date'])) #if necessary convert else drop
.groupby('id')['date']
.diff()
.gt(pd.Timedelta('1D'))
.cumsum())
df.groupby(['id', m]).size().max(level='id')
Output
id
1 6
2 7
3 7
4 2
dtype: int64
To get your result, run:
result = df.groupby('id').apply(lambda grp: grp.groupby(
(grp.date.shift() + pd.Timedelta(1, 'd') != grp.date).cumsum())
.id.count().max())
Details:
df.groupby('id')
- First level grouping (by id ). grp.groupby(...)
- Second level grouping (by sequences of consecutive dates. grp.date.shift()
- Date from the previous row. + pd.Timedelta(1, 'd')
- Shifted by 1 day. != grp.date
- Not equal to the current date. The result is a Series with True on the start of each sequence of consecutive dates.cumsum()
- Convert the above ( bool ) Series to a Series of int - consecutive numbers of above sequences, starting from 1. id
- Take id column from each (second level) group. count()
- Compute the size of the current group. .max()
- Take max from sizes of second level groups (within the current level 1 group).
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.