I have the following DataFrame df:
id datetime_event cameraid platenumber
11 2017-05-01T00:00:08 AAA 11A
12 2017-05-01T00:00:08 AAA 223
13 2017-05-01T00:00:08 BBB 11A
14 2017-05-01T00:00:09 BBB 33D
15 2017-05-01T00:00:09 DDD 44F
16 2017-05-01T01:01:00 AAA 44F
17 2017-05-01T01:01:01 BBB 44F
18 2017-05-01T01:01:09 AAA 556
19 2017-05-01T01:01:09 AAA 778
20 2017-05-01T01:01:11 EEE 666
For each hour of each day, I want to select up to 100 entries whose cameraid is in (AAA, BBB) and whose platenumber appears sequentially, first in AAA and then in BBB.
For the example DataFrame above, the expected output is:
id datetime_event cameraid platenumber
11 2017-05-01T00:00:08 AAA 11A
13 2017-05-01T00:00:08 BBB 11A
16 2017-05-01T01:01:00 AAA 44F
17 2017-05-01T01:01:01 BBB 44F
The first 100 entries for each hour of each day can be extracted in the following way:
df = df[df.groupby(pd.to_datetime(df['datetime_event']).dt.floor('H')).cumcount() < 100]
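To see why the cumcount comparison caps each hour, here is a minimal sketch on a made-up frame. The toy data and the limit of 2 (standing in for 100) are assumptions for illustration only, and the lowercase 'h' frequency alias is used because newer pandas versions deprecate 'H'.

```python
import pandas as pd

# Hypothetical toy data: three events in hour 00, one event in hour 01.
toy = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'datetime_event': ['2017-05-01T00:00:08', '2017-05-01T00:10:00',
                       '2017-05-01T00:59:59', '2017-05-01T01:01:00'],
})

# Bucket each event by the hour it falls in.
hour = pd.to_datetime(toy['datetime_event']).dt.floor('h')

# cumcount() numbers the rows inside each hour as 0, 1, 2, ...
# so "< limit" keeps only the first `limit` rows of every hour.
limit = 2  # stands in for 100
capped = toy[toy.groupby(hour).cumcount() < limit]

print(capped['id'].tolist())  # [1, 2, 4]
```

The third event of hour 00 is dropped because its position within the hour is 2, which is not below the limit.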
However, how can I filter by cameraid and, most importantly, match on platenumber so that the same platenumber value appears consecutively, first in AAA and then in BBB?
Use filter:
EDIT:
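# first keep only AAA and BBB rows to reduce the data
df = df[df['cameraid'].isin(['AAA','BBB'])]
# keep only (hour, platenumber) groups whose cameraid sequence is exactly AAA, BBB
# (comparing plain lists also works for groups that do not have exactly 2 rows)
df1 = (df.groupby([pd.to_datetime(df['datetime_event']).dt.floor('H'),'platenumber'])
         .filter(lambda x: x['cameraid'].tolist() == ['AAA','BBB']))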
print (df1)
   id       datetime_event cameraid platenumber
0  11  2017-05-01T00:00:08      AAA         11A
2  13  2017-05-01T00:00:08      BBB         11A
5  16  2017-05-01T01:01:00      AAA         44F
6  17  2017-05-01T01:01:01      BBB         44F
Old solution:
# first keep only AAA and BBB rows to reduce the data
df = df[df['cameraid'].isin(['AAA','BBB'])]

# keep only groups of size 2 whose first value is AAA and second is BBB
def f(x):
    return (len(x) == 2 and
            x['cameraid'].iat[0] == 'AAA' and
            x['cameraid'].iat[1] == 'BBB')

df = df.groupby([pd.to_datetime(df['datetime_event']).dt.floor('H'),'platenumber']).filter(f)
print (df)
   id       datetime_event cameraid platenumber
0  11  2017-05-01T00:00:08      AAA         11A
2  13  2017-05-01T00:00:08      BBB         11A
5  16  2017-05-01T01:01:00      AAA         44F
6  17  2017-05-01T01:01:01      BBB         44F
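Neither snippet applies the limit of 100 entries per hour from the question. One possible way to combine the pair filter with that cap is sketched below on the question's sample data. The helper name match_pairs and its limit parameter are hypothetical, the rows are assumed to already be in chronological order, and the cap is assumed to count rows (not pairs) per hour.

```python
import pandas as pd

# Sample data copied from the question.
rows = [
    (11, '2017-05-01T00:00:08', 'AAA', '11A'),
    (12, '2017-05-01T00:00:08', 'AAA', '223'),
    (13, '2017-05-01T00:00:08', 'BBB', '11A'),
    (14, '2017-05-01T00:00:09', 'BBB', '33D'),
    (15, '2017-05-01T00:00:09', 'DDD', '44F'),
    (16, '2017-05-01T01:01:00', 'AAA', '44F'),
    (17, '2017-05-01T01:01:01', 'BBB', '44F'),
    (18, '2017-05-01T01:01:09', 'AAA', '556'),
    (19, '2017-05-01T01:01:09', 'AAA', '778'),
    (20, '2017-05-01T01:01:11', 'EEE', '666'),
]
df = pd.DataFrame(rows, columns=['id', 'datetime_event', 'cameraid', 'platenumber'])


def match_pairs(df, limit=100):
    """Hypothetical helper: AAA-then-BBB pairs per hour, capped at `limit` rows per hour."""
    # Keep only the two cameras of interest.
    sub = df[df['cameraid'].isin(['AAA', 'BBB'])]
    # Hour bucket for every remaining row ('h' avoids the deprecated 'H' alias).
    hour = pd.to_datetime(sub['datetime_event']).dt.floor('h').rename('hour')
    # Keep (hour, platenumber) groups seen exactly as AAA followed by BBB.
    pairs = sub.groupby([hour, 'platenumber']).filter(
        lambda g: g['cameraid'].tolist() == ['AAA', 'BBB'])
    # Cap the surviving rows at `limit` per hour.
    pair_hour = pd.to_datetime(pairs['datetime_event']).dt.floor('h')
    return pairs[pairs.groupby(pair_hour).cumcount() < limit]


result = match_pairs(df)
print(result['id'].tolist())  # [11, 13, 16, 17]
```

Applying the cap after the pair filter means unmatched rows do not use up the per-hour quota. An odd limit would cut a pair in half, so an even value is the safer choice under this row-based assumption.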