How to check consecutive difference in rows in pandas
I have the following dataframe in pandas:
code car_no date time error_code
123 6 2019-01-01 06:23:00 E09
123 6 2019-01-01 06:25:00 E28
123 6 2019-01-01 06:28:00 E09
123 6 2019-01-01 22:00:00 E28
123 7 2019-01-01 08:23:00 E09
123 6 2019-01-01 09:23:00 E09
123 7 2019-01-01 08:28:00 E28
What I want to flag: for a specific code and car_no on the same date, if E09 comes first and E28 comes later with less than 2 hours difference, then the flag should be set.
My desired dataframe is as follows:
code car_no date time error_code flag
123 6 2019-01-01 06:23:00 E09 1
123 6 2019-01-01 06:25:00 E28 1
123 6 2019-01-01 06:28:00 E09 0
123 6 2019-01-01 22:00:00 E28 0
123 7 2019-01-01 08:23:00 E09 1
123 6 2019-01-01 09:23:00 E09 0
123 7 2019-01-01 08:28:00 E28 0
How can I do it in pandas?
Write down your conditions and evaluate them within groupby, then we just need to assign the result back:
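# if time is stored as strings, convert it to timedelta first:
# df.time = pd.to_timedelta(df.time)
# per (date, car_no) group: an E28 row right after an E09 row, less than 2 hours later
s = df.groupby(['date', 'car_no']).\
apply(lambda x: x.error_code.eq('E28') & x.error_code.shift().eq('E09') & x.time.diff().dt.seconds.lt(60*60*2))
# also flag the E09 row preceding each match, then drop the group levels to realign with df
s = (s | s.groupby(level=[0, 1]).shift(-1)).reset_index(level=[0, 1], drop=True)
df['flag'] = s
df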
Out[126]:
code car_no date time error_code flag
0 123 6 2019-01-01 06:23:00 E09 True
1 123 6 2019-01-01 06:25:00 E28 True
2 123 6 2019-01-01 06:28:00 E09 False
3 123 6 2019-01-01 22:00:00 E28 False
4 123 7 2019-01-01 08:23:00 E09 True
5 123 6 2019-01-01 09:23:00 E09 False
6 123 7 2019-01-01 08:28:00 E28 True
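For anyone who wants to run this end to end, here is a self-contained sketch of the same idea that avoids apply by using grouped shift instead. It is a variation, not the exact code above: the sample frame is rebuilt from the question, code is added to the group keys because the question mentions it, and the non-negative gap check is an extra assumption to ignore rows that are out of time order.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "code": [123] * 7,
    "car_no": [6, 6, 6, 6, 7, 6, 7],
    "date": ["2019-01-01"] * 7,
    "time": ["06:23:00", "06:25:00", "06:28:00", "22:00:00",
             "08:23:00", "09:23:00", "08:28:00"],
    "error_code": ["E09", "E28", "E09", "E28", "E09", "E09", "E28"],
})

# Convert the time strings to timedeltas so gaps can be computed.
t = pd.to_timedelta(df["time"])

# Group keys; including code is an assumption based on the question text.
keys = [df["code"], df["car_no"], df["date"]]

# Previous row's error code and the time gap to it, both within each group.
prev_code = df["error_code"].groupby(keys).shift()
gap = t - t.groupby(keys).shift()

# An E28 row directly after an E09 row in its group, less than 2 hours later.
# The gap >= 0 check is an added assumption to skip out-of-order rows.
is_e28 = (
    df["error_code"].eq("E28")
    & prev_code.eq("E09")
    & gap.ge(pd.Timedelta(0))
    & gap.lt(pd.Timedelta(hours=2))
)

# The matching E09 row is the one right before a flagged E28 row in the same group.
is_e09 = is_e28.groupby(keys).shift(-1, fill_value=False).astype(bool)

# Store the flag as 0/1, as in the question's desired output.
df["flag"] = (is_e28 | is_e09).astype(int)
print(df)
```

On the sample data this flags the same rows as the output above (rows 0, 1, 4 and 6), including the E28 row for car 7, which the desired table in the question lists as 0. If the rows of a group are not in time order, sorting by time before grouping is probably the safer choice.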