
Pandas: check if date from one dataframe is between two dates from another dataframe and substitute values

I have 2 dataframes:

df1:

   date               event    group    failure
2018-04-19 02:07:00     1       E1         0
2018-04-19 02:07:00     2       E2         1

df2:

        start_time                   end_time           group      failure
2018-04-01 00:00:00+01:00   2018-04-01 23:59:59+01:00     E1         1
2018-04-27 19:00:00+01:00   2018-04-27 21:29:59+01:00     E1         1
2018-04-27 06:00:00+01:00   2018-04-27 12:59:59+01:00     E1         1
2018-04-26 19:00:00+01:00   2018-04-26 21:29:59+01:00     E1         1
2018-04-26 06:00:00+01:00   2018-04-26 12:59:59+01:00     E1         1
2018-04-25 19:00:00+01:00   2018-04-25 21:29:59+01:00     E1         1
2018-04-25 06:00:00+01:00   2018-04-25 12:59:59+01:00     E1         1
2018-04-24 19:00:00+01:00   2018-04-24 21:29:59+01:00     E1         1
2018-04-24 06:00:00+01:00   2018-04-24 12:59:59+01:00     E1         1
2018-04-23 19:00:00+01:00   2018-04-23 21:29:59+01:00     E1         1
2018-04-23 06:00:00+01:00   2018-04-23 12:59:59+01:00     E1         1
2018-04-16 00:00:00+01:00   2018-04-22 23:59:59+01:00     E1         1
2018-04-28 00:00:00+01:00   2018-04-29 23:59:59+01:00     E1         1
2018-04-07 00:00:00+01:00   2018-04-08 23:59:59+01:00     E1         1
2018-04-06 19:00:00+01:00   2018-04-06 21:29:59+01:00     E1         1
2018-04-06 06:00:00+01:00   2018-04-06 12:59:59+01:00     E1         1
2018-04-09 00:00:00+01:00   2018-04-15 23:59:59+01:00     E1         1
2018-04-05 19:00:00+01:00   2018-04-05 21:29:59+01:00     E1         1
2018-04-04 06:00:00+01:00   2018-04-04 12:59:59+01:00     E1         1
2018-04-03 06:00:00+01:00   2018-04-03 12:59:59+01:00     E1         1
2018-04-02 00:00:00+01:00   2018-04-02 23:59:59+01:00     E1         1
2018-04-04 19:00:00+01:00   2018-04-04 21:29:59+01:00     E1         1
2018-04-05 06:00:00+01:00   2018-04-05 12:59:59+01:00     E1         1
2018-04-03 19:00:00+01:00   2018-04-03 21:29:59+01:00     E1         1
2018-04-27 06:00:00+01:00   2018-04-27 12:59:59+01:00     E2         1
2018-04-02 00:00:00+01:00   2018-04-02 23:59:59+01:00     E2         1
2018-04-26 19:00:00+01:00   2018-04-26 21:29:59+01:00     E2         1
2018-04-25 06:00:00+01:00   2018-04-25 12:59:59+01:00     E2         1
2018-04-03 06:00:00+01:00   2018-04-03 12:59:59+01:00     E2         1
2018-04-26 06:00:00+01:00   2018-04-26 12:59:59+01:00     E2         1
2018-04-27 19:00:00+01:00   2018-04-27 21:29:59+01:00     E2         1
2018-04-01 00:00:00+01:00   2018-04-01 23:59:59+01:00     E2         1
2018-04-25 19:00:00+01:00   2018-04-25 21:29:59+01:00     E2         1
2018-04-03 19:00:00+01:00   2018-04-03 21:29:59+01:00     E2         1
2018-04-24 19:00:00+01:00   2018-04-24 21:29:59+01:00     E2         1
2018-04-04 06:00:00+01:00   2018-04-04 12:59:59+01:00     E2         1
2018-04-24 06:00:00+01:00   2018-04-24 12:59:59+01:00     E2         1
2018-04-23 19:00:00+01:00   2018-04-23 21:29:59+01:00     E2         1
2018-04-04 19:00:00+01:00   2018-04-04 21:29:59+01:00     E2         1
2018-04-23 06:00:00+01:00   2018-04-23 12:59:59+01:00     E2         1
2018-04-16 00:00:00+01:00   2018-04-22 23:59:59+01:00     E2         1
2018-04-05 06:00:00+01:00   2018-04-05 12:59:59+01:00     E2         1
2018-04-09 00:00:00+01:00   2018-04-15 23:59:59+01:00     E2         1
2018-04-07 00:00:00+01:00   2018-04-08 23:59:59+01:00     E2         1
2018-04-05 19:00:00+01:00   2018-04-05 21:29:59+01:00     E2         1
2018-04-06 19:00:00+01:00   2018-04-06 21:29:59+01:00     E2         1
2018-04-06 06:00:00+01:00   2018-04-06 12:59:59+01:00     E2         1
2018-04-28 00:00:00+01:00   2018-04-29 23:59:59+01:00     E2         1

I have to check if:

  • df1(date) is between df2(start_time) and df2(end_time)

  • df1(group) = df2(group)

If both hold, replace df2(failure) with df1(failure). The desired outcome looks like:

        start_time                   end_time           group      failure
2018-04-01 00:00:00+01:00   2018-04-01 23:59:59+01:00     E1         1
2018-04-27 19:00:00+01:00   2018-04-27 21:29:59+01:00     E1         1
2018-04-27 06:00:00+01:00   2018-04-27 12:59:59+01:00     E1         1
2018-04-26 19:00:00+01:00   2018-04-26 21:29:59+01:00     E1         1
2018-04-26 06:00:00+01:00   2018-04-26 12:59:59+01:00     E1         1
2018-04-25 19:00:00+01:00   2018-04-25 21:29:59+01:00     E1         1
2018-04-25 06:00:00+01:00   2018-04-25 12:59:59+01:00     E1         1
2018-04-24 19:00:00+01:00   2018-04-24 21:29:59+01:00     E1         1
2018-04-24 06:00:00+01:00   2018-04-24 12:59:59+01:00     E1         1
2018-04-23 19:00:00+01:00   2018-04-23 21:29:59+01:00     E1         1
2018-04-23 06:00:00+01:00   2018-04-23 12:59:59+01:00     E1         1
2018-04-16 00:00:00+01:00   2018-04-22 23:59:59+01:00     E1         0
2018-04-28 00:00:00+01:00   2018-04-29 23:59:59+01:00     E1         1
2018-04-07 00:00:00+01:00   2018-04-08 23:59:59+01:00     E1         1
2018-04-06 19:00:00+01:00   2018-04-06 21:29:59+01:00     E1         1
2018-04-06 06:00:00+01:00   2018-04-06 12:59:59+01:00     E1         1
2018-04-09 00:00:00+01:00   2018-04-15 23:59:59+01:00     E1         1
2018-04-05 19:00:00+01:00   2018-04-05 21:29:59+01:00     E1         1
2018-04-04 06:00:00+01:00   2018-04-04 12:59:59+01:00     E1         1
2018-04-03 06:00:00+01:00   2018-04-03 12:59:59+01:00     E1         1
2018-04-02 00:00:00+01:00   2018-04-02 23:59:59+01:00     E1         1
2018-04-04 19:00:00+01:00   2018-04-04 21:29:59+01:00     E1         1
2018-04-05 06:00:00+01:00   2018-04-05 12:59:59+01:00     E1         1
2018-04-03 19:00:00+01:00   2018-04-03 21:29:59+01:00     E1         1
2018-04-27 06:00:00+01:00   2018-04-27 12:59:59+01:00     E2         1
2018-04-02 00:00:00+01:00   2018-04-02 23:59:59+01:00     E2         1
2018-04-26 19:00:00+01:00   2018-04-26 21:29:59+01:00     E2         1
2018-04-25 06:00:00+01:00   2018-04-25 12:59:59+01:00     E2         1
2018-04-03 06:00:00+01:00   2018-04-03 12:59:59+01:00     E2         1
2018-04-26 06:00:00+01:00   2018-04-26 12:59:59+01:00     E2         1
2018-04-27 19:00:00+01:00   2018-04-27 21:29:59+01:00     E2         1
2018-04-01 00:00:00+01:00   2018-04-01 23:59:59+01:00     E2         1
2018-04-25 19:00:00+01:00   2018-04-25 21:29:59+01:00     E2         1
2018-04-03 19:00:00+01:00   2018-04-03 21:29:59+01:00     E2         1
2018-04-24 19:00:00+01:00   2018-04-24 21:29:59+01:00     E2         1
2018-04-04 06:00:00+01:00   2018-04-04 12:59:59+01:00     E2         1
2018-04-24 06:00:00+01:00   2018-04-24 12:59:59+01:00     E2         1
2018-04-23 19:00:00+01:00   2018-04-23 21:29:59+01:00     E2         1
2018-04-04 19:00:00+01:00   2018-04-04 21:29:59+01:00     E2         1
2018-04-23 06:00:00+01:00   2018-04-23 12:59:59+01:00     E2         1
2018-04-16 00:00:00+01:00   2018-04-22 23:59:59+01:00     E2         1
2018-04-05 06:00:00+01:00   2018-04-05 12:59:59+01:00     E2         1
2018-04-09 00:00:00+01:00   2018-04-15 23:59:59+01:00     E2         1
2018-04-07 00:00:00+01:00   2018-04-08 23:59:59+01:00     E2         1
2018-04-05 19:00:00+01:00   2018-04-05 21:29:59+01:00     E2         1
2018-04-06 19:00:00+01:00   2018-04-06 21:29:59+01:00     E2         1
2018-04-06 06:00:00+01:00   2018-04-06 12:59:59+01:00     E2         1
2018-04-28 00:00:00+01:00   2018-04-29 23:59:59+01:00     E2         1

I have tried using if statements, but I get the error: Can only compare identically-labeled Series objects. Any suggestions? Thank you in advance!

I was able to compare the dates after giving all three datetime columns the same timezone:

import pandas as pd

# e1 is df1 and e2 is df2. tz_localize only accepts naive timestamps;
# if a column already carries an offset such as +01:00, use .dt.tz_convert instead.
e1['date'] = pd.to_datetime(e1['date']).dt.tz_localize('US/Eastern')
e2['start_time'] = pd.to_datetime(e2['start_time']).dt.tz_localize('US/Eastern')
e2['end_time'] = pd.to_datetime(e2['end_time']).dt.tz_localize('US/Eastern')
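
If start_time and end_time are still strings that carry the +01:00 offset shown in the question, localizing them will likely fail because the parsed values are already timezone-aware. A minimal sketch of one alternative is to bring everything to UTC instead. The zone 'Europe/London' for df1's naive dates is an assumption made here for illustration (it matches +01:00 in April); the question does not say which zone those dates are in.

```python
import pandas as pd

# One row from each table in the question, kept as raw strings.
e1 = pd.DataFrame({"date": ["2018-04-19 02:07:00"], "group": ["E1"], "failure": [0]})
e2 = pd.DataFrame({
    "start_time": ["2018-04-16 00:00:00+01:00"],
    "end_time": ["2018-04-22 23:59:59+01:00"],
    "group": ["E1"],
    "failure": [1],
})

# Strings with an explicit offset parse to tz-aware values; utc=True normalizes them to UTC.
e2["start_time"] = pd.to_datetime(e2["start_time"], utc=True)
e2["end_time"] = pd.to_datetime(e2["end_time"], utc=True)

# df1 dates are naive. ASSUMPTION: they are local times in Europe/London.
e1["date"] = (
    pd.to_datetime(e1["date"])
    .dt.tz_localize("Europe/London")  # attach the assumed zone
    .dt.tz_convert("UTC")             # express in UTC like e2
)
```

Once all three columns share a timezone, comparisons such as start_time <= date no longer raise a naive-versus-aware error.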

I then merged both tables and checked whether date falls between start_time and end_time, replacing the failure value where it does.

After the merge, failure_x comes from e2 and failure_y comes from e1:

import numpy as np
# Left-merge on group so every e2 row gets the matching e1 date and failure.
df = e2.merge(e1, on='group', how='left')
df['failure_x'] = np.where((df['start_time'] <= df['date']) & (df['date'] <= df['end_time']), df['failure_y'], df['failure_x'])
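
To see the whole approach end to end, here is a small self-contained sketch. It uses three rows taken from the question's df2 and, for brevity, drops the +01:00 offsets so that all timestamps are naive (a simplification made here, not part of the original data). The final drop/rename step is an addition that restores df2's original column layout.

```python
import numpy as np
import pandas as pd

# Reduced sample of the question's tables (timezone offsets dropped for brevity).
e1 = pd.DataFrame({
    "date": pd.to_datetime(["2018-04-19 02:07:00", "2018-04-19 02:07:00"]),
    "event": [1, 2],
    "group": ["E1", "E2"],
    "failure": [0, 1],
})
e2 = pd.DataFrame({
    "start_time": pd.to_datetime(
        ["2018-04-16 00:00:00", "2018-04-23 06:00:00", "2018-04-16 00:00:00"]),
    "end_time": pd.to_datetime(
        ["2018-04-22 23:59:59", "2018-04-23 12:59:59", "2018-04-22 23:59:59"]),
    "group": ["E1", "E1", "E2"],
    "failure": [1, 1, 1],
})

# Attach the e1 row of the same group to every e2 row.
df = e2.merge(e1, on="group", how="left")

# True where date lies inside [start_time, end_time] (both ends inclusive).
in_window = df["date"].between(df["start_time"], df["end_time"])

# Take e1's failure inside the window, keep e2's failure otherwise.
df["failure_x"] = np.where(in_window, df["failure_y"], df["failure_x"])

# Drop the helper columns and restore the original column name.
result = (
    df.drop(columns=["date", "event", "failure_y"])
      .rename(columns={"failure_x": "failure"})
)
```

Note that this relies on df1 having a single row per group, as in the question. With several df1 rows per group the merge would repeat each df2 row once per match, and the result would need to be reduced back to one row per interval.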


