Check for date and time between two columns in a pandas DataFrame

I have two data frames.

The first data frame is:

import pandas as pd
df1 = pd.DataFrame({'serialNo':['aaaa','bbbb','cccc','ffff','aaaa','bbbb','aaaa'],
               'Name':['Sayonti','Ruchi','Tony','Gowtam','Toffee','Tom','Sayonti'],
               'testName':   [4402, 3747 ,5555,8754,1234,9876,3602],
               'moduleName':   ['singing', 'dance','booze', 'vocals','drama','paint','singing'],
               'endResult': ['WARNING', 'FAILED', 'WARNING', 'FAILED','WARNING','FAILED','WARNING'],
               'Date':['2018-10-5','2018-10-6','2018-10-7','2018-10-8','2018-10-9','2018-10-10','2018-10-8'],
               'Time_df1':['23:26:39','22:50:31','22:15:28','21:40:19','21:04:15','20:29:11','19:54:03']})

The second data frame is:

df2 = pd.DataFrame({'serialNo':['aaaa','bbbb','aaaa','ffff','xyzy','aaaa'],
               'Food':['Strawberry','Coke','Pepsi','Nuts','Apple','Candy'],
               'Work':   ['AP', 'TC','OD', 'PU','NO','PM'],
               'Date':['2018-10-1','2018-10-6','2018-10-2','2018-10-3','2018-10-5','2018-10-10'],
               'Time_df2':['09:00:00','10:00:00','11:00:00','12:00:00','13:00:00','14:00:00']
               })

I am joining the two on the serial number:

df1['Date'] = pd.to_datetime(df1['Date'])
df2['Date'] = pd.to_datetime(df2['Date'])
result = pd.merge(df1,df2,on=['serialNo'],how='inner')

Now I want Date_y to lie within 3 days of Date_x, counting forward from Date_x; that is, Date_y should equal Date_x + (1, 2 or 3 days). I can get that as shown below, but I also want to check the time range, which I do not know how to do:

result = result[result.Date_x.sub(result.Date_y).dt.days.between(0,3)]

I want to check the time so that Time_df2 is within 6 hours of Time_df1, with Time_df1 as the start time. Please help.

You could add a column to your data frame that combines the date and the time. Here is an example of combining them for a single row:

import datetime

# `result` is the merged frame before the date filter (that filter drops row 0)
# Combine Date_x and Time_df1 for the first row
value_1_x = datetime.datetime.combine(
    result['Date_x'][0].date(),
    datetime.datetime.strptime(result['Time_df1'][0], '%H:%M:%S').time())

# Combine Date_y and Time_df2 for the first row
value_2_y = datetime.datetime.combine(
    result['Date_y'][0].date(),
    datetime.datetime.strptime(result['Time_df2'][0], '%H:%M:%S').time())
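
The same combination can probably be done for every row at once rather than one row at a time. The following is a minimal sketch, assuming the date columns are already datetime64 (as they are after pd.to_datetime) and the time columns hold 'HH:MM:SS' strings. The small frame is a stand-in for the merged result built from two rows of the question's data, and the column names `start` and `other` are hypothetical.

```python
import pandas as pd

# Stand-in for the merged frame; values are taken from the question's data
result = pd.DataFrame({
    'Date_x': pd.to_datetime(['2018-10-5', '2018-10-6']),
    'Time_df1': ['23:26:39', '22:50:31'],
    'Date_y': pd.to_datetime(['2018-10-1', '2018-10-6']),
    'Time_df2': ['09:00:00', '10:00:00'],
})

# A date at midnight plus a Timedelta parsed from 'HH:MM:SS' gives a full timestamp
result['start'] = result['Date_x'] + pd.to_timedelta(result['Time_df1'])  # hypothetical column name
result['other'] = result['Date_y'] + pd.to_timedelta(result['Time_df2'])  # hypothetical column name

print(result[['start', 'other']])
```

Adding a timedelta column to a date column avoids calling strptime on each row, which tends to matter once the merged frame gets large.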

Then, given two datetime objects, you can simply subtract one from the other to find the difference you are looking for:

difference = value_1_x - value_2_y
print(difference)

Which gives the output:

4 days, 14:26:39

My understanding is that you want to check whether something is within 3 days and 6 hours (a total of 78 hours). You can easily convert the difference to hours and then make the comparison:

hours_difference = abs(value_1_x - value_2_y).total_seconds() / 3600.0
print(hours_difference)

Which gives the output:

110.44416666666666
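
To apply that comparison to the whole frame instead of a single row, something along these lines might work. It is only a sketch under the 78-hour reading described above (absolute difference, in either direction), which may not be exactly what the question intends. The frame `merged` and the names `hours_apart` and `within_window` are hypothetical; the three rows are serial-number matches taken from the question's data.

```python
import pandas as pd

# Stand-in for the merged frame; three serialNo matches from the question's data
merged = pd.DataFrame({
    'Date_x': pd.to_datetime(['2018-10-5', '2018-10-6', '2018-10-9']),
    'Time_df1': ['23:26:39', '22:50:31', '21:04:15'],
    'Date_y': pd.to_datetime(['2018-10-1', '2018-10-6', '2018-10-10']),
    'Time_df2': ['09:00:00', '10:00:00', '14:00:00'],
})

# Full timestamps: date at midnight plus the time of day as a Timedelta
start = merged['Date_x'] + pd.to_timedelta(merged['Time_df1'])
other = merged['Date_y'] + pd.to_timedelta(merged['Time_df2'])

# Absolute gap in hours between the two timestamps, per row
hours_apart = (start - other).abs().dt.total_seconds() / 3600.0

# Keep rows whose timestamps are at most 78 hours (3 days + 6 hours) apart
within_window = merged[hours_apart <= 78]
print(within_window)
```

If only differences in one direction should count (for example, Date_y on or after Date_x), dropping the `.abs()` call and comparing against both a lower and an upper bound would be the natural adjustment.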

Hope that helps!
