Pandas: Split dataset based on overlapping time periods
I have reporting time periods that start on Mondays, end on Sundays, and run for 5 weeks. For example:
11/20/2017 - 12/24/2017 = t1
11/27/2017 - 12/31/2017 = t2
I have a dataframe that covers 6 of these periods (starting 11/20/2017), and I'm trying to split it into 6 dataframes, one for each time period, using the LeaveDate column. My data looks like this:
Barcode LeaveDate
ABC123 2017-11-22
ABC124 2017-12-04
ABC125 2017-12-15
As the dataframe is separated, some of the barcodes will fall into multiple periods; that's OK. I know I can do:
df['period'] = df['LeaveDate'].dt.to_period('W-SUN')
df['week'] = df['period'].dt.week
That gives me single weeks, but I don't know how to define a "multi-week" period. Another complication is that a barcode can fall under multiple periods, so it needs to be output to multiple dataframes. Any ideas? Thanks!
There might be a more succinct solution, but this should work (it will give you a dictionary of DataFrames, one for each period):
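import pandas as pd

df = pd.DataFrame([['ABC123', '2017-11-22'],
                   ['ABC124', '2017-12-04'],
                   ['ABC125', '2017-12-15']],
                  columns=['Barcode', 'LeaveDate'])
# Parse the dates so the comparisons below use timestamps, not raw strings
df['LeaveDate'] = pd.to_datetime(df['LeaveDate'])

# (start, end) of each reporting period, both ends inclusive
periods = [('2017-11-20', '2017-12-24'), ('2017-11-27', '2017-12-31')]

results = {}
for period in periods:
    # Keep the rows whose LeaveDate falls inside this period
    period_df = df[(df['LeaveDate'] >= period[0]) & (df['LeaveDate'] <= period[1])]
    results[period] = period_df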
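
If you would rather not type out all six periods by hand, the boundaries could probably be generated instead. The following is only a sketch under a few assumptions: the first period starts on Monday 2017-11-20, each later period starts 7 days after the previous one, and every period ends on the Sunday 34 days after its start. The labels t1 to t6 simply mirror the naming in the question, and the names starts and length are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Barcode": ["ABC123", "ABC124", "ABC125"],
        "LeaveDate": pd.to_datetime(["2017-11-22", "2017-12-04", "2017-12-15"]),
    }
)

# Assumption: six period starts, one per week, beginning Monday 2017-11-20.
starts = pd.date_range("2017-11-20", periods=6, freq="7D")
# Assumption: a 5-week period runs from its Monday to the Sunday 34 days later.
length = pd.Timedelta(days=34)

results = {}
for i, start in enumerate(starts, start=1):
    end = start + length
    # between() is inclusive on both ends by default
    mask = df["LeaveDate"].between(start, end)
    results[f"t{i}"] = df[mask]
```

Because each row is tested against every period independently, a barcode whose LeaveDate sits in several overlapping periods ends up in each of the corresponding DataFrames, which is the behaviour asked for. Periods with no matching rows come back as empty DataFrames.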