Python - Group by dates
I'm looking to speed up this task. It works, just slowly.
import datetime
from tqdm import tqdm

# Split the CSV data (already loaded into df) into two groups.
for index, row in tqdm(df.iterrows(), total=df.shape[0]):
    date_time_obj = datetime.datetime.strptime(row["date"], '%Y-%m-%d')
    if date_time_obj <= datetime.datetime.strptime("2020-03-11", '%Y-%m-%d'):
        group = "before"
    else:
        group = "after"
    df.loc[index, "group"] = group
    df.loc[index, "month"] = date_time_obj.month
ans = [y for x, y in df.groupby('group', as_index=False)]
To speed this up, you can do it in vectorized form (without iterrows):
import pandas as pd

df = pd.DataFrame({'date': pd.date_range('2020-03-08', '2020-03-14')})
df['group'] = pd.to_datetime(df['date']) <= pd.to_datetime('2020-03-11')
df['month'] = df['date'].dt.month
df
Output:
        date  group  month
0 2020-03-08   True      3
1 2020-03-09   True      3
2 2020-03-10   True      3
3 2020-03-11   True      3
4 2020-03-12  False      3
5 2020-03-13  False      3
6 2020-03-14  False      3
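If the goal is only to end up with two separate frames, as in the question, the boolean mask can be used to split the data directly. The following is a minimal sketch, not the answerer's code: the sample dates stand in for the CSV, and the names `dates`, `cutoff`, `is_before`, `before` and `after` are assumptions made for illustration.

```python
import pandas as pd

# Sample data standing in for the CSV; dates are strings, as in the question.
df = pd.DataFrame({"date": ["2020-03-08", "2020-03-11", "2020-03-12", "2020-04-01"]})

# Parse the whole column once instead of calling strptime for every row.
dates = pd.to_datetime(df["date"], format="%Y-%m-%d")
cutoff = pd.Timestamp("2020-03-11")

df["month"] = dates.dt.month
is_before = dates <= cutoff  # boolean mask; the cutoff day counts as "before"

# Split into two frames directly from the mask.
before = df[is_before]
after = df[~is_before]
```

Here `before` holds the rows up to and including the cutoff date and `after` holds the rest, so no per-row `df.loc` writes are needed.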
Much faster. Thank you. In the end I used:
df['group'] = tqdm(pd.to_datetime(df['date']) >= pd.to_datetime('2020-03-11'))
df.loc[df['group'] == True, 'group'] = "After"
df.loc[df['group'] == False, 'group'] = "Before"
df['month'] = pd.to_datetime(df['date']).dt.month
ans=[y for x, y in df.groupby('group', as_index=False)]
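The two `df.loc` assignments above overwrite a boolean column with strings. One possible alternative is to assign the labels in a single step with `numpy.where`. This is a sketch under assumptions: the sample data and the `dates` and `groups` names are illustrative, and it keeps the `>=` comparison from the code above, so the cutoff day itself lands in "After" (the loop in the question put it in "before"; change the comparison if that is the behavior you want).

```python
import numpy as np
import pandas as pd

# Sample data standing in for the real frame.
df = pd.DataFrame({"date": ["2020-03-10", "2020-03-11", "2020-03-12"]})
dates = pd.to_datetime(df["date"])

# Label every row in one step; no boolean column is overwritten afterwards.
df["group"] = np.where(dates >= pd.Timestamp("2020-03-11"), "After", "Before")
df["month"] = dates.dt.month

# Map each label to its sub-frame.
groups = {label: frame for label, frame in df.groupby("group")}
```

Keeping the sub-frames in a dict keyed by label means `groups["Before"]` and `groups["After"]` can be looked up by name instead of relying on list order.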