Looping over a dataframe between two dates in Python

Question

I would like to process all the data between two dates, moving the date window as I go. In particular, I have the following dataframe:

                   real    model2      model1
date                                               
2017-01-01 00:00:00   51.22   52.776425   52.583711
2017-01-01 01:00:00   53.00   47.211506   50.679937
2017-01-01 02:00:00   52.00   44.722529   48.478772
2017-01-01 03:00:00   51.00   42.475170   45.141708
2017-01-01 04:00:00   47.27   38.862827   44.583250
2017-01-01 05:00:00   45.49   39.473972   44.930338
2017-01-01 06:00:00   45.69   42.465659   47.380179

where the dates are also the index. I would like to collect all the data day by day in a list to pass to a function. I have done it in a way that is neither smart nor correct:

for iday in range(1,9):
   #
   #
   start_date = '2017-01-0'+str(iday)+ ' 00:00:00'
   end_date   = '2017-01-0'+str(iday)+ ' 23:00:00'
   #
   data_sub_e = EE.loc[start_date:end_date]

This does not feel right: it is hard to extend to ranges of 10 or more days, and it does not seem to use any pandas features.

Is there a smarter way to do this?

Thanks in advance,

Diego

Answer 1

I assume that date is of datetime type (not string).

Using df.index.date you can select rows by the date part only.

E.g.:

import pandas as pd
d1 = pd.to_datetime('2017-01-01').date()  # The criterion date, as a datetime.date
df[df.index.date == d1]   # Get all rows from this date, whatever the hour part
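
A minimal sketch of this selection on a small synthetic hourly frame; the values and the single `real` column are made up for illustration. It compares `df.index.date` against a plain `datetime.date`, and also shows `normalize()` as an alternative that stays within pandas timestamps:

```python
import datetime

import numpy as np
import pandas as pd

# Synthetic hourly data covering two days (illustrative values only).
idx = pd.date_range("2017-01-01", periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"real": np.arange(48, dtype=float)}, index=idx)

# df.index.date holds datetime.date objects, so compare against a date.
d1 = datetime.date(2017, 1, 1)
one_day = df[df.index.date == d1]

# Alternative: strip the time part of the index and compare Timestamps.
same_day = df[df.index.normalize() == pd.Timestamp("2017-01-01")]
```

Both selections return the 24 rows of 2017-01-01, whatever the hour part.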

Another hint: instead of your loop based on the day number:

for iday in range(1,9):

run a loop based on pd.date_range, something like:

for dat in pd.date_range('2017-01-01', '2017-01-15', freq='D'):

Of course, set the end date according to your needs.
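
As a rough sketch of how such a loop could collect one sub-frame per day, assuming a frame named `EE` as in the question (filled here with synthetic values) and a half-open window per day, which is my choice rather than something stated in the answer:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the asker's EE frame: three days of hourly rows.
idx = pd.date_range("2017-01-01", periods=72, freq=pd.Timedelta(hours=1))
EE = pd.DataFrame({"real": np.arange(72, dtype=float)}, index=idx)

daily = []
for day in pd.date_range("2017-01-01", "2017-01-03", freq="D"):
    # Half-open window [day, day + 1 day), so no row lands in two days.
    next_day = day + pd.Timedelta(days=1)
    data_sub_e = EE[(EE.index >= day) & (EE.index < next_day)]
    daily.append(data_sub_e)
```

Because the dates come from `pd.date_range`, the loop works unchanged across month boundaries and for any number of days.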

Another choice is to group your DataFrame by the date part of the index:

df.groupby(pd.Grouper(freq='D'))

and then apply your function to each group.

Edit following the comment
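
One way this could look in practice is to iterate over the groups directly. The `process` function below is a hypothetical placeholder for whatever you want to run on each day, and the data is synthetic:

```python
import numpy as np
import pandas as pd


def process(day_frame):
    # Hypothetical stand-in for the real per-day function:
    # mean absolute error of model1 against the real values.
    return float((day_frame["real"] - day_frame["model1"]).abs().mean())


idx = pd.date_range("2017-01-01", periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame(
    {"real": np.arange(48, dtype=float),
     "model1": np.arange(48, dtype=float) + 1.0},
    index=idx,
)

results = {}
for day, day_frame in df.groupby(pd.Grouper(freq="D")):
    if day_frame.empty:
        # Days without any rows can show up as empty groups; skip them.
        continue
    results[day.date()] = process(day_frame)
```

Each `day_frame` is an ordinary DataFrame holding one calendar day, so it can be passed to any function that accepts the original frame.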

To turn the values of each group into lists, you can pass agg a dictionary that maps each column to list:

df.groupby(pd.Grouper(freq='D')).agg({'real': list,
    'model1': list, 'model2': list})

If you want to assign your own column names, you can use another syntax, named aggregation, with keyword parameters:

df.groupby(pd.Grouper(freq='D')).agg(Real=('real', list),
    Model_1=('model1', list), Model_2=('model2', list))

Here the parameter names specify the output column names. The value of each parameter is a tuple: (original column name, aggregation function).
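
A small sketch of what this returns, using a made-up frame with two rows per day so the lists are easy to read (the values are illustrative, not taken from the question):

```python
import pandas as pd

# Two days, two readings per day (synthetic values).
idx = pd.date_range("2017-01-01", periods=4, freq=pd.Timedelta(hours=12))
df = pd.DataFrame(
    {"real": [1.0, 2.0, 3.0, 4.0],
     "model1": [1.5, 2.5, 3.5, 4.5]},
    index=idx,
)

per_day = df.groupby(pd.Grouper(freq="D")).agg(
    Real=("real", list),
    Model_1=("model1", list),
)

# Each cell holds a plain Python list with that day's values.
first_day_real = per_day["Real"].iloc[0]
```

The result has one row per day, and a cell such as `first_day_real` can be handed straight to a function that expects a list.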

Answer 1
0 2019-11-11 13:04:00
