Selecting rows at fixed time intervals in a datetime-indexed pandas DataFrame

Question

I have a pandas DataFrame indexed by DateTime, running from "00:00:00" to "23:59:00" in one-minute increments (seconds are not recorded).

in: df.index
out: DatetimeIndex(['2018-10-08 00:00:00', '2018-10-08 00:00:00',
           '2018-10-08 00:00:00', '2018-10-08 00:00:00',
           '2018-10-08 00:00:00', '2018-10-08 00:00:00',
           '2018-10-08 00:00:00', '2018-10-08 00:00:00',
           '2018-10-08 00:00:00', '2018-10-08 00:00:00',
           ...
           '2018-10-08 23:59:00', '2018-10-08 23:59:00',
           '2018-10-08 23:59:00', '2018-10-08 23:59:00',
           '2018-10-08 23:59:00', '2018-10-08 23:59:00',
           '2018-10-08 05:16:00', '2018-10-08 07:08:00',
           '2018-10-08 13:58:00', '2018-10-08 09:30:00'],
          dtype='datetime64[ns]', name='DateTime', length=91846, freq=None)

Now I want to choose a specific interval, say every 1 minute or every 1 hour, starting from "00:00:00", and retrieve all the rows that lie that interval apart, consecutively.

I can grab an entire interval, say the first hour, with

df.between_time("00:00:00","01:00:00")

But I want to be able to

(a) get only the times that are a specific interval apart, and
(b) get all the 1-hour intervals without having to ask for them manually 24 times.

How do I increment the DatetimeIndex inside the between_time command? Is there a better way than that?

Answer 1

I would solve this problem with masking rather than by making new DataFrames. For example, you can add a column df['which_one'] and assign a different number to each subset. Then you can access a subset with df[df['which_one']==x], where x is the subset you want to select. Accessing the data this way, you can still apply other conditional statements and just about everything else that pandas has to offer.

PS: There are other ways to access the data that might be faster; I just used the one I'm most comfortable with. Another way would be df[df['which_one'].eq(x)].
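A minimal sketch of the masking idea, assuming the subsets are the 24 hours of the day. The sample frame, the `value` column, and the 15-minute spacing are made up for illustration; only `which_one` comes from the answer above.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: one day at minute resolution,
# with every timestamp repeated three times (the real index also has repeats).
index = pd.date_range("2018-10-08", periods=24 * 60, freq="min", name="DateTime").repeat(3)
df = pd.DataFrame({"value": range(len(index))}, index=index)

# Label each row with the hour it falls in (0..23); this is the 'which_one' column.
df["which_one"] = df.index.hour

# (b) Any 1-hour block is now a single boolean mask, no between_time needed.
first_hour = df[df["which_one"] == 0]

# (a) Rows whose timestamp sits exactly on a 15-minute mark counted from 00:00:00.
# floor() rounds each timestamp down to the mark; rows already on a mark are unchanged.
on_the_mark = df[df.index == df.index.floor("15min")]

print(len(first_hour), len(on_the_mark))
```

Because the mask is computed from the index values themselves, it does not matter that the index is unsorted or contains duplicates.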

Answer 2

If you are dead set on separate DataFrames, I would suggest keeping them in a dictionary of DataFrames, such as:

import pandas as pd

# Build a dictionary holding ten empty DataFrames, keyed 0..9
dfdict = {}

for i in range(0, 10):
    dfdict[i] = pd.DataFrame()

print(dfdict)

As you will see, they are indeed DataFrames:

out[1]
{0: Empty DataFrame
Columns: []
Index: [], 1: Empty DataFrame
Columns: []
Index: [], 2: Empty DataFrame
Columns: []
Index: [], 3: Empty DataFrame
Columns: []
Index: [], 4: Empty DataFrame
Columns: []
Index: [], 5: Empty DataFrame
Columns: []
Index: [], 6: Empty DataFrame
Columns: []
Index: [], 7: Empty DataFrame
Columns: []
Index: [], 8: Empty DataFrame
Columns: []
Index: [], 9: Empty DataFrame
Columns: []
Index: []}

That said, as others have suggested, there may be a more practical way to solve your problem; it is hard to say without more specifics.
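For the question's case, the dictionary could be filled with one DataFrame per hour rather than with empty frames. This is only a sketch under assumptions: the sample data and the `value` column are invented, and grouping on the floored index is one possible way to do the split, not necessarily what the answer had in mind.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: one day at minute resolution,
# with every timestamp repeated twice.
index = pd.date_range("2018-10-08", periods=24 * 60, freq="min", name="DateTime").repeat(2)
df = pd.DataFrame({"value": range(len(index))}, index=index)

# Round every timestamp down to the start of its 60-minute block and group on that.
# Each key is the block's starting Timestamp; each value holds that block's rows.
dfdict = {start: chunk for start, chunk in df.groupby(df.index.floor("60min"))}

# Look up the block that starts at 05:00.
five_oclock = dfdict[pd.Timestamp("2018-10-08 05:00:00")]
print(len(dfdict), len(five_oclock))
```

Changing "60min" to another width, such as "15min", gives a different set of blocks without any manual calls to between_time.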

Answer 1
0 2019-02-20 18:21:42

Answer 2
0 2019-02-20 18:26:22