
Choose rows a fixed time interval apart in a Datetime-indexed pandas dataframe

I have a pandas dataframe indexed by DateTime, running from "00:00:00" to "23:59:00" in one-minute steps (seconds are not recorded).

in: df.index
out: DatetimeIndex(['2018-10-08 00:00:00', '2018-10-08 00:00:00',
           '2018-10-08 00:00:00', '2018-10-08 00:00:00',
           '2018-10-08 00:00:00', '2018-10-08 00:00:00',
           '2018-10-08 00:00:00', '2018-10-08 00:00:00',
           '2018-10-08 00:00:00', '2018-10-08 00:00:00',
           ...
           '2018-10-08 23:59:00', '2018-10-08 23:59:00',
           '2018-10-08 23:59:00', '2018-10-08 23:59:00',
           '2018-10-08 23:59:00', '2018-10-08 23:59:00',
           '2018-10-08 05:16:00', '2018-10-08 07:08:00',
           '2018-10-08 13:58:00', '2018-10-08 09:30:00'],
          dtype='datetime64[ns]', name='DateTime', length=91846, freq=None)

Now I want to choose a specific interval, say every 1 minute or every 1 hour, starting from "00:00:00", and retrieve all the rows that lie that interval apart, consecutively.

I can grab an entire interval, say the first hour, with

df.between_time("00:00:00", "01:00:00")

But I want to be able to

(a) get only the times that are a specific interval apart
(b) get all the 1-hour intervals without having to ask for them manually 24 times.

How do I increment the DatetimeIndex inside the between_time command? Is there a better way than that?

I would solve this problem with masking rather than making new dataframes. For example, you can add a column df['which_one'] and set a different number for each subset. Then you can access a subset by calling df[df['which_one']==x], where x is the subset you want to select. You can still use other conditional statements, and just about everything else that pandas has to offer, when you access the data this way.

PS There are other methods to access the data that might be faster; I just used the one I'm most comfortable with. Another way would be df[df['which_one'].eq(x)].
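As a minimal sketch of the masking idea, assuming the subsets you want are the hours of the day: the dataframe below is made-up stand-in data (the value column and the duplicated minute timestamps are my assumptions, not your real data), and which_one is filled from df.index.hour.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: one day of minute-resolution
# timestamps, each repeated twice, plus a made-up "value" column.
idx = pd.date_range("2018-10-08 00:00", "2018-10-08 23:59", freq="min").repeat(2)
df = pd.DataFrame({"value": np.arange(len(idx))}, index=idx)
df.index.name = "DateTime"

# Label every row with the hour it falls in (0..23); this is the subset id.
df["which_one"] = df.index.hour

# Select one subset with a boolean mask instead of building 24 dataframes.
first_hour = df[df["which_one"] == 0]
same_thing = df[df["which_one"].eq(0)]

# Masks combine with other conditions as usual.
first_hour_small = df[(df["which_one"] == 0) & (df["value"] < 10)]
```

Looping over range(24) and masking on each value gives you every 1-hour subset without writing the selection out 24 times.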

If you are dead set on separate dataframes, I would suggest using a dictionary of dataframes, such as:

import pandas as pd

dfdict={}

for i in range(0,10):
    dfdict[i]=pd.DataFrame()

print(dfdict)

As you will see, they are indeed dataframes:

out[1]
{0: Empty DataFrame
Columns: []
Index: [], 1: Empty DataFrame
Columns: []
Index: [], 2: Empty DataFrame
Columns: []
Index: [], 3: Empty DataFrame
Columns: []
Index: [], 4: Empty DataFrame
Columns: []
Index: [], 5: Empty DataFrame
Columns: []
Index: [], 6: Empty DataFrame
Columns: []
Index: [], 7: Empty DataFrame
Columns: []
Index: [], 8: Empty DataFrame
Columns: []
Index: [], 9: Empty DataFrame
Columns: []
Index: []}
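The dictionary does not have to stay empty. Here is a rough sketch, assuming one entry per hour is what you are after, that fills it with between_time in a loop; the dataframe is hypothetical stand-in data with a made-up value column.

```python
import pandas as pd

# Hypothetical data: one row per minute for a single day.
idx = pd.date_range("2018-10-08 00:00", "2018-10-08 23:59", freq="min")
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

dfdict = {}

for hour in range(24):
    # between_time includes both endpoints by default, so HH:00:00 to
    # HH:59:00 covers the whole hour for minute-resolution timestamps.
    start = f"{hour:02d}:00:00"
    end = f"{hour:02d}:59:00"
    dfdict[hour] = df.between_time(start, end)

print(len(dfdict))     # number of hourly dataframes
print(dfdict[0].shape) # rows and columns in the first hour
```

This keeps the between_time call from the question and lets the loop build the time strings.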

Although, as others have suggested, there might be a more practical approach to your problem (difficult to say without more specifics of the issue).
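One possible approach of that kind, sketched under assumptions rather than tested against your data: floor the index to the chosen interval with DatetimeIndex.floor, then either compare it with the original index (part a) or group by it (part b). The dataframe, the value column and the interval variable are hypothetical.

```python
import pandas as pd

# Hypothetical data: minute-resolution timestamps with duplicates.
idx = pd.date_range("2018-10-08 00:00", "2018-10-08 23:59", freq="min").repeat(2)
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

interval = "60min"  # assumed interval; e.g. "15min" works the same way

# Start of the interval each row belongs to.
floored = df.index.floor(interval)

# (a) only the rows whose timestamp sits exactly on an interval boundary
on_boundary = df[df.index == floored]

# (b) every interval-long block at once, keyed by the block's start time
blocks = {start: block for start, block in df.groupby(floored)}
```

Note that floor counts interval boundaries from a fixed origin rather than from your first timestamp. For intervals that divide a day evenly, such as 1 minute or 1 hour, the boundaries still line up with "00:00:00".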

