Getting the wrong window back from df.rolling

I'm converting a series of data from 1-minute intervals to 5-minute intervals. To do this I am using the rolling and sum functions from pandas, then attempting to slice in steps of 5. This makes sense to me: sum everything up, then take the rows that have the information I want, which is every 5th row.

However, my code is not slicing how I intended and is instead ignoring row 0 for the operations. In the attached picture, the left column is the sliced output and the right column is the unsliced output. As the picture of results shows, the first slice happens at row 5 instead of row 4 (the 5th data entry). I am getting the right length back in my slice, but clearly not all of the data I need. My code is below, with tmpL on the left and tmpR on the right. I am expecting to get rows 4 and 9 back instead of rows 5 and 10. What is the proper notation I should be using?

tmpL = df.rolling(window=5).sum()[::5]
tmpR = df.rolling(window=5).sum()
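
A minimal reproduction may help show which rows the step slice picks. The single `value` column and the 12 sample readings are assumptions made for illustration, not the original data; the index is the default 0, 1, 2, ... integer index.

```python
import pandas as pd

# Hypothetical sample data: 12 one-minute readings with the default RangeIndex 0..11
df = pd.DataFrame({"value": range(1, 13)})

tmpR = df.rolling(window=5).sum()  # rows 0-3 are NaN: fewer than 5 observations so far
tmpL = tmpR[::5]                   # a step slice starts at position 0

# Shows which row labels the slice kept
print(tmpL.index.tolist())
```

Because `[::5]` starts counting at position 0, it keeps positions 0, 5, 10, ... rather than 4, 9, ..., so the first complete 5-row sum (at row 4) is skipped.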

If this were me, I would take the following approach:

tmp_df = df.rolling(window=5).sum()
df_5 = tmp_df[(tmp_df.index % 5) == 4]

df_5 will be your desired output.
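
As a sketch of how this behaves, the snippet below runs the modulo filter on hypothetical sample data (the `value` column and its 12 readings are assumptions for illustration) and compares it with a position-based slice, `iloc[4::5]`. The modulo filter relies on the index being the default integers 0, 1, 2, ...; the `iloc` variant selects by position and so should not depend on the index labels.

```python
import pandas as pd

# Hypothetical sample data: 12 one-minute readings with the default RangeIndex 0..11
df = pd.DataFrame({"value": range(1, 13)})

tmp_df = df.rolling(window=5).sum()

# Label-based filter from the answer: keeps rows whose index label is 4, 9, 14, ...
df_5 = tmp_df[(tmp_df.index % 5) == 4]

# Position-based alternative: start at the 5th row, then take every 5th row
df_5_by_position = tmp_df.iloc[4::5]

print(df_5)
print(df_5.equals(df_5_by_position))
```

With a default integer index, both selections return rows 4 and 9, which are the first and second complete 5-row sums.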
