The rolling window function pandas.DataFrame.rolling in pandas 0.22 takes a window argument that is described as follows:
window : int, or offset
Size of the moving window. This is the number of observations used for calculating the statistic. Each window will be a fixed size.
If its an offset then this will be the time period of each window. Each window will be a variable sized based on the observations included in the time-period. This is only valid for datetimelike indexes. This is new in 0.19.0
What actually is an offset in this context?
In a nutshell, if you use an offset like "2D" (2 days), pandas will use the datetime info in the index (if available), potentially accounting for any missing rows or irregular frequencies. But if you use a simple int like 2, then pandas will treat the index as a simple integer index [0,1,2,...] and ignore any datetime info in the index.
A simple example should make this clear:
import pandas as pd
df = pd.DataFrame({'x': range(4)},
                  index=pd.to_datetime(['1-1-2018', '1-2-2018', '1-4-2018', '1-5-2018']))
x
2018-01-01 0
2018-01-02 1
2018-01-04 2
2018-01-05 3
Note that (1) the index is a datetime, but also (2) it is missing '2018-01-03'. So if you use a plain integer like 2, rolling will just look at the last two rows, regardless of the datetime value (in a sense it's behaving like iloc[i-1:i+1] where i is the current row):
df.rolling(2).count()
x
2018-01-01 1.0
2018-01-02 2.0
2018-01-04 2.0
2018-01-05 2.0
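As a rough check of the row-based behaviour, here is a minimal sketch that compares an integer window against manual positional slices. The two-row window ending at row i covers rows i-1 and i, which is iloc[i-1:i+1] in slice notation because the slice end is exclusive. The names rolled and manual are only for illustration, the dates are the same ones written in ISO format, and min_periods=1 is set explicitly so the first (partial) window is not NaN.

```python
import pandas as pd

df = pd.DataFrame(
    {'x': range(4)},
    index=pd.to_datetime(['2018-01-01', '2018-01-02', '2018-01-04', '2018-01-05']),
)

# Integer window: the last two rows, whatever their dates are.
rolled = df['x'].rolling(2, min_periods=1).sum()

# The same sums built by hand from positional slices.
# max(i - 1, 0) keeps the first window from wrapping around to the end.
manual = [df['x'].iloc[max(i - 1, 0):i + 1].sum() for i in range(len(df))]

print(rolled.tolist())
print(manual)
```

Both lists should agree, which shows the integer window never consults the dates.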
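Conversely, if you use an offset of 2 days ('2D'), rolling will use the actual datetime values and account for any irregularities in the datetime index. The window ending at 2018-01-04 then holds only that one row, because 2018-01-03 is missing and 2018-01-02 sits exactly on the excluded left edge of the window: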
df.rolling('2D').count()
x
2018-01-01 1.0
2018-01-02 2.0
2018-01-04 1.0
2018-01-05 2.0
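To make the difference easier to see with a statistic other than count, here is a small sketch that runs sum over both kinds of window and also rebuilds the offset result by hand. It assumes the default behaviour for offset windows, where each window is the half-open interval (t - 2 days, t]. The names by_rows, by_time and manual are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {'x': range(4)},
    index=pd.to_datetime(['2018-01-01', '2018-01-02', '2018-01-04', '2018-01-05']),
)

# Row-based window: always the current row plus the one before it.
by_rows = df['x'].rolling(2, min_periods=1).sum()

# Time-based window: only rows whose timestamp falls in (t - 2 days, t].
by_time = df['x'].rolling('2D').sum()

# Hand-built equivalent of the time-based window using boolean masks.
two_days = pd.Timedelta('2D')
manual = [
    df['x'][(df.index > t - two_days) & (df.index <= t)].sum()
    for t in df.index
]

print(by_rows.tolist())  # [0.0, 1.0, 3.0, 5.0]
print(by_time.tolist())  # [0.0, 1.0, 2.0, 5.0]
print(manual)
```

The two results differ only at 2018-01-04, the row that follows the gap in the index.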
Also note, you need the index to be sorted in ascending order when using a date offset, but it doesn't matter when using a simple integer (since you're just ignoring the index anyway).
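The sorting requirement can be sketched like this. The shuffled frame below is made up for illustration. An integer window runs on it without complaint, the offset window is expected to raise a ValueError, and sort_index() fixes it.

```python
import pandas as pd

# Same four dates as above, deliberately out of order.
unsorted = pd.DataFrame(
    {'x': [2, 0, 3, 1]},
    index=pd.to_datetime(['2018-01-04', '2018-01-01', '2018-01-05', '2018-01-02']),
)

# An integer window ignores the index, so the order of the dates is irrelevant.
by_rows = unsorted['x'].rolling(2, min_periods=1).sum()

# An offset window needs a monotonic datetime index and refuses this one.
try:
    unsorted['x'].rolling('2D').sum()
    offset_error = None
except ValueError as exc:
    offset_error = exc

# Sorting the index first makes the offset window usable.
by_time = unsorted.sort_index()['x'].rolling('2D').sum()

print(by_rows.tolist())
print(offset_error)
print(by_time.tolist())
```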
It seems like this only works when you have a single index; it doesn't seem to work with a MultiIndex. Has anyone else run into similar problems?