Calculate the actual duration of a pandas rolling offset window

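Pandas has a rolling() function that performs calculations over windows of Series and DataFrame objects. If the index is datetime-like (or you use the on parameter to reference a datetime column), rolling() can be performed over an offset, for example 2 seconds or 7 days.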
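To make the offset behaviour concrete, here is a minimal sketch. The sample data and the variable names (by_index, by_column) are illustrative and not part of the original post. It assumes the default closed='right' setting, so the window ending at time t covers the interval (t - 2s, t].

```python
import pandas as pd

# Illustrative data: five irregularly spaced timestamps.
df = pd.DataFrame({
    'B': [0, 1, 2, 3, 4],
    'Tm': pd.to_datetime(['2013-01-01 09:00:00',
                          '2013-01-01 09:00:02',
                          '2013-01-01 09:00:03',
                          '2013-01-01 09:00:05',
                          '2013-01-01 09:00:06']),
})

# Option 1: the timestamps are the index.
by_index = df.set_index('Tm').rolling('2s').sum()

# Option 2: the timestamps stay in a column, referenced through `on`.
by_column = df.rolling('2s', on='Tm').sum()

print(by_index)
print(by_column)
```

Both calls sum B over the same 2-second windows. The number of rows in each window varies with the spacing of the timestamps, which is why the time actually spanned by a window can be shorter than the offset.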

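I would like to calculate the actual duration of each window, rather than the offset. The best approach I could come up with is to copy the timestamp column, use one copy as the time axis for rolling(), and take the rolling min and max of the other. However, the copied Timestamp column is dropped from the result of rolling().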

import pandas as pd

df = pd.DataFrame({'B': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
                   'Tm': [pd.Timestamp('20130101 09:00:00'),
                          pd.Timestamp('20130101 09:00:02'),
                          pd.Timestamp('20130101 09:00:03'),
                          pd.Timestamp('20130101 09:00:05'),
                          pd.Timestamp('20130101 09:00:06'),
                          pd.Timestamp('20130101 09:00:10'),
                          pd.Timestamp('20130101 09:00:12'),
                          pd.Timestamp('20130101 09:00:16'),
                          pd.Timestamp('20130101 09:00:19'),
                          pd.Timestamp('20130101 09:00:20')]})

df['t'] = df['Tm']
print(df)
max_times = df.rolling('5s', on='Tm').max()
min_times = df.rolling('5s', on='Tm').min()
print(max_times)
print((max_times - min_times).astype('timedelta64[s]'))

Output:

   B                  Tm                   t
0  0 2013-01-01 09:00:00 2013-01-01 09:00:00
1  1 2013-01-01 09:00:02 2013-01-01 09:00:02
2  2 2013-01-01 09:00:03 2013-01-01 09:00:03
3  3 2013-01-01 09:00:05 2013-01-01 09:00:05
4  4 2013-01-01 09:00:06 2013-01-01 09:00:06
5  5 2013-01-01 09:00:10 2013-01-01 09:00:10
6  6 2013-01-01 09:00:12 2013-01-01 09:00:12
7  7 2013-01-01 09:00:16 2013-01-01 09:00:16
8  8 2013-01-01 09:00:19 2013-01-01 09:00:19
9  9 2013-01-01 09:00:20 2013-01-01 09:00:20
     B                  Tm
0  0.0 2013-01-01 09:00:00
1  1.0 2013-01-01 09:00:02
2  2.0 2013-01-01 09:00:03
3  3.0 2013-01-01 09:00:05
4  4.0 2013-01-01 09:00:06
5  5.0 2013-01-01 09:00:10
6  6.0 2013-01-01 09:00:12
7  7.0 2013-01-01 09:00:16
8  8.0 2013-01-01 09:00:19
9  9.0 2013-01-01 09:00:20
         B   Tm
0 00:00:00  0.0
1 00:00:01  0.0
2 00:00:02  0.0
3 00:00:02  0.0
4 00:00:03  0.0
5 00:00:01  0.0
6 00:00:01  0.0
7 00:00:01  0.0
8 00:00:01  0.0
9 00:00:02  0.0

Surely there is a more elegant (and working) technique?

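I achieved this by:

  • setting the timestamp column as the index,
  • defining a function that takes one window from rolling() (passed in as a Series that carries the datetime index), converts the index to integers, and returns the difference between the maximum and minimum of the index array,
  • calling rolling() on the DataFrame and using apply(), which lets you specify the function to run on each window.

The documentation for apply() is here: https://pandas.pydata.org/docs/reference/api/pandas.core.window.rolling.Rolling.apply.html

Example: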

import pandas as pd

df = pd.DataFrame({'B': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                  'Tm': [pd.Timestamp('20130101 09:00:00'),
                         pd.Timestamp('20130101 09:00:02'),
                         pd.Timestamp('20130101 09:00:03'),
                         pd.Timestamp('20130101 09:00:05'),
                         pd.Timestamp('20130101 09:00:06'),
                         pd.Timestamp('20130101 09:00:10'),
                         pd.Timestamp('20130101 09:00:12'),
                         pd.Timestamp('20130101 09:00:16'),
                         pd.Timestamp('20130101 09:00:19'),
                         pd.Timestamp('20130101 09:00:20')]})

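def duration(X):
    # X is one rolling window (a Series); its index holds the window's timestamps.
    ind = pd.to_numeric(X.index) * 10**-9  # Convert from nanoseconds to seconds.
    return ind.max() - ind.min()

df = df.set_index("Tm")
print(df)
durations = df.rolling("5s").apply(duration)
df = df.reset_index()  # Restore 'Tm' as a regular column.
print(durations)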

Output:

                     B
Tm                    
2013-01-01 09:00:00  0
2013-01-01 09:00:02  0
2013-01-01 09:00:03  0
2013-01-01 09:00:05  0
2013-01-01 09:00:06  0
2013-01-01 09:00:10  0
2013-01-01 09:00:12  0
2013-01-01 09:00:16  0
2013-01-01 09:00:19  0
2013-01-01 09:00:20  0
                       B
Tm                      
2013-01-01 09:00:00  0.0
2013-01-01 09:00:02  2.0
2013-01-01 09:00:03  3.0
2013-01-01 09:00:05  3.0
2013-01-01 09:00:06  4.0
2013-01-01 09:00:10  4.0
2013-01-01 09:00:12  2.0
2013-01-01 09:00:16  4.0
2013-01-01 09:00:19  3.0
2013-01-01 09:00:20  4.0
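A possible alternative that avoids calling a Python function for every window is sketched below. It is not from the original answer and rests on one assumption: the timestamps are sorted, so the latest timestamp in each window is the current row, and only the earliest one has to be found. The names times, elapsed, window_start and durations are illustrative.

```python
import pandas as pd

times = pd.to_datetime(['2013-01-01 09:00:00', '2013-01-01 09:00:02',
                        '2013-01-01 09:00:03', '2013-01-01 09:00:05',
                        '2013-01-01 09:00:06', '2013-01-01 09:00:10',
                        '2013-01-01 09:00:12', '2013-01-01 09:00:16',
                        '2013-01-01 09:00:19', '2013-01-01 09:00:20'])

# Seconds elapsed since the first timestamp, indexed by the timestamps,
# so the time axis also exists as plain numbers that rolling() can aggregate.
elapsed = pd.Series((times - times[0]).total_seconds(), index=times)

# Earliest elapsed time inside each 5-second window.
window_start = elapsed.rolling('5s').min()

# Assumes a sorted index: the current row is the newest point in its window.
durations = elapsed - window_start
print(durations)
```

On the sample data this yields the same durations as the apply() version (0, 2, 3, 3, 4, 4, 2, 4, 3, 4 seconds), using only the built-in rolling min.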
