
Non-fixed rolling window

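I would like to implement a rolling window over a list, but instead of a fixed window length I want to supply a list of window lengths, one per position.
Something like this: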

l1 = [5, 3, 8, 2, 10, 12, 13, 15, 22, 28]
l2 = [1, 2, 2, 2, 3, 4, 2, 3, 5, 3]
get_custom_roling( l1, l2, np.average)

The result would be:

[5, 4, 5.5, 5, 6.67, ....]

6.67 is computed as the average of the 3 elements 10, 2, 8.
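To make the intended semantics concrete, here is a minimal sketch of how each window could be formed: position i takes the last l2[i] values of l1 ending at i. The helper name trailing_windows is hypothetical, and clamping windows that would run past the start of the list is an assumption that the example data never exercises.

```python
import numpy as np

l1 = [5, 3, 8, 2, 10, 12, 13, 15, 22, 28]
l2 = [1, 2, 2, 2, 3, 4, 2, 3, 5, 3]


def trailing_windows(values, lengths):
    """Yield, for each position i, the last lengths[i] values ending at i."""
    for i, n in enumerate(lengths):
        # Assumption: a window longer than the available history is clamped to index 0
        start = max(i + 1 - n, 0)
        yield values[start:i + 1]


windows = list(trailing_windows(l1, l2))
averages = [float(np.average(w)) for w in windows]

print(windows[4])             # the window behind the 6.67 value: [8, 2, 10]
print(round(averages[4], 2))  # 6.67
```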

I implemented a slow solution; any idea for making it faster is welcome :):

import numpy as np
import pandas as pd


def get_the_list(end_point, number_points):
    """
    Indices of the window of number_points elements ending at end_point (newest first).
    example: get_the_list(6, 3) ==> [6, 5, 4]
    example: get_the_list(9, 5) ==> [9, 8, 7, 6, 5]
    """
    if np.isnan(number_points):
        return []
    number_points = int(number_points)
    return list(range(end_point, end_point - number_points, -1))


def get_idx(s):
    # enumerate yields (position, window length) pairs, one per element
    return (get_the_list(*elem) for elem in enumerate(s))


def get_custom_roling(s, ss, funct):
    s = pd.Series(s)  # accept plain lists as well as Series
    # Positional lookup of each window, then aggregate it with funct
    agg_stuff = [s.iloc[elem] for elem in get_idx(ss)]
    res_agg_stuff = [funct(elem) for elem in agg_stuff]
    return pd.Series(data=res_agg_stuff, index=s.index)

Pandas custom window rolling lets you vary the window size.

Brief explanation: the start and end arrays hold the index values used to slice the data.

#start = [0  0  1  2  2  2  5  5  4  7]
#end =   [1  2  3  4  5  6  7  8  9 10]
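As a rough illustration of what these two arrays mean, the sketch below builds them with plain NumPy and treats window i as the half-open slice l1[start[i]:end[i]]. The prefix-sum step is my own addition, not part of the answer: it is one possible way to get all the means without a Python loop, and it only applies to sums and means, not to arbitrary aggregation functions.

```python
import numpy as np

l1 = np.array([5, 3, 8, 2, 10, 12, 13, 15, 22, 28], dtype=np.float64)
l2 = np.array([1, 2, 2, 2, 3, 4, 2, 3, 5, 3], dtype=np.int64)

end = np.arange(1, len(l1) + 1)   # exclusive end of each window
start = end - l2                  # inclusive start of each window

# Window i is the half-open slice l1[start[i]:end[i]]
windows = [l1[s:e] for s, e in zip(start, end)]

# Prefix sums with a leading zero, so that sum(l1[s:e]) == csum[e] - csum[s]
csum = np.concatenate(([0.0], np.cumsum(l1)))
means = (csum[end] - csum[start]) / (end - start)

print(start.tolist())
print(np.round(means, 2).tolist())
```

This assumes every window length is at least 1 and never exceeds the number of elements available up to that position; otherwise start would go negative or the division would fail.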

The arguments passed to get_window_bounds are defined by the BaseIndexer interface.

import pandas as pd
import numpy as np
from pandas.api.indexers import BaseIndexer
from typing import Optional, Tuple


class CustomIndexer(BaseIndexer):

    def get_window_bounds(self,
                          num_values: int = 0,
                          min_periods: Optional[int] = None,
                          center: Optional[bool] = None,
                          closed: Optional[str] = None,
                          step: Optional[int] = None
                          ) -> Tuple[np.ndarray, np.ndarray]:
        # `step` is part of the expected signature in recent pandas (1.5+);
        # drop it if you are on an older release.

        # Each window ends just after its own row (end is exclusive) ...
        end = np.arange(1, num_values+1, dtype=np.int64)
        # ... and reaches back by that row's window length.
        start = end - np.array(self.custom_name_whatever, dtype=np.int64)
        return start, end


df = pd.DataFrame({"l1": [5, 3, 8, 2, 10, 12, 13, 15, 22, 28],
                   "l2": [1, 2, 2, 2,  3,  4,  2,  3,  5,  3]})

# Keyword arguments given to BaseIndexer become attributes on the indexer
indexer = CustomIndexer(custom_name_whatever=df.l2)

df["variable_mean"] = df.l1.rolling(indexer).mean()

print(df)

Output:

   l1  l2  variable_mean
0   5   1       5.000000
1   3   2       4.000000
2   8   2       5.500000
3   2   2       5.000000
4  10   3       6.666667
5  12   4       8.000000
6  13   2      12.500000
7  15   3      13.333333
8  22   5      14.400000
9  28   3      21.666667
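The question passes an arbitrary aggregation function, so the same idea can presumably be combined with rolling(...).apply. The sketch below is a variation under stated assumptions, not the answer's code: the class name VariableWindowIndexer and the keyword lengths are hypothetical, the step parameter assumes a recent pandas (1.5 or later), and clipping start at 0 is an added guard for windows longer than the available history.

```python
import numpy as np
import pandas as pd
from pandas.api.indexers import BaseIndexer


class VariableWindowIndexer(BaseIndexer):
    """Hypothetical indexer: trailing windows whose lengths come from `lengths`."""

    def get_window_bounds(self, num_values=0, min_periods=None,
                          center=None, closed=None, step=None):
        end = np.arange(1, num_values + 1, dtype=np.int64)
        sizes = np.asarray(self.lengths, dtype=np.int64)
        # Added guard: a window longer than the history so far starts at row 0
        start = np.clip(end - sizes, 0, None)
        return start, end


l1 = pd.Series([5, 3, 8, 2, 10, 12, 13, 15, 22, 28])
l2 = [1, 2, 2, 2, 3, 4, 2, 3, 5, 3]

indexer = VariableWindowIndexer(lengths=l2)

# raw=True hands each window to the function as a NumPy array
rolling_max = l1.rolling(indexer).apply(np.max, raw=True)
print(rolling_max.tolist())
```

Built-in aggregations such as mean() are likely to be faster than apply with a Python callable, so apply is mainly useful when no built-in method fits.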
