How to aggregate time series in Python?

Question

I have two different time series with partially overlapping timestamps:

import scikits.timeseries as ts
from datetime import datetime 
a = ts.time_series([1,2,3], dates=[datetime(2010,10,20), datetime(2010,10,21), datetime(2010,10,23)], freq='D')
b = ts.time_series([4,5,6], dates=[datetime(2010,10,20), datetime(2010,10,22), datetime(2010,10,23)], freq='D')

These represent the following data:

Day:   20. 21. 22. 23.
  a:    1   2   -   3
  b:    4   -   5   6

I want to compute a weighted average for each day, with weight 0.3 for a and 0.7 for b, while ignoring missing values:

Day 20.: (0.3 * 1 + 0.7 * 4) / (0.3 + 0.7) = 3.1 / 1.  = 3.1
Day 21.: (0.3 * 2          ) / (0.3      ) = 0.6 / 0.3 = 2
Day 22.: (          0.7 * 5) / (      0.7) = 3.5 / 0.7 = 5
Day 23.: (0.3 * 3 + 0.7 * 6) / (0.3 + 0.7) = 5.1 / 1.  = 5.1

When I first try to align these time series:

a1, b1 = ts.aligned(a, b)

I get correctly masked time series:

timeseries([1 2 -- 3],
  dates = [20-Oct-2010 ... 23-Oct-2010],
  freq  = D)

timeseries([4 -- 5 6],
  dates = [20-Oct-2010 ... 23-Oct-2010],
  freq  = D)

But when I compute a1 * 0.3 + b1 * 0.7, every day on which only one of the two series has a value is dropped:

timeseries([3.1 -- -- 5.1],
   dates = [20-Oct-2010 ... 23-Oct-2010],
   freq  = D)
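
The same propagation of missing values can be reproduced with plain NumPy masked arrays. This is only a minimal sketch: it assumes the aligned series can be stood in for by `numpy.ma` arrays, and the hand-built `a1` and `b1` below are illustrative replacements for the output of `ts.aligned`.

```python
import numpy as np

# Stand-ins for the aligned series (assumption: built by hand, not via ts.aligned).
# Day 22 is missing in a1, day 21 is missing in b1.
a1 = np.ma.array([1.0, 2.0, 0.0, 3.0], mask=[False, False, True, False])
b1 = np.ma.array([4.0, 0.0, 5.0, 6.0], mask=[False, True, False, False])

# Arithmetic on masked arrays masks the result wherever either operand is masked.
naive = a1 * 0.3 + b1 * 0.7

print(naive.mask.tolist())  # [False, True, True, False]
```

Days 21 and 22 end up masked because each has a value in only one of the two operands.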

What do I have to do to get the expected result?

timeseries([3.1 2. 5. 5.1],
   dates = [20-Oct-2010 ... 23-Oct-2010],
   freq  = D)

EDIT: The answer should also apply to more than two initial time series with different weights and different missing values.

So if we have four time series with weights T1 (0.1), T2 (0.2), T3 (0.3) and T4 (0.4), their effective weights at a given timestamp will be:

            |  T1 |  T2 |  T3 |  T4 |
weight      | 0.1 | 0.2 | 0.3 | 0.4 |
-------------------------------------
all present | 10% | 20% | 30% | 40% |
T1 missing  |     | 22% | 33% | 44% |
T1,T2 miss. |     |     | 43% | 57% |
T4 missing  | 17% | 33% | 50% |     |
etc.
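
The rows of this table follow from dividing each present weight by the sum of the present weights. A small sketch of that rule, where `effective_weights` is a hypothetical helper name introduced only for illustration:

```python
def effective_weights(weights, present):
    """Rescale the weights of the series that are present so they sum to 1.

    Series that are missing at this timestamp get None.
    """
    total = sum(w for w, p in zip(weights, present) if p)
    return [w / total if p else None for w, p in zip(weights, present)]

weights = [0.1, 0.2, 0.3, 0.4]

# T1 and T2 missing: only T3 and T4 contribute (roughly 43% and 57%).
print(effective_weights(weights, [False, False, True, True]))
```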

Answer 1

I experimented and came up with this:

import numpy as np

aWgt = 0.3
bWgt = 0.7

# Masked entries contribute 0 to both the weighted sum and the sum of weights.
print((np.where(a1.mask, 0., a1.data * aWgt) +
       np.where(b1.mask, 0., b1.data * bWgt)) / (np.where(a1.mask, 0., aWgt) +
                                                 np.where(b1.mask, 0., bWgt)))

# values: 3.1, 2.0, 5.0, 5.1

This also works for the edited question with more than two initial time series. But hopefully someone will find something better.

EDIT: And here is my function:

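def weightedAvg(weightedTimeseries):
    # weightedTimeseries: sequence of (masked series, weight) pairs on a common date range
    # sumA: weighted sum of the present values; sumB: sum of the weights that are present
    sumA = np.sum([np.where(s.mask, 0., s.data * weight) for s, weight in weightedTimeseries], axis=0)
    sumB = np.sum([np.where(s.mask, 0., weight) for s, weight in weightedTimeseries], axis=0)
    return np.divide(sumA, sumB)

weightedAvg(((a1, 0.3), (b1, 0.7)))
# array([ 3.1,  2. ,  5. ,  5.1])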

It works for any number of time series ;-)
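
As a possible alternative, `numpy.ma.average` accepts a `weights` argument and leaves masked entries out of both the weighted sum and the sum of weights. The sketch below is not part of the original answer. It assumes the aligned series can be treated as `numpy.ma` masked arrays of equal length, and `weighted_avg_ma` is a hypothetical name.

```python
import numpy as np

def weighted_avg_ma(series, weights):
    """Weighted average across series, skipping masked entries per timestamp."""
    stacked = np.ma.vstack(series)  # shape: (n_series, n_timestamps)
    return np.ma.average(stacked, axis=0, weights=weights)

# Hand-built stand-ins for the aligned series a1 and b1 (assumption).
a1 = np.ma.array([1.0, 2.0, 0.0, 3.0], mask=[False, False, True, False])
b1 = np.ma.array([4.0, 0.0, 5.0, 6.0], mask=[False, True, False, False])

result = weighted_avg_ma([a1, b1], [0.3, 0.7])
print(result)  # values: 3.1, 2.0, 5.0, 5.1
```

The weights are given once per series, and the renormalization per timestamp happens inside `np.ma.average`.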

Solution 1
3 Accepted 2010-10-20 13:09:21
