Detrending a time-series of a multi-dimensional array without the for loops

I have a 3D array holding a time series of air-sea carbon flux for each grid point on the earth's surface (model output). I want to remove the linear trend from each time series. I came across this code:

from matplotlib import mlab

# cflux has shape (time, 40, 182); cflux_detrended is a preallocated array of the same shape
for x in range(40):
    for y in range(182):
        cflux_detrended[:, x, y] = mlab.detrend_linear(cflux[:, x, y])

Can I speed this up by not using for loops?

SciPy has a lot of signal processing tools. scipy.signal.detrend() removes a linear trend along one axis of the data. With axis=0, a least-squares line is fitted along the time axis and subtracted separately for the time series at each grid point.

import scipy.signal
cflux_detrended = scipy.signal.detrend(cflux, axis=0)

Using scipy.signal.detrend gives the same result as the method in the original post. Josef's detrend_separate() function also returns the same result.
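
As a quick sanity check of that equivalence, here is a minimal sketch on synthetic data. The array sizes and the random cflux stand-in are made up for illustration, and np.polyfit is used for the per-series reference fit in place of mlab.detrend_linear, so the check only needs NumPy and SciPy.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
nt, nx, ny = 24, 4, 5            # hypothetical small grid; the question uses 40 x 182
t = np.arange(nt)

# synthetic stand-in for cflux: noise plus a different slope at each grid point
slopes = rng.normal(size=(nx, ny))
cflux = rng.normal(size=(nt, nx, ny)) + slopes * t[:, None, None]

# vectorized: one call, a line is fitted along the time axis for every grid point
vectorized = signal.detrend(cflux, axis=0)

# reference: explicit loops, fitting a straight line to each series with np.polyfit
looped = np.empty_like(cflux)
for x in range(nx):
    for y in range(ny):
        coeffs = np.polyfit(t, cflux[:, x, y], 1)
        looped[:, x, y] = cflux[:, x, y] - np.polyval(coeffs, t)

print(np.allclose(vectorized, looped))
```

If the two arrays agree to floating-point tolerance, the single call can stand in for the double loop.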

[Figure: detrended flux]

Here are two versions using numpy.linalg.lstsq. Both use np.vander to build the design matrix, so they can fit a polynomial trend of any order.

Warning: not tested except on the example.

I think something like this will be added to scikits.statsmodels, which does not yet have a multivariate version of detrending either. For the common trend case, we could use scikits.statsmodels OLS, and we would then also get all the result statistics for the estimation.

# -*- coding: utf-8 -*-
"""Detrending multivariate array

Created on Fri Dec 02 15:08:42 2011

Author: Josef Perktold

http://stackoverflow.com/questions/8355197/detrending-a-time-series-of-a-multi-dimensional-array-without-the-for-loops

I should also add the multivariate version to statsmodels

"""

import numpy as np

import matplotlib.pyplot as plt


def detrend_common(y, order=1):
    '''detrend multivariate series by common trend

    Parameters
    ----------
    y : ndarray
       data, can be 1d or nd. if ndim is greater than 1, then observations
       are along zero axis
    order : int
       degree of polynomial trend, 1 is linear, 0 is constant

    Returns
    -------
    y_detrended : ndarray
       detrended data in same shape as original
    params : ndarray
       estimated coefficients of the trend polynomial, highest power first

    '''
    nobs = y.shape[0]
    shape = y.shape
    # flatten in C order: all series values at t=0 first, then t=1, ...
    y_ = y.ravel()
    nobs_ = len(y_)
    # time index repeated once per series; the repeat count must be an integer
    t = np.repeat(np.arange(nobs), nobs_ // nobs)
    exog = np.vander(t, order+1)
    # rcond=None uses NumPy's machine-precision based cutoff
    params = np.linalg.lstsq(exog, y_, rcond=None)[0]
    fittedvalues = np.dot(exog, params)
    resid = (y_ - fittedvalues).reshape(*shape)
    return resid, params

def detrend_separate(y, order=1):
    '''detrend multivariate series by series specific trends

    Parameters
    ----------
    y : ndarray
       data, can be 1d or nd. if ndim is greater than 1, then observations
       are along zero axis
    order : int
       degree of polynomial trend, 1 is linear, 0 is constant

    Returns
    -------
    y_detrended : ndarray
       detrended data in same shape as original
    params : ndarray
       estimated trend coefficients, one column per series

    '''
    nobs = y.shape[0]
    shape = y.shape
    # one column per series, observations along the rows
    y_ = y.reshape(nobs, -1)
    t = np.arange(nobs)
    exog = np.vander(t, order+1)
    # lstsq solves for all columns of y_ at once, one trend per series
    params = np.linalg.lstsq(exog, y_, rcond=None)[0]
    fittedvalues = np.dot(exog, params)
    resid = (y_ - fittedvalues).reshape(*shape)
    return resid, params

nobs = 30
sig = 0.5
y0 = sig * np.random.randn(nobs, 4, 3)
t = np.arange(nobs)
# add the same linear trend to every series
y_observed = y0 + t[:, None, None]

for detrend_func, name in zip([detrend_common, detrend_separate],
                              ['common', 'separate']):
    y_detrended, params = detrend_func(y_observed, order=1)
    print('\n\n', name)
    print('params for detrending')
    print(params)
    print('std of detrended', y_detrended.std())  # should be roughly sig=0.5 (std of y0)
    print('maxabs', np.max(np.abs(y_detrended - y0)))

    print('observed')
    print(y_observed[-1])
    print('detrended')
    print(y_detrended[-1])
    print('original "true"')
    print(y0[-1])

    plt.figure()
    for i in range(4):
        for j in range(3):
            plt.plot(y0[:, i, j], 'bo', alpha=0.75)
            plt.plot(y_detrended[:, i, j], 'ro', alpha=0.75)
    plt.title(name + ' detrending: blue - original, red - detrended')


plt.show()
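
To make the design matrix used by both functions above more concrete, here is a small sketch of what np.vander produces and how a single lstsq call fits several series at once. The two toy series are invented for illustration and are exact polynomials, so the fit should recover their coefficients.

```python
import numpy as np

t = np.arange(4)
# order=2 means order + 1 = 3 columns: t**2, t, 1 (highest power first)
exog = np.vander(t, 3)
print(exog)

# two toy series stacked as columns: 2*t + 1 and 0.5*t**2 - t
y = np.column_stack([2.0 * t + 1.0, 0.5 * t**2 - t])

# one call estimates a separate coefficient column for each series
params = np.linalg.lstsq(exog, y, rcond=None)[0]
print(params.shape)

# residuals after removing the fitted polynomial trend
resid = y - exog @ params
print(np.allclose(resid, 0.0))
```

Each column of params holds the coefficients for one series, which is why detrend_separate only needs to reshape the data to 2-D before calling lstsq.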

Since Nicholas pointed out scipy.signal.detrend: my detrend_separate is basically the same as scipy.signal.detrend, but with fewer options (no axis or breakpoints) and one different option (polynomial order).

>>> from scipy import signal
>>> res = signal.detrend(y_observed, axis=0)
>>> (res - y0).var()
0.016931858083279336
>>> (y_detrended - y0).var()
0.01693185808327945
>>> (res - y_detrended).var()
8.402584948582852e-30
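
For reference, here is a rough sketch of the two scipy.signal.detrend options that the lstsq versions do not cover: type='constant' and bp (breakpoints). The data is synthetic and the breakpoint position is an arbitrary choice for illustration.

```python
import numpy as np
from scipy import signal

nobs = 30
t = np.arange(nobs, dtype=float)

# type='constant' only removes the mean along the chosen axis
rng = np.random.default_rng(1)
y = rng.normal(size=(nobs, 4, 3)) + 5.0
demeaned = signal.detrend(y, axis=0, type='constant')
print(np.allclose(demeaned, y - y.mean(axis=0)))

# bp gives a piecewise-linear fit; this toy series changes slope at index 15
kink = 15
series = np.where(t < kink, t, kink - 2.0 * (t - kink))
one_line = signal.detrend(series)             # a single line cannot follow the kink
two_lines = signal.detrend(series, bp=kink)   # separate line on each side of the breakpoint
print(np.abs(one_line).max(), np.abs(two_lines).max())
```

With the breakpoint placed at the kink, each segment is exactly linear, so the piecewise residual should be close to zero while the single-line residual is not.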

I think the good old list comprehension is the easiest:

# iterating over cflux.T yields (y, x, time); the trailing .T restores (time, x, y)
cflux_detrended = np.array([[mlab.detrend_linear(t) for t in kk] for kk in cflux.T]).T
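
One thing to watch with the comprehension is axis order. The sketch below uses a small random array as a stand-in for cflux and scipy.signal.detrend on each 1-D series in place of mlab.detrend_linear (an assumption made so that the example only needs NumPy and SciPy). It shows that looping over cflux.T builds the result with reversed axes, and that transposing brings it back to the (time, x, y) layout.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(2)
cflux = rng.normal(size=(10, 4, 6))   # (time, x, y), hypothetical small sizes

# cflux.T has shape (y, x, time): kk is one (x, time) slab, s is one time series
nested = np.array([[signal.detrend(s) for s in kk] for kk in cflux.T])
print(nested.shape)      # axes are reversed relative to cflux

restored = nested.T      # back to (time, x, y)
print(restored.shape)
print(np.allclose(restored, signal.detrend(cflux, axis=0)))
```

The comprehension still calls the detrending function once per grid point, so it is mainly a more compact way of writing the loops rather than a vectorized replacement.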
