Detrending a time-series of a multi-dimensional array without the for loops
I have a 3D array which holds a time series of air-sea carbon flux for each grid point on the earth's surface (model output). I want to remove the linear trend from the time series at each grid point. I came across this code:
from matplotlib import mlab
for x in range(40):
    for y in range(182):
        cflux_detrended[:, x, y] = mlab.detrend_linear(cflux[:, x, y])
Can I speed this up by not using for loops?
Scipy has a lot of signal processing tools. Using scipy.signal.detrend() will remove the linear trend along an axis of the data. With axis=0, a separate least-squares line is fitted to, and subtracted from, the time series at each grid point.
import scipy.signal
cflux_detrended = scipy.signal.detrend(cflux, axis=0)
Using scipy.signal.detrend gives the same result as the method in the original post. Josef's detrend_separate() function also returns the same result.
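As a quick sanity check of that equivalence, here is a minimal sketch on synthetic data. The array sizes and the names `fast` and `slow` are made up for illustration, and the loop uses np.polyfit as a stand-in for mlab.detrend_linear (both subtract a least-squares straight line):

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
ntime, nx, ny = 24, 4, 5            # small stand-in for a (time, 40, 182) array
t = np.arange(ntime)

# synthetic data: a different linear trend at each grid point, plus noise
slopes = rng.normal(size=(nx, ny))
cflux = slopes * t[:, None, None] + rng.normal(scale=0.1, size=(ntime, nx, ny))

# vectorized: one call, a line is fitted separately along axis 0
fast = signal.detrend(cflux, axis=0)

# reference: explicit loop, fitting a straight line per grid point
slow = np.empty_like(cflux)
for x in range(nx):
    for y in range(ny):
        coeffs = np.polyfit(t, cflux[:, x, y], 1)
        slow[:, x, y] = cflux[:, x, y] - np.polyval(coeffs, t)

max_diff = np.max(np.abs(fast - slow))
print("largest difference:", max_diff)
```

The two results should agree to within floating-point rounding, so the loop can simply be dropped.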
Here are two versions using numpy.linalg.lstsq. They use np.vander to build the design matrix, so a polynomial trend of any order can be removed.
Warning: not tested except on the example.
I think something like this will be added to scikits.statsmodels, which does not yet have a multivariate version for detrending either. For the common trend case, we could use scikits.statsmodels OLS and we would also get all the result statistics for the estimation.
# -*- coding: utf-8 -*-
"""Detrending multivariate array
Created on Fri Dec 02 15:08:42 2011
Author: Josef Perktold
http://stackoverflow.com/questions/8355197/detrending-a-time-series-of-a-multi-dimensional-array-without-the-for-loops
I should also add the multivariate version to statsmodels
"""
import numpy as np
import matplotlib.pyplot as plt
def detrend_common(y, order=1):
    '''detrend multivariate series by common trend

    Parameters
    ----------
    y : ndarray
        data, can be 1d or nd. if ndim is greater than 1, then observations
        are along zero axis
    order : int
        degree of polynomial trend, 1 is linear, 0 is constant

    Returns
    -------
    y_detrended : ndarray
        detrended data in same shape as original
    params : ndarray
        estimated coefficients of the common polynomial trend

    '''
    nobs = y.shape[0]
    shape = y.shape
    y_ = y.ravel()
    nobs_ = len(y_)
    # time index for the flattened data: each t is repeated once per series
    t = np.repeat(np.arange(nobs), nobs_ // nobs)
    exog = np.vander(t, order+1)
    params = np.linalg.lstsq(exog, y_, rcond=None)[0]
    fittedvalues = np.dot(exog, params)
    resid = (y_ - fittedvalues).reshape(*shape)
    return resid, params

def detrend_separate(y, order=1):
    '''detrend multivariate series by series specific trends

    Parameters
    ----------
    y : ndarray
        data, can be 1d or nd. if ndim is greater than 1, then observations
        are along zero axis
    order : int
        degree of polynomial trend, 1 is linear, 0 is constant

    Returns
    -------
    y_detrended : ndarray
        detrended data in same shape as original
    params : ndarray
        estimated trend coefficients, one column per series

    '''
    nobs = y.shape[0]
    shape = y.shape
    # one column per series, so lstsq fits all trends in a single call
    y_ = y.reshape(nobs, -1)
    t = np.arange(nobs)
    exog = np.vander(t, order+1)
    params = np.linalg.lstsq(exog, y_, rcond=None)[0]
    fittedvalues = np.dot(exog, params)
    resid = (y_ - fittedvalues).reshape(*shape)
    return resid, params

nobs = 30
y0 = 0.5 * np.random.randn(nobs,4,3)
t = np.arange(nobs)
y_observed = y0 + t[:,None,None]

for detrend_func, name in zip([detrend_common, detrend_separate],
                              ['common', 'separate']):
    y_detrended, params = detrend_func(y_observed, order=1)
    print('\n\n' + name)
    print('params for detrending')
    print(params)
    print('std of detrended', y_detrended.std())  # should be roughly 0.5 (std of y0)
    print('maxabs', np.max(np.abs(y_detrended - y0)))
    print('observed')
    print(y_observed[-1])
    print('detrended')
    print(y_detrended[-1])
    print('original "true"')
    print(y0[-1])
    plt.figure()
    for i in range(4):
        for j in range(3):
            plt.plot(y0[:,i,j], 'bo', alpha=0.75)
            plt.plot(y_detrended[:,i,j], 'ro', alpha=0.75)
    plt.title(name + ' detrending: blue - original, red - detrended')

plt.show()
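The step that removes the need for loops in detrend_separate is that np.linalg.lstsq accepts a 2-D right-hand side and solves one regression per column. A minimal sketch of just that step, with made-up slopes and intercepts (`true_params` is purely illustrative):

```python
import numpy as np

nobs = 6
t = np.arange(nobs)

# design matrix for a linear trend: columns are [t, 1]
exog = np.vander(t, 2)

# three noise-free series stacked as columns, each with its own line
true_params = np.array([[1.0, -2.0, 0.5],    # slopes
                        [3.0,  0.0, 1.0]])   # intercepts
y = exog @ true_params

# a single lstsq call solves all three regressions at once
params = np.linalg.lstsq(exog, y, rcond=None)[0]
resid = y - exog @ params
print(params)
```

Each column of `params` holds the coefficients for the matching column of `y`, highest power first, which is the ordering np.vander uses by default.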
Since Nicholas pointed out scipy.signal.detrend: my detrend_separate is basically the same as scipy.signal.detrend, with fewer options (no axis or breakpoints) and one different option (polynomial order).
>>> from scipy import signal
>>> res = signal.detrend(y_observed, axis=0)
>>> (res - y0).var()
0.016931858083279336
>>> (y_detrended - y0).var()
0.01693185808327945
>>> (res - y_detrended).var()
8.402584948582852e-30
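For reference, the breakpoint option mentioned above is the `bp` argument of scipy.signal.detrend, which fits a separate line on each segment between breakpoints. A small sketch on a made-up series with a kink at index 50 (the series and the names `single` and `piecewise` are illustrative only):

```python
import numpy as np
from scipy import signal

n = 100
t = np.arange(n, dtype=float)
# a series that rises for the first half, then falls
y = np.where(t < 50, 2.0 * t, 100.0 - (t - 50.0))

# a single straight line cannot remove the kink
single = signal.detrend(y)
# bp=[50] fits one line to y[:50] and another to y[50:]
piecewise = signal.detrend(y, bp=[50])

print(np.abs(single).max(), np.abs(piecewise).max())
```

With the breakpoint the residual is zero up to rounding, while the single-line fit leaves a large leftover signal.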
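I think a plain old list comprehension is the simplest (the trailing .T restores the original (time, x, y) axis order):
cflux_detrended = np.array([[mlab.detrend_linear(t) for t in kk] for kk in cflux.T]).T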