
Calculating correlation of different time series

I have several time series, i.e. I have measured a couple of signals over 15 min. Each signal is sampled several times per second, but the timestamps of the different signals do not coincide. Let's say we start at time 0 s. For example, signal one has the following (timestamp, value) pairs:

0.1s: 954
0.2s: 1000
0.24s: 1090
0.3s: 855
0.45s: 600
... 

Signal two has the following (timestamp, value) pairs:

0.05s: 900
0.13s: 960
0.2s: 1000
0.29s: 850 
0.33s: 800
...

How can I now calculate the correlation of the values of these time series in, e.g., Python or MATLAB? If the values were always at the same timestamps, I could simply calculate the correlation between the individual values, but unfortunately they are not.

Let's say you have a signal with values in an array s1 at time points t1, and a signal s2 evaluated at time points t2. With NumPy in Python:

  1. Select a common set of time points t for both signals. You can pick t1 or t2, or compute a linearly spaced grid over the considered time range with np.linspace. In any case, I'd make sure that the minimum and maximum values of t are within the range of both t1 and t2 to avoid extrapolation.
  2. Compute interpolations of both signals, s1interp and s2interp. This can be done with np.interp, which computes linear interpolations. If you need more sophisticated interpolation methods, you can take a look at SciPy's interp1d.
  3. Compute the correlation between s1interp and s2interp. This is done with np.corrcoef.
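
The three steps above could be sketched roughly as follows, using the sample values from the question. The grid size of 50 points and the variable names t_start and t_end are arbitrary choices for illustration, not something the answer prescribes; the grid is limited to the overlap of both time ranges so that nothing is extrapolated.

```python
import numpy as np

# Sample data from the question (timestamps in seconds, then values).
t1 = np.array([0.10, 0.20, 0.24, 0.30, 0.45])
s1 = np.array([954.0, 1000.0, 1090.0, 855.0, 600.0])
t2 = np.array([0.05, 0.13, 0.20, 0.29, 0.33])
s2 = np.array([900.0, 960.0, 1000.0, 850.0, 800.0])

# Step 1: common time grid restricted to the overlap of both ranges,
# so that no extrapolation is needed.
t_start = max(t1[0], t2[0])
t_end = min(t1[-1], t2[-1])
t = np.linspace(t_start, t_end, 50)  # 50 points is an arbitrary choice

# Step 2: linear interpolation of both signals onto the common grid.
s1interp = np.interp(t, t1, s1)
s2interp = np.interp(t, t2, s2)

# Step 3: Pearson correlation coefficient; np.corrcoef returns a 2x2
# matrix, and the off-diagonal entry is the correlation of the two inputs.
r = np.corrcoef(s1interp, s2interp)[0, 1]
print(r)
```

Note that the result depends on the grid you choose: a denser grid weights the linearly interpolated segments more heavily than the original samples.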

You could do some simple interpolation (see interp1 for MATLAB) on one of the data sets so that both share the same sample times, if that's your only issue...

X =[0.1   954
    0.2   1000
    0.24  1090
    0.3   855
    0.45  600];

Y =[0.05  900
    0.13  960
    0.2   1000
    0.29  850 
    0.33  800];

t = Y(:,1); % get time samples from Y
% Interpolate (linearly, with extrapolation) the X values onto time samples t
X2 = [t, interp1(X(:,1), X(:,2), t, 'linear', 'extrap')];

>> X2 = [0.05  931
         0.13  967.8
         0.2   1000
         0.29  894.1667
         0.33  804];

Now that they have the same sample points, you can do what you like, e.g. compute the correlation between X2(:,2) and Y(:,2) with corrcoef.
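
If you follow the same idea in NumPy, keep in mind that np.interp does not extrapolate: for points outside the sampled range it returns the first or last value instead. The sketch below is one possible way to handle this, under the assumption that dropping out-of-range samples is acceptable: it keeps only the time samples of Y that lie inside the range of X, interpolates X onto them, and then computes the correlation. The names inside, x_on_t and y_on_t are made up for this example.

```python
import numpy as np

# Same data as in the MATLAB example: column 0 is time, column 1 is value.
X = np.array([[0.10, 954.0], [0.20, 1000.0], [0.24, 1090.0],
              [0.30, 855.0], [0.45, 600.0]])
Y = np.array([[0.05, 900.0], [0.13, 960.0], [0.20, 1000.0],
              [0.29, 850.0], [0.33, 800.0]])

t = Y[:, 0]  # time samples of Y

# Keep only the time samples inside X's time range, so that np.interp
# never has to go beyond the measured data (here this drops t = 0.05).
inside = (t >= X[0, 0]) & (t <= X[-1, 0])

# Linearly interpolate the X values onto the remaining time samples.
x_on_t = np.interp(t[inside], X[:, 0], X[:, 1])
y_on_t = Y[inside, 1]

# Pearson correlation of the two aligned series.
r = np.corrcoef(x_on_t, y_on_t)[0, 1]
print(x_on_t)
print(r)
```

The interpolated values agree with the in-range rows of X2 above; only the extrapolated first row is left out.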


 