Correlation of Two Variables in a Time Series in Python?

If I have two different data sets that form a time series, is there a simple way to find the correlation between the two sets in Python?

For example with:

# [ (dateTimeObject, y, z) ... ]
x = [ (8:00am, 12, 8), (8:10am, 15, 10) .... ]

How might I get the correlation of y and z in Python?

A little slow on the uptake here, but pandas (http://github.com/wesm/pandas and pandas.sourceforge.net) is probably your best bet. I'm biased because I wrote it, but:

In [7]: ts1
Out[7]: 
2000-01-03 00:00:00    -0.945653010936
2000-01-04 00:00:00    0.759529904445
2000-01-05 00:00:00    0.177646448683
2000-01-06 00:00:00    0.579750822716
2000-01-07 00:00:00    -0.0752734982291
2000-01-10 00:00:00    0.138730447557
2000-01-11 00:00:00    -0.506961851495

In [8]: ts2
Out[8]: 
2000-01-03 00:00:00    1.10436688823
2000-01-04 00:00:00    0.110075215713
2000-01-05 00:00:00    -0.372818939799
2000-01-06 00:00:00    -0.520443811368
2000-01-07 00:00:00    -0.455928700936
2000-01-10 00:00:00    1.49624355051
2000-01-11 00:00:00    -0.204383054598

In [9]: ts1.corr(ts2)
Out[9]: -0.34768587480980645

Notably, if your data cover different sets of dates, the two series are aligned on their index and the correlation is computed over the dates they have in common. NaN values are excluded automatically!
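The alignment behaviour can be checked with a small made-up example. The series and their values below are invented for illustration, and the sketch assumes a reasonably recent pandas in which Series.corr aligns both inputs on their shared index labels and drops NaN pairs before computing the Pearson coefficient.

```python
import numpy as np
import pandas as pd

# Two made-up daily series covering different, partly overlapping date ranges.
ts1 = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
                index=pd.date_range("2000-01-03", periods=7, freq="D"))
ts2 = pd.Series([6.0, np.nan, 10.0, 12.0, 14.0, -3.0],
                index=pd.date_range("2000-01-05", periods=6, freq="D"))

# Only dates present in both series, with non-NaN values, contribute.
r = ts1.corr(ts2)

# The same number, computed by aligning explicitly and dropping NaN rows.
both = pd.concat([ts1.rename("y"), ts2.rename("z")], axis=1, join="inner").dropna()
r_manual = both["y"].corr(both["z"])
print(len(both), r, r_manual)
```

Here the four surviving pairs satisfy z = 2 * y, so both results should be 1 up to floating-point error; the -3.0 on the last date of ts2 has no counterpart in ts1 and is ignored.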

SciPy has a statistics module with a correlation function, pearsonr, which returns the Pearson correlation coefficient together with a p-value.

from scipy import stats
# Y and Z are equal-length NumPy arrays or lists of values
r, p_value = stats.pearsonr(Y, Z)
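Applied to the question's layout of (timestamp, y, z) tuples, this might look like the following sketch. The timestamps and the third row are made up for illustration; pearsonr ignores the time axis entirely and simply pairs values by position.

```python
from datetime import datetime
from scipy import stats

# Hypothetical samples in the question's (timestamp, y, z) layout.
x = [
    (datetime(2011, 1, 1, 8, 0), 12, 8),
    (datetime(2011, 1, 1, 8, 10), 15, 10),
    (datetime(2011, 1, 1, 8, 20), 10, 6),
]

# Split the tuples into one sequence per variable.
_, y, z = zip(*x)

# pearsonr returns the correlation coefficient and a two-sided p-value.
r, p_value = stats.pearsonr(y, z)
print(r, p_value)
```

With only three samples the p-value says very little, so treat it as a placeholder until real data are plugged in.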

You can do that via the covariance matrix or the correlation coefficients. The relevant functions are documented at http://docs.scipy.org/doc/numpy/reference/generated/numpy.cov.html (numpy.cov) and http://docs.scipy.org/doc/numpy/reference/generated/numpy.corrcoef.html (numpy.corrcoef). The former page also includes a usage sample, and corrcoef is used in much the same way.

>>> import numpy
>>> x = [ (None, 12, 8), (None, 15, 10), (None, 10, 6) ]
>>> data = numpy.array([[e[1] for e in x], [e[2] for e in x]])
>>> numpy.corrcoef(data)
array([[ 1.        ,  0.99339927],
       [ 0.99339927,  1.        ]])
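The covariance-matrix route mentioned above can be sketched in the same way: dividing the covariance of y and z by the product of their standard deviations should give the same coefficient that corrcoef reports. The variable names here are illustrative.

```python
import numpy as np

x = [(None, 12, 8), (None, 15, 10), (None, 10, 6)]
y = np.array([e[1] for e in x], dtype=float)
z = np.array([e[2] for e in x], dtype=float)

# np.cov treats each row as a variable and returns a 2x2 covariance matrix.
cov = np.cov(np.vstack([y, z]))

# Pearson r = cov(y, z) / (std(y) * std(z)); the diagonal holds the variances.
r_from_cov = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

# Cross-check against corrcoef.
r_from_corrcoef = np.corrcoef(y, z)[0, 1]
print(r_from_cov, r_from_corrcoef)
```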

Use numpy:

import numpy as np
v = [ ('k', 1, 2), ('l', 2, 4), ('m', 13, 9) ]
# corrcoef returns a 2x2 matrix; entry [0, 1] is the correlation between the two columns
np.corrcoef([ a[1] for a in v ], [ a[2] for a in v ])[0, 1]
