
Analyzing unevenly spaced timeseries

I have been tasked with analyzing the input flow of a water tank in relation to a number of weather parameters. More specifically, I have to investigate any possible effect these variables might have on the variable of interest. I don't know which method(s) to apply; so far I can only think of Pearson's correlation coefficient. Even for that, the sampling rates differ: the weather conditions are measured every 3 hours, while the input flow is measured every 5 minutes. Should I average the flow over 3 hours, discard the flow samples that do not coincide with a weather timestamp, or would you suggest something else?

weather = [ (1.21,0), (1.08, 0.5), (1.04, 1), (1.02, 1.5)]
input_flow = [ (120,0), (124,1)]

This is a representation of such data, where the first element of each pair is the value of the parameter and the second is the time in seconds.

One way to achieve this is to repeat each sample of the coarser series until both series have the same length:

import numpy as np

a = np.arange(100).reshape(-1,1)
b = np.arange(10).reshape(-1,1)

# Expand b so that it has as many rows as a.
# np.repeat needs an integer count, so use floor division
# (this assumes len(a) is an exact multiple of len(b)).
expansion_factor = a.shape[0] // b.shape[0]
b_expanded = np.repeat(b, expansion_factor, axis=0)

# Combine a and the expanded b column-wise.
c = np.concatenate((a, b_expanded), axis=1)

# c has shape (100, 2): each row pairs a sample of a with the matching value of b.
print(c)
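The question also mentions the opposite direction: averaging the 5-minute flow over each 3-hour window and then computing Pearson's coefficient. Below is a minimal sketch of that idea, not a definitive analysis. The data is synthetic, the helper name `average_per_window` is hypothetical, and it assumes that each weather reading is paired with the flow recorded in the 3 hours that follow it and that no window is empty.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 24 hours of samples,
# flow every 5 minutes and weather every 3 hours. Times are in seconds.
rng = np.random.default_rng(0)
flow_t = np.arange(0, 24 * 3600, 5 * 60)
weather_t = np.arange(0, 24 * 3600, 3 * 3600)
weather_v = rng.normal(1.0, 0.1, size=weather_t.size)
# Flow that depends on the most recent weather reading, plus noise.
flow_v = 100 + 20 * np.repeat(weather_v, 36) + rng.normal(0, 1, size=flow_t.size)


def average_per_window(t, v, window):
    """Mean of v over consecutive windows of length `window` (same unit as t).

    Assumes every window contains at least one sample.
    """
    bins = (t // window).astype(int)
    sums = np.bincount(bins, weights=v)
    counts = np.bincount(bins)
    return sums / counts


# Downsample the flow to one mean value per 3-hour window.
flow_3h = average_per_window(flow_t, flow_v, 3 * 3600)

# Pearson correlation between the weather readings and the 3-hour flow means.
r = np.corrcoef(weather_v, flow_3h)[0, 1]
print(f"Pearson r = {r:.3f}")
```

Averaging keeps all of the flow measurements, whereas repeating the weather values inflates the sample count without adding information. If the effect of the weather is expected to be delayed, shifting one series by a few windows before calling `np.corrcoef` would be a natural next experiment.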

Another option is to use sparse matrices.

