Averaging values with irregular time intervals
I have several pairs of arrays: measurements, and the times at which those measurements were taken. I want to average the measurements across the pairs. Unfortunately, the times at which the measurements were taken are not regular, and they are not the same for each pair.
My idea for averaging them is to create a new array for each pair holding the value at each whole second, then average those arrays. It works, but it seems a bit clumsy and means I have to create many unnecessarily long arrays.
Example inputs
m1 = [0.4, 0.6, 0.2]
t1 = [0.0, 2.4, 5.2]
m2 = [1.0, 1.4, 1.0]
t2 = [0.0, 3.6, 4.8]
Generated regular arrays, with one value per second
r1 = [0.4, 0.4, 0.4, 0.6, 0.6, 0.6, 0.2]
r2 = [1.0, 1.0, 1.0, 1.0, 1.4, 1.0]
Average values, up to the length of the shortest array
a = [0.7, 0.7, 0.7, 0.8, 1.0, 0.8]
My attempt, given a list of measurement arrays (measurements) and the corresponding list of time arrays (times):
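import numpy as np

def granulate(values, times):
    count = 0
    regular_values = []
    for index, x in enumerate(times):
        # Until the measurement taken at time x is reached, repeat the previous one
        while count < x:
            regular_values.append(values[max(index - 1, 0)])
            count += 1
    # The last measurement takes effect at the first whole second at or after its time
    regular_values.append(values[-1])
    return np.array(regular_values)

processed_measurements = [granulate(m, t) for m, t in zip(measurements, times)]
# Truncate every series to the shortest one so they can be stacked and averaged
min_length = min(len(m) for m in processed_measurements)
processed_measurements = [m[:min_length] for m in processed_measurements]
average_measurement = np.mean(processed_measurements, axis=0)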
Is there a better way to do it, ideally using numpy functions?
This averages using the measurement closest to each second:
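import numpy as np
m1, t1, m2, t2 = map(np.asarray, (m1, t1, m2, t2))  # lists -> arrays, needed for the indexing below
time_series = np.arange(np.stack((t1, t2)).max())
np.mean([m1[abs(t1-time_series[:,None]).argmin(axis=1)], m2[abs(t2-time_series[:,None]).argmin(axis=1)]], axis=0)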
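To see what nearest-second matching produces on the example inputs from the question, here is a minimal sketch that wraps the same idea in a function taking any number of series. The name nearest_second_average is hypothetical (it is not from the answer above), and the sketch assumes every series has at least one sample.

```python
import numpy as np

def nearest_second_average(measurements, times):
    """Average several series after matching each whole second to its nearest sample.

    Hypothetical helper for illustration; not part of the original answer.
    """
    measurements = [np.asarray(m, dtype=float) for m in measurements]
    times = [np.asarray(t, dtype=float) for t in times]
    # One grid point per whole second, up to the latest measurement time.
    seconds = np.arange(max(t.max() for t in times))
    resampled = []
    for m, t in zip(measurements, times):
        # Row i holds |t - seconds[i]|; argmin picks the closest sample for that second.
        nearest = np.abs(t - seconds[:, None]).argmin(axis=1)
        resampled.append(m[nearest])
    return np.mean(resampled, axis=0)

m1, t1 = [0.4, 0.6, 0.2], [0.0, 2.4, 5.2]
m2, t2 = [1.0, 1.4, 1.0], [0.0, 3.6, 4.8]
nearest_avg = nearest_second_average([m1, m2], [t1, t2])
print(nearest_avg)
```

On these inputs the result is approximately 0.7, 0.7, 1.0, 1.0, 0.8, 0.6, which differs from the array a in the question. The reason is that nearest matching can use a sample before it was taken: second 2 already picks up the measurement from 2.4 s.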
If you want to floor the times to each second (with the possibility of generalizing to more arrays):
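m = [np.asarray(m1), np.asarray(m2)]
t = [np.asarray(t1), np.asarray(t2)]
m_t = []
time_series = np.arange(np.stack(t).max())
for i in range(len(t)):
    # Rows: measurements, columns: seconds; positive entries are measurements already taken
    time_diff = time_series - t[i][:, None]
    # For each second, pick the most recent earlier measurement (falls back to the first one)
    m_t.append(m[i][np.where(time_diff > 0, time_diff, np.inf).argmin(axis=0)])
average = np.mean(m_t, axis=0)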
Output:
[0.7 0.7 0.7 0.8 1. 0.8]
You can do this (a somewhat more numpy-ish solution):
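import numpy as np

# oddly enough - numpy doesn't have its own ffill function:
def np_ffill(arr):
    # Each position stores its own index, or 0 where the value is missing
    mask = np.arange(len(arr))
    mask[np.isnan(arr)] = 0
    # A running maximum turns that into the index of the last non-NaN value so far
    np.maximum.accumulate(mask, axis=0, out=mask)
    return arr[mask]

# Each measurement lands on the first whole second at or after its time
t1 = np.ceil(t1).astype("int")
t2 = np.ceil(t2).astype("int")
r1 = np.empty(max(t1) + 1)
r2 = np.empty(max(t2) + 1)
r1[:] = np.nan
r2[:] = np.nan
r1[t1] = m1
r2[t2] = m2
r1 = np_ffill(r1)
r2 = np_ffill(r2)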
>>> print(r1,r2)
[0.4 0.4 0.4 0.6 0.6 0.6 0.2] [1. 1. 1. 1. 1.4 1. ]
# To get the average:
r3=np.vstack([r1[:len(r2)],r2[:len(r1)]]).mean(axis=0)
>>> print(r3)
[0.7 0.7 0.7 0.8 1. 0.8]
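The same forward fill can probably be done without building the NaN-padded arrays, by asking np.searchsorted for the latest sample available at each whole second. This is only a sketch under some assumptions: the times in each series are sorted in ascending order, every series has at least one sample, and the function name ffill_average is hypothetical.

```python
import numpy as np

def ffill_average(measurements, times):
    """Average several series, carrying each sample forward to later whole seconds.

    Hypothetical helper for illustration; assumes each times array is sorted ascending.
    """
    measurements = [np.asarray(m, dtype=float) for m in measurements]
    times = [np.asarray(t, dtype=float) for t in times]
    # Stop at the shortest series: whole seconds 0 .. ceil(earliest final time).
    n_seconds = int(min(np.ceil(t[-1]) for t in times)) + 1
    seconds = np.arange(n_seconds)
    resampled = []
    for m, t in zip(measurements, times):
        # side="right" counts the samples taken at or before each second;
        # subtracting one gives the index of the latest such sample.
        latest = np.searchsorted(t, seconds, side="right") - 1
        # Seconds before the first sample fall back to the first sample.
        resampled.append(m[np.clip(latest, 0, len(m) - 1)])
    return np.mean(resampled, axis=0)

m1, t1 = [0.4, 0.6, 0.2], [0.0, 2.4, 5.2]
m2, t2 = [1.0, 1.4, 1.0], [0.0, 3.6, 4.8]
ffill_avg = ffill_average([m1, m2], [t1, t2])
print(ffill_avg)
```

For the example inputs this gives approximately 0.7, 0.7, 0.7, 0.8, 1.0, 0.8, the same values as the array a in the question. Memory use scales with the number of seconds rather than with seconds times samples, which may matter for long series.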
I see two possible solutions: