How do I merge two time series with different time values in Python?
I have:
highhz = [(0,1),(2,2),(4,4),(5,5),(6,6),(7,7),(8,8)]
lowhz= [(1.5,1.5),(5.6,5.6)]
I want:
alldata = [(0,1,1.5),
(2,2,NaN),
(4,4,NaN),
(5,5,5.6),
(6,6,NaN),
(7,7,NaN),
(8,8,NaN)]
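That is, append the values from the second, lower-frequency source to the rows of the higher-frequency source, producing a combined table that uses the high-frequency source's time values and has NaN wherever there is no low-frequency data.
Any ideas how to do this in Python? In C I'd use two moving pointers, and in Lisp I'd use recursion, but even if I can hack those algorithms into Python, they aren't idiomatic.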
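For reference, the two-pointer idea mentioned above might be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `merge_series` is a hypothetical name, both lists are assumed to be sorted by time, a low-frequency sample is attached to the latest high-frequency time at or before it, and only the first low-frequency sample in each interval is kept.

```python
import math

def merge_series(high, low):
    """Attach each low-frequency value to the high-frequency row whose
    interval [t, next_t) contains its time; pad with NaN otherwise."""
    rows = []
    j = 0  # pointer into `low`; the loop variable i is the pointer into `high`
    for i, (t, v) in enumerate(high):
        # Upper bound of the current interval (infinite for the last row).
        next_t = high[i + 1][0] if i + 1 < len(high) else math.inf
        # Skip low samples that fall before this interval (assumption:
        # extra or unmatched samples are simply dropped).
        while j < len(low) and low[j][0] < t:
            j += 1
        if j < len(low) and low[j][0] < next_t:
            rows.append((t, v, low[j][1]))
            j += 1
        else:
            rows.append((t, v, float('nan')))
    return rows

highhz = [(0, 1), (2, 2), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8)]
lowhz = [(1.5, 1.5), (5.6, 5.6)]
print(merge_series(highhz, lowhz))
```

Each list is traversed once, so the merge is linear in the combined length of the two inputs.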
Here's one way to do it using collections.OrderedDict and bisect.bisect_left:
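from collections import OrderedDict
from bisect import bisect_left
from pprint import pprint

# Map each high-frequency time to a list that starts with its own value.
dct = OrderedDict()
for t, v in highhz:
    dct.setdefault(t, []).append(v)

times = list(dct)  # high-frequency times, in their original (sorted) order
for t, v in lowhz:
    # Index of the high-frequency time just before t.
    ind = bisect_left(times, t) - 1
    dct[times[ind]].append(v)
#----
# Rows that received no low-frequency value get a NaN.
for k, v in dct.items():
    if len(v) == 1:
        v.append(float('nan'))
#----
print([[k] + v for k, v in dct.items()])
#[[0, 1, 1.5], [2, 2, nan], [4, 4, nan], [5, 5, 5.6], [6, 6, nan], [7, 7, nan], [8, 8, nan]]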
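A slightly modified version of the code above pads every row to the same length with NaN when more than one low-frequency item falls between two high-frequency times: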
highhz = [(0, 1), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8)]
lowhz = [(1.5, 1.5), (2, 2), (5.6, 5.6), (5.7, 10), (5.8, 20)]
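# Build dct from highhz and lowhz exactly as before, then replace the
# section between the #---- markers with:
#----
max_n = len(max(dct.values(), key=len))  # length of the longest row
for k, v in dct.items():
    # Pad shorter rows with NaN so that every row has max_n values.
    v.extend([float('nan')] * (max_n - len(v)))
#----
pprint([[k] + v for k, v in dct.items()])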
[[0, 1, 1.5, 2, nan],
[3, 3, nan, nan, nan],
[4, 4, nan, nan, nan],
[5, 5, 5.6, 10, 20],
[6, 6, nan, nan, nan],
[7, 7, nan, nan, nan],
[8, 8, nan, nan, nan]]
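One caveat with `bisect_left(times, t) - 1`: a low-frequency time that is exactly equal to a high-frequency time is attached to the previous row, and a time earlier than the first high-frequency time gives index -1, which wraps around to the last row. Neither case occurs in the sample data. A minimal sketch of one way to handle both, assuming a sample should go to the latest high-frequency time at or before it and that samples with no such time should be dropped (`attach_low` is a hypothetical helper name, not part of the code above):

```python
from bisect import bisect_right

def attach_low(highhz, lowhz):
    """Group low-frequency values under the latest high-frequency
    time that is <= their own time."""
    times = [t for t, _ in highhz]        # assumed sorted ascending
    rows = [[t, v] for t, v in highhz]
    for t, v in lowhz:
        ind = bisect_right(times, t) - 1  # index of the last time <= t
        if ind >= 0:                      # drop samples before the first time
            rows[ind].append(v)
    return rows

highhz = [(0, 1), (2, 2), (4, 4)]
lowhz = [(-1, 9), (2, 2.5), (3.5, 3.5)]
print(attach_low(highhz, lowhz))
# [[0, 1], [2, 2, 2.5, 3.5], [4, 4]]
```

The NaN padding step from the code above can then be applied to the returned rows unchanged.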
Does this solve your problem?
highhz = [(0,1),(2,2),(4,4),(5,5),(6,6),(7,7),(8,8)]
lowhz= [(1.5,1.5),(5.6,5.6)]
# Index the lowhz times by their integer floor.
low_d = {int(x): x for x, _ in lowhz}
# {1: 1.5, 5: 5.6}
# dict.get() takes a default value as an optional second argument.
# Note: the lookup key is the highhz value y, so this matches only because,
# in the sample data, y happens to equal the floor of the lowhz time.
result = [(x, y, low_d.get(y, None)) for x, y in highhz]
# [(0, 1, 1.5),
#  (2, 2, None),
#  (4, 4, None),
#  (5, 5, 5.6),
#  (6, 6, None),
#  (7, 7, None),
#  (8, 8, None)]
# or, as @Ashwini Chaudhary suggested, with NaN as the default:
result = [(x, y, low_d.get(y, float('nan'))) for x, y in highhz]
# same rows, with nan in place of None
You can use a list comprehension, but it produces an extra (0, 1, 'Nan') and I don't know why. :)
>>> [i+(max(j),) if max(i)<max(j)<max(i)+1 else i+('Nan',) for i in highhz for j in lowhz]
[(0, 1, 1.5), (0, 1, 'Nan'), (2, 2, 'Nan'), (2, 2, 'Nan'), (4, 4, 'Nan'), (4, 4, 'Nan'), (5, 5, 'Nan'), (5, 5, 5.6), (6, 6, 'Nan'), (6, 6, 'Nan'), (7, 7, 'Nan'), (7, 7, 'Nan'), (8, 8, 'Nan'), (8, 8, 'Nan')]
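The extra entries come from the two `for` clauses: the comprehension emits one tuple for every (i, j) pair, so each of the 7 high-frequency rows appears once per low-frequency row, giving 14 tuples. A sketch of one way to get a single tuple per high-frequency row while keeping the same `max(i) < max(j) < max(i) + 1` test; the helper name `match` is hypothetical, and the test itself still relies on the shape of the sample data:

```python
highhz = [(0, 1), (2, 2), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8)]
lowhz = [(1.5, 1.5), (5.6, 5.6)]

def match(i, lowhz):
    """Return the first low-frequency value that passes the test
    for row i, or NaN when none does."""
    hits = (max(j) for j in lowhz if max(i) < max(j) < max(i) + 1)
    return next(hits, float('nan'))

# A single for clause: exactly one output tuple per high-frequency row.
result = [i + (match(i, lowhz),) for i in highhz]
print(result)
# [(0, 1, 1.5), (2, 2, nan), (4, 4, nan), (5, 5, 5.6), (6, 6, nan), (7, 7, nan), (8, 8, nan)]
```

Using `float('nan')` instead of the string 'Nan' keeps the third column numeric.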