Quick way of matching date range to indices in Python list of dates
I need to select the index of the earliest date that is no more than interval days before date1 (which has index i1). I have a sorted list dates, and this is a snippet of what I'm trying to do:
for i1 in mylist:
    date1 = dates[i1]
    i0 = sum(1 for d in dates if date1 - d > timedelta(days=interval))
    # do some other stuff with this
The line where I find i0 seems to be the bottleneck of this loop: if I change it to i0 = max(0, i1 - 30) (which simply ignores missing dates), the loop runs about 100 times faster.
Is there a way to speed it up? I feel there should be a way to use the fact that the list is sorted and avoid doing all the comparisons.
PS: My first try at it was:
i0 = len([d for d in dates if date1 - d > timedelta(days=interval)])
which is even slower.
Using binary search (O(log n) time complexity):
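import bisect
from datetime import timedelta
# i is the index of the earliest date that is no more than interval days before dates[i1]
i = bisect.bisect_left(dates, dates[i1] - timedelta(days=interval))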
Paraphrasing bisect's documentation: the return value i is such that all dates in the slice dates[:i] are more than (>) interval days before dates[i1], and all dates in the slice dates[i:] are at most (<=) interval days before dates[i1] (this slice includes dates[i1] itself and any later dates).
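As a rough check of the claim above, here is a minimal sketch that compares the linear count from the question with the bisect lookup on a small sample. The sample dates, the value of interval, and the helper names first_index_linear and first_index_bisect are made up for illustration. It also assumes interval is non-negative, which is what allows the search to be limited to dates[:i1 + 1] through the hi argument.

```python
import bisect
from datetime import date, timedelta

# Hypothetical sample data: a sorted list of dates with gaps.
dates = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 5),
         date(2020, 1, 9), date(2020, 1, 10), date(2020, 1, 20)]
interval = 5  # assumed non-negative


def first_index_linear(dates, i1, interval):
    """O(n) approach from the question: count dates too far before dates[i1]."""
    date1 = dates[i1]
    return sum(1 for d in dates if date1 - d > timedelta(days=interval))


def first_index_bisect(dates, i1, interval):
    """O(log n) approach: binary search for the cutoff date.

    Passing hi=i1 + 1 limits the search to dates[:i1 + 1]; this is valid
    only because interval >= 0 means the cutoff cannot be later than dates[i1].
    """
    cutoff = dates[i1] - timedelta(days=interval)
    return bisect.bisect_left(dates, cutoff, 0, i1 + 1)


results = [first_index_bisect(dates, i1, interval) for i1 in range(len(dates))]
print(results)  # [0, 0, 0, 2, 2, 5]
```

Both helpers return the same index for every i1 in this sample, so the bisect call can be dropped into the loop in place of the sum(...) line without changing the rest of the code.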