Find closest value algorithm
def find_closest(data, target, key=lambda x: f(x)):
This is my function definition, where data is a set of values. I want to find the value that evaluates closest to target in as few evaluations as possible, i.e. the x for which abs(target - f(x)) is minimal. f(x) is monotonic.
I've heard that binary search can do this in O(log(n)) time. Is there a library implementation in Python? Are there more efficient search algorithms?
EDIT: I'm looking to minimize complexity in terms of evaluations of f(x), because that's the expensive part. I want to find the x in data that, when evaluated with f(x), comes closest to the target.
data is in the domain of f, and target is in the range of f. Yes, data can be sorted quickly.
You can use the utilities in the bisect module. You will have to evaluate f on data though, i.e. list(f(x) for x in data), to get a monotonic / sorted list to bisect.
I am not aware of a binary search in the standard library that works directly on f and data.
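One way to search directly on f and data is to write the binary search by hand, so that f is only called on the elements the search actually visits. The following is a minimal sketch, not a library routine: it assumes data is a non-empty sorted list and that f is non-decreasing over it, and the name find_closest_direct is made up for this example.

```python
def find_closest_direct(data, target, f):
    """Return the x in sorted `data` whose f(x) is closest to `target`.

    Assumes `data` is non-empty and sorted, and f is non-decreasing over it.
    """
    lo, hi = 0, len(data)
    # Invariant: f(data[i]) < target for i < lo, f(data[i]) >= target for i >= hi.
    while lo < hi:
        mid = (lo + hi) // 2
        if f(data[mid]) < target:
            lo = mid + 1
        else:
            hi = mid
    # lo is now the first index whose value is >= target (len(data) if none).
    if lo == 0:
        return data[0]
    if lo == len(data):
        return data[-1]
    below, above = data[lo - 1], data[lo]
    # Two more evaluations decide which neighbour is closer; ties go to the lower one.
    if target - f(below) <= f(above) - target:
        return below
    return above


# With f(x) = x * x and target 50, the candidates are 7 (49) and 8 (64).
closest = find_closest_direct(list(range(100)), 50, lambda x: x * x)
```

The loop calls f about log2(len(data)) times, plus two calls for the final comparison. On Python 3.10 and later, the bisect functions also accept a key argument, so bisect.bisect_left(data, target, key=f) can replace the loop; the neighbour comparison at the end is still needed.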
If the data presented is already sorted and the function is strictly monotonic, apply the function f to the data and then perform a binary search using bisect.bisect.
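import bisect

def find_closest(data, target, key=f):  # f: the expensive monotonic function from the question
    # data must already be sorted, so that key(x) is monotonic over it
    values = [key(x) for x in data]     # note: evaluates key on every element
    if values[0] > values[-1]:          # decreasing: negate both sides so bisect sees ascending order
        values = [-v for v in values]
        target = -target
    i = bisect.bisect_left(values, target)
    if i == len(values):
        return data[-1]
    # bisect_left gives the first value >= target; the neighbour below may be closer
    if i > 0 and target - values[i - 1] <= values[i] - target:
        return data[i - 1]
    return data[i]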
Use the bisect_left() function to find the lower bound. bisect_left accepts a random-access sequence of elements; to avoid calculating all of them, you can use a lazy collection of computed function values with __len__ and __getitem__ methods defined. Carefully check the return value for boundary conditions. Your heavy calculation will be called O(log(N) + 1) = O(log(N)) times.
from bisect import bisect_left

class Cache(dict):
    """Memoizes method(key) so each input is evaluated at most once."""
    def __init__(self, method):
        super().__init__()
        self.method = method
    def __missing__(self, key):
        value = self[key] = self.method(key)  # store it, so repeat lookups are free
        return value

class MappedList(object):
    """Lazy view of [method(x) for x in input]; values are computed on access."""
    def __init__(self, method, input):
        self.method = method
        self.input = input
        self.cache = Cache(method)
    def __len__(self):
        return len(self.input)
    def __getitem__(self, i):
        return self.cache[self.input[i]]

def find_closest(data, target, key=lambda x: x):
    # assumes key is increasing over the sorted data
    s = sorted(data)
    evaluated = MappedList(key, s)
    index = bisect_left(evaluated, target)
    # index into the sorted list s, not the original data
    if index == 0:
        return s[0]
    if index == len(s):
        return s[index-1]
    if target - evaluated[index-1] <= evaluated[index] - target:
        return s[index-1]
    else:
        return s[index]
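To check the claim that the heavy calculation only runs O(log(N)) times, one can wrap the function in a call counter and bisect over a lazy sequence. The sketch below is illustrative only: LazyValues, CountingFunction and the sample function 3*x + 1 are hypothetical names chosen for this example, and the function is assumed to be increasing over the sorted input.

```python
from bisect import bisect_left


class CountingFunction:
    """Wraps a function and counts how many times it is actually called."""
    def __init__(self, f):
        self.f = f
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return self.f(x)


class LazyValues:
    """Sequence view of f over xs; f(xs[i]) is computed only when indexed."""
    def __init__(self, f, xs):
        self.f = f
        self.xs = xs
        self.cache = {}  # x -> f(x), so each x is evaluated at most once

    def __len__(self):
        return len(self.xs)

    def __getitem__(self, i):
        x = self.xs[i]
        if x not in self.cache:
            self.cache[x] = self.f(x)
        return self.cache[x]


# Stand-in for the expensive function; assumed increasing.
expensive = CountingFunction(lambda x: 3 * x + 1)
xs = list(range(1_000_000))
values = LazyValues(expensive, xs)

# First index whose value is >= the target; only the probed elements are evaluated.
index = bisect_left(values, 1_500_000)
print(index, expensive.calls)
```

For a million elements the search needs at most about 20 evaluations (log2 of 1,000,000 is just under 20), compared with a million for the approach that maps f over the whole list first.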