Find closest value algorithm

def find_closest(data, target, key=lambda x: f(x)):

This is my function definition, where data is a set of values. I want to find the value in data that evaluates closest to target, i.e. the x for which abs(target - f(x)) is minimal, in as few evaluations as possible. f(x) is monotonic.

I've heard that binary search can do this in O(log(n)) time. Is there a library implementation in Python? Are there more efficient search algorithms?

EDIT: I'm looking to minimize complexity in terms of evaluations of f(x), because that's the expensive part. I want to find the x in data that, when evaluated with f(x), comes closest to the target. data is in the domain of f, and target is in the range of f. Yes, data can be sorted quickly.

You can use the utilities in the bisect module. You will have to evaluate f on data first, though, i.e. list(f(x) for x in data), to get a monotonic (sorted) list to bisect.

I am not aware of a binary search in the standard library that works directly on f and data.

If the data presented is already sorted and the function is strictly monotonic, apply the function f to the data and then perform a binary search using bisect.bisect_left:

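import bisect

# f is the monotonic function from the question; it must be defined before this point.
def find_closest(data, target, key=f):
    # Evaluate key on every element of the sorted data (O(n) evaluations).
    values = [key(x) for x in data]
    # bisect needs ascending order, so negate values and target if key is decreasing.
    if values[0] > values[-1]:
        values = [-v for v in values]
        target = -target
    i = bisect.bisect_left(values, target)
    if i == 0:
        return data[0]
    if i == len(data):
        return data[-1]
    # Return whichever neighbour of the insertion point is closer to the target.
    return data[i - 1] if target - values[i - 1] <= values[i] - target else data[i]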
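Since the question asks to minimize calls to f, it may help to see a hand-rolled binary search that evaluates f only on the probed elements. This is a minimal sketch, not a library routine: find_closest_bsearch is a hypothetical name, and it assumes data is non-empty, sorted ascending, and that f is monotonic in either direction.

```python
def find_closest_bsearch(data, target, f):
    """Return the x in sorted `data` whose f(x) is nearest to `target`.

    Assumptions: `data` is non-empty and sorted ascending; `f` is monotonic
    (increasing or decreasing). f is called about log2(n) + 2 times.
    """
    lo, hi = 0, len(data) - 1
    f_lo, f_hi = f(data[lo]), f(data[hi])
    increasing = f_lo <= f_hi
    # Shrink [lo, hi] until the two ends are adjacent, keeping the closest
    # element at one of the two ends.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        f_mid = f(data[mid])
        if (f_mid < target) == increasing:
            lo, f_lo = mid, f_mid   # closest element is at mid or to its right
        else:
            hi, f_hi = mid, f_mid   # closest element is at mid or to its left
    return data[lo] if abs(f_lo - target) <= abs(f_hi - target) else data[hi]

print(find_closest_bsearch(list(range(10)), 30, lambda x: x * x))  # prints 5
print(find_closest_bsearch(list(range(10)), -3.4, lambda x: -x))   # prints 3
```

Detecting the direction from the two endpoints avoids negating a precomputed list, so nothing beyond the probed elements is ever evaluated.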

Use bisect_left() to find the lower bound. bisect_left accepts any random-access sequence, so to avoid evaluating every element you can pass a lazy collection of function values that defines the __len__ and __getitem__ methods. Check the return value carefully for boundary conditions. The expensive function will then be called O(log(N) + 1) = O(log(N)) times.

from bisect import bisect_left

class Cache(dict):
    """Memoizes method(key) so each input is evaluated at most once."""
    def __init__(self, method):
        super().__init__()
        self.method = method
    def __missing__(self, key):
        # Store the result so later lookups do not re-evaluate method.
        value = self[key] = self.method(key)
        return value

class MappedList(object):
    """Lazy, list-like view of method(x) for x in input."""
    def __init__(self, method, input):
        self.method = method
        self.input = input
        self.cache = Cache(method)
    def __len__(self):
        return len(self.input)
    def __getitem__(self, i):
        return self.cache[self.input[i]]

def find_closest(data, target, key = lambda x:x):
    # Assumes key is non-decreasing over the sorted data and elements are hashable.
    s = sorted(data)
    evaluated = MappedList(key, s)
    index = bisect_left(evaluated, target)
    # Index into the sorted copy s, not the possibly unsorted data.
    if index == 0:
        return s[0]
    if index == len(s):
        return s[index-1]
    if target - evaluated[index-1] <= evaluated[index] - target:
        return s[index-1]
    else:
        return s[index]
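To check the O(log(N)) claim, one can count how often the function is actually evaluated. The snippet below is a self-contained sketch: CountingView is a hypothetical helper written for this illustration, not part of the answer's code or of any library.

```python
from bisect import bisect_left

class CountingView:
    """Lazy view of f(x) over `items` that counts how often f is evaluated."""
    def __init__(self, f, items):
        self.f = f
        self.items = items
        self.calls = 0
    def __len__(self):
        return len(self.items)
    def __getitem__(self, i):
        self.calls += 1
        return self.f(self.items[i])

# One million inputs, yet only the probed positions are evaluated.
view = CountingView(lambda x: x * x, range(1_000_000))
index = bisect_left(view, 250_000)
print(index)       # prints 500, since 500 * 500 == 250_000
print(view.calls)  # roughly log2(1_000_000), i.e. about 20 evaluations
```

Unlike the Cache above, this view does not memoize, so the neighbour comparison after the search would re-evaluate f once or twice; caching matters only when f is expensive enough that those extra calls are worth avoiding.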
