
Search in sorted array with few comparisons

You are given a std::vector<T> of distinct items, which is already sorted. Type T supports only the less-than operator < for comparisons, and that operator is an expensive function, so you have to call it as few times as possible.

Is there any better solution than a binary search? If not, is there any better solution than the one below, one that calls the less-than operator fewer times?

template<typename T>
int FindKey(const std::vector<T>& list, const T& key)
{
    if( list.empty() )
        return -1;

    int left = 0;
    int right = list.size() - 1;
    int mid;

    while( left < right )
    {
        mid = (right + left) / 2;
        if( list[mid] < key )
            left = mid + 1;
        else
            right = mid;
    }

    if( !(key < list[left]) && !(list[left] < key) )
        return left;    

    return -1;
}

It's not a real-world situation, just a coding test.

You could trade additional O(n) preprocessing time for amortized O(1) query time by using a hash table (e.g. an unordered_map) to create a lookup table.

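Hash tables locate a key by computing its hash function rather than by ordering comparisons, so operator< is never called. Note that an unordered_map does require a hash function and an equality comparison for T.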

Two keys could have the same hash, resulting in a collision, which is why not every individual operation is guaranteed to take constant time. Amortized constant time means that if you carry out k operations that take time t in total, then the quotient t/k = O(1) for sufficiently large k.

Live example:

#include <vector>
#include <unordered_map>

template<typename T>
class lookup {
  std::unordered_map<T, int> position;
public:
  lookup(const std::vector<T>& a) {
    for(int i = 0; i < a.size(); ++i) position.emplace(a[i], i);
  }
  int operator()(const T& key) const {
    auto pos = position.find(key);
    return pos == position.end() ? -1 : pos->second;
  }
};

This also requires additional memory.

If the values can be mapped to integers and lie within a reasonable range (i.e. max-min = O(n)), you could simply use a vector as a lookup table instead of unordered_map, with the benefit of guaranteed constant query time.
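A minimal sketch of the vector-based table, assuming the keys are plain int values (the question's T only offers operator<, so this is an extra assumption). The class name DirectLookup is hypothetical and only serves the illustration.

```cpp
#include <cstddef>
#include <vector>

// Direct-address lookup table for a sorted vector of distinct ints.
// Assumption (not from the question): keys are ints and max - min is O(n),
// so the table needs only O(n) extra memory.
class DirectLookup {
  long long min_ = 0;
  std::vector<int> position_;  // position_[key - min_] = index in input, or -1
public:
  explicit DirectLookup(const std::vector<int>& a) {
    if (a.empty()) return;
    min_ = a.front();  // input is sorted: front() is the min, back() is the max
    const long long range = static_cast<long long>(a.back()) - min_;
    position_.assign(static_cast<std::size_t>(range) + 1, -1);
    for (std::size_t i = 0; i < a.size(); ++i)
      position_[static_cast<std::size_t>(a[i] - min_)] = static_cast<int>(i);
  }
  // Returns the index of key in the original vector, or -1 if it is absent.
  int operator()(int key) const {
    const long long offset = key - min_;
    if (offset < 0 || offset >= static_cast<long long>(position_.size()))
      return -1;
    return position_[static_cast<std::size_t>(offset)];
  }
};
```

Each query is one subtraction, one bounds check and one array read, so no hash collisions can occur; the price is a table whose size depends on the value range rather than on the number of items.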

See also this answer to "C++ get index of element of array by value" for a more detailed discussion, including an empirical comparison of linear, binary and hash index lookup.

Update

If the interface of type T supports no operations other than bool operator<(L, R), then using the decision tree model you can prove a lower bound of Ω(log n) for comparison-based search algorithms. Each comparison has two outcomes and the search has n+1 possible answers (one of n positions, or not found), so the worst case needs at least log2(n+1) comparisons.

You can use std::lower_bound. It needs at most about log2(n)+1 comparisons, which is the best possible complexity for your problem.

#include <algorithm>
#include <iterator>
#include <vector>

template<typename T>
int FindKey(const std::vector<T>& list, const T& key)
{
  if(list.empty())
    return -1;
  typename std::vector<T>::const_iterator lb =
        std::lower_bound(list.begin(), list.end(), key);
  // lb now points to the first element that is not less than key,
  // or equals list.end() if every element is less than key
  if(lb == list.end() || key < *lb)  // never dereference end()
    return -1;
  else
    return static_cast<int>(std::distance(list.begin(), lb));
}

With the additional check for equality, you do it with log(n)+2 comparisons.
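One way to check the comparison count empirically is to search over a key type that counts its own operator< calls. The sketch below is only an illustration: CountedKey, SearchResult and CountedFind are hypothetical names, and the exact count may vary slightly between standard library implementations.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical key type whose operator< counts how often it is called.
struct CountedKey {
  int value;
  static std::size_t comparisons;  // total operator< calls since last reset
};
std::size_t CountedKey::comparisons = 0;

bool operator<(const CountedKey& lhs, const CountedKey& rhs) {
  ++CountedKey::comparisons;
  return lhs.value < rhs.value;
}

struct SearchResult {
  int index;                // position of the key, or -1 if not found
  std::size_t comparisons;  // operator< calls spent on this search
};

// lower_bound followed by a single extra comparison to test for equality.
SearchResult CountedFind(const std::vector<CountedKey>& list,
                         const CountedKey& key) {
  CountedKey::comparisons = 0;
  auto lb = std::lower_bound(list.begin(), list.end(), key);
  int index = -1;
  if (lb != list.end() && !(key < *lb))
    index = static_cast<int>(std::distance(list.begin(), lb));
  return SearchResult{index, CountedKey::comparisons};
}
```

Calling CountedFind on a vector of 1024 keys should report a count close to log2(1024)+2 = 12 for any key, which is a quick way to compare alternative search routines under the same cost measure.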

You can use interpolation search, which runs in O(log log n) time on average if your numbers are uniformly distributed. If they have some other distribution, you can modify the search to take that distribution into account, though I don't know which distributions yield log log time.

https://en.wikipedia.org/wiki/Interpolation_search
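A rough sketch of interpolation search, assuming the keys are int values. This goes beyond the question's interface, since estimating a position needs arithmetic on the keys and not just operator<. The function name InterpolationFind is made up for this example.

```cpp
#include <vector>

// Interpolation search over a sorted vector of distinct ints.
// Assumption (not from the question): keys are arithmetic, so the probe
// position can be estimated from the key's value.
// Returns the index of key, or -1 if it is not present.
int InterpolationFind(const std::vector<int>& list, int key) {
  if (list.empty()) return -1;
  long long left = 0;
  long long right = static_cast<long long>(list.size()) - 1;
  // Invariant inside the loop: list[left] <= key <= list[right].
  while (left <= right && !(key < list[left]) && !(list[right] < key)) {
    if (list[left] == list[right])  // one value left; avoids dividing by zero
      return list[left] == key ? static_cast<int>(left) : -1;
    // Linear interpolation between the two endpoints of the current range.
    const long long span = static_cast<long long>(list[right]) - list[left];
    const long long offset = static_cast<long long>(key) - list[left];
    const long long mid = left + offset * (right - left) / span;
    if (list[mid] < key)
      left = mid + 1;
    else if (key < list[mid])
      right = mid - 1;
    else
      return static_cast<int>(mid);
  }
  return -1;
}
```

On evenly spaced data the first probe usually lands on or next to the key; on heavily skewed data the probes can advance one element at a time, so the worst case is linear.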
