
Binary search in std::vector

I am trying to find the positions of one vector's elements in another vector. I would like to use an implementation that is as fast as binary search. My vectors have a million or more elements each, so I am looking for something fast.

My situation is as follows:

1) The vector I am searching in is sorted.

2) The element I am searching for is always present, i.e. there is no "not found" case, and I would like to get the index of the vector elements as quickly as possible.

I tried the following code to get the index of vector elements.

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Returns an iterator to the first element that is not less than val.
template<class Iter, class T>
Iter binary_find(Iter begin, Iter end, T val)
{
    Iter i = std::lower_bound(begin, end, val);
    return i;
}

int main() {
    std::vector<std::string> values = {"AAAAAA", "AB", "AD", "BCD", "CD", "DD"};
    std::vector<std::string> tests = {"AB", "CD", "AD", "DD"};
    for (std::size_t i = 0; i < tests.size(); i++) {
        // Index of the match: offset of the returned iterator from the start.
        auto pos = binary_find(values.begin(), values.end(), tests.at(i)) - values.begin();
        std::cout << tests.at(i) << " found at: " << pos << std::endl;
    }
    return 0;
}

I would like to know whether this code matches a binary search implementation.

Is there a faster way to get the index of vector elements?

Any further suggestions to improve this code are welcome.

binary_find doesn't return anything despite not being declared to return void, so it has undefined behaviour.


After that is fixed, and assuming that you have no specific knowledge about the contents of the vector other than that it is sorted, binary search is pretty much optimal.

There are, however, other data structures that are faster than a vector for predicate-based lookup. If performance is critical, you should take a look at search trees and hash maps. Since your keys are strings, tries and directed acyclic word graphs in particular may be efficient. You may want to measure which is best for your use case.
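As a rough illustration of the hash map option, the sketch below builds a key-to-index table once and then answers each lookup in expected constant time. The helper names `build_index` and `index_of` are hypothetical, and the sketch assumes, as the question states, that every key searched for is present. Whether this actually beats binary search for a given data set is something to measure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: builds a key -> index table in one pass over values.
std::unordered_map<std::string, std::size_t>
build_index(const std::vector<std::string>& values) {
    std::unordered_map<std::string, std::size_t> index;
    index.reserve(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        // emplace does not overwrite, so a repeated key keeps its first index.
        index.emplace(values[i], i);
    }
    return index;
}

// Hypothetical helper: expected O(1) lookup.
// Assumption (from the question): the key is always present.
std::size_t index_of(const std::unordered_map<std::string, std::size_t>& index,
                     const std::string& key) {
    auto it = index.find(key);
    assert(it != index.end());
    return it->second;
}
```

The table costs extra memory and a one-time build, so it tends to pay off only when many lookups are made against the same vector.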

http://www.cpluplus.com says that the behavior of binary_search is equivalent to:

template <class ForwardIterator, class T>
bool binary_search (ForwardIterator first, ForwardIterator last, const T& val) {
    first = std::lower_bound(first, last, val);
    return (first != last && !(val < *first));
}

So yes, lower_bound is your weapon of choice. But when you take the difference you should use distance, because if there is a faster way to obtain the position, it will be rolled into that function.

As for other improvements, I'd suggest using C++11's begin and end, and not calling a function that only serves to wrap lower_bound (and fails to properly return a value). So the way I'd write this code would look like:

auto pos = distance(begin(values), lower_bound(begin(values), end(values), tests[i]));
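For context, here is a fuller sketch of that approach applied to the whole `tests` vector. Note the argument order: `std::distance` takes the start of the range first and the iterator returned by `lower_bound` second. The function name `positions` is hypothetical, and the sketch assumes `values` is sorted and contains every test string.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: index in values of each element of tests.
// Assumptions: values is sorted and every test string occurs in it.
std::vector<std::ptrdiff_t> positions(const std::vector<std::string>& values,
                                      const std::vector<std::string>& tests) {
    assert(std::is_sorted(std::begin(values), std::end(values)));
    std::vector<std::ptrdiff_t> result;
    result.reserve(tests.size());
    for (const auto& t : tests) {
        auto it = std::lower_bound(std::begin(values), std::end(values), t);
        // distance(first, it) is the number of steps from the start to the match.
        result.push_back(std::distance(std::begin(values), it));
    }
    return result;
}
```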

Q1: I would like to know if the code matches with the binary search implementation.

Yes, it (almost) is. Check std::lower_bound, which states:

Complexity:

On average, logarithmic in the distance between first and last: Performs approximately log2(N)+1 element comparisons (where N is this distance). On non-random-access iterators, the iterator advances produce themselves an additional linear complexity in N on average.
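One way to see that bound in practice is to count the comparisons yourself. The sketch below is an illustration only: `count_comparisons` is a hypothetical helper, and it searches a vector of ints rather than strings to keep the example short. For a vector of one million elements, log2(N)+1 is about 21.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts the element comparisons std::lower_bound
// performs when searching for key in the sorted vector 0, 1, ..., n-1.
std::size_t count_comparisons(std::size_t n, int key) {
    std::vector<int> sorted(n);
    for (std::size_t i = 0; i < n; ++i) {
        sorted[i] = static_cast<int>(i);
    }
    std::size_t comparisons = 0;
    auto it = std::lower_bound(sorted.begin(), sorted.end(), key,
                               [&comparisons](int a, int b) {
                                   ++comparisons;  // one call = one comparison
                                   return a < b;
                               });
    (void)it;  // only the comparison count is of interest here
    return comparisons;
}
```

Calling `count_comparisons(1000000, 123456)` should return a number close to 20, compared with up to a million comparisons for a linear scan.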


Q2: Is there a faster way to get the index of vector elements?

That's a rather broad question.


Q3: Any further suggestions to improve this code.

Hello world, Code Review!


PS - Did you even compile the code? It gives several messages, like:

warning: no return statement in function returning non-void [-Wreturn-type]

Compile with warnings enabled, like this:

g++ -Wall main.cpp
