
Is there a more efficient way to do this algorithm?

To the best of my knowledge, this algorithm searches correctly and returns true when it needs to. In class we are talking about Big O analysis, so this assignment is to show how a recursive search is faster than an iterative search. The point is to search for a number such that A[i] = i (find an index that is the same as the number stored at that index). The running time of this algorithm differs from that of an iterative one by only about 100 nanoseconds, and sometimes the iterative one is faster. I set up the vector in main using the rand() function. I run the two algorithms a million times and record the times. My question: is this algorithm as efficient as possible, or is there a better way to do it?

bool recursiveSearch(vector<int> &myList, int beginning, int end)
{
    int mid = (beginning + end) / 2;

    if (myList[beginning] == beginning) //check if the vector at "beginning" is
    {                                      //equal to the value of "beginning"
        return true;
    }

    else if (beginning == end) //when this is true, the recursive loop ends.
    {                          //when passed into the method: end = size - 1
        return false;
    }

    else
    {
        return (recursiveSearch(myList, beginning, mid) || recursiveSearch(myList, mid + 1, end));
    }

}

Edit: The list is sorted before being passed in, and a check is done in main to make sure that beginning and end are both valid indices.

One possible "improvement" is to avoid copying the vector on every recursive call by passing it by reference:

bool recursiveSearch(const vector<int>& myList, int beginning, int end)

Unless you know something special about the ordering of the data, there is absolutely no advantage to performing a partitioned search like this.

Indeed, your code is actually [trying] to do a linear search, so it is effectively implementing a simple for loop at the cost of a lot of stack and call overhead.

Note that there is a weirdness in your code: if the first element doesn't match, you will call recursiveSearch(myList, beginning /*=0*/, mid). We already know that element 0 doesn't match, yet you subdivide again only after re-testing that same element.

So given a vector that has no matches, searched over indices 0 through 6, you're going to call:

recursiveSearch(myList, 0, 6);
-> < recursiveSearch(myList, 0, 3) || recursiveSearch(myList, 4, 6) >
-> < recursiveSearch(myList, 0, 1) || recursiveSearch(myList, 2, 3) > < recursiveSearch(myList, 4, 5) || recursiveSearch(myList, 6, 6) >
-> < recursiveSearch(myList, 0, 0) || recursiveSearch(myList, 1, 1) > < recursiveSearch(myList, 2, 2) || recursiveSearch(myList, 3, 3) > ...

In the end, you only fail on a given index once you reach the condition where beginning and end are both that value. That seems an expensive way of eliminating each node, and the end result is not a partitioned search. It is a simple linear search; you just use a lot of stack depth to get there.
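To make the cost of that call tree concrete, here is a minimal sketch that counts invocations. `countedSearch` is a hypothetical instrumented copy of the question's function (the `calls` parameter is an addition for illustration), and `callsWithoutMatch` is a hypothetical helper that fills a vector with -1 so that no index can match.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instrumented copy of the question's recursiveSearch.
// The extra `calls` parameter exists only to count invocations.
bool countedSearch(const std::vector<int>& myList, int beginning, int end,
                   std::size_t& calls)
{
    ++calls;
    int mid = (beginning + end) / 2;

    if (myList[beginning] == beginning)
        return true;
    if (beginning == end)
        return false;
    return countedSearch(myList, beginning, mid, calls)
        || countedSearch(myList, mid + 1, end, calls);
}

// Hypothetical helper: number of calls made for n >= 1 elements
// when no element matches its index.
std::size_t callsWithoutMatch(int n)
{
    std::vector<int> v(n, -1);  // -1 never equals a valid index
    std::size_t calls = 0;
    countedSearch(v, 0, n - 1, calls);
    return calls;
}
```

With no match, every range of more than one element splits into two, so n elements cost 2n - 1 calls (13 for the indices 0 through 6 traced above), compared with n comparisons for a plain loop.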

So, a simpler and faster way to do this would be:

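for (int i = beginning; i <= end; ++i) {  // end is inclusive (size - 1), as in the question
    if (myList[i] != i)
        continue;  // the common case: no match at this index
    return i;      // index where myList[i] == i
}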

Since we're trying to optimize here, it's worth pointing out that MSVC, GCC and Clang all assume that the if expresses the likely case, so I'm optimizing here for the degenerate case where we have a large vector with no match or a late match. In the case where we get lucky and find a result early, we're willing to pay the cost of a potential branch miss because we're leaving the loop anyway. I realize that the branch predictor will soon figure this out for us, but again: optimizing ;-P
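If you would rather state the expectation explicitly than rely on how the branches are laid out, C++20 provides the `[[likely]]` and `[[unlikely]]` attributes. The following is only a sketch: `firstFixedPoint` is a hypothetical name, it assumes a C++20 compiler, and whether the hint changes the generated code at all depends on the compiler and optimization level.

```cpp
#include <vector>

// Hypothetical helper: returns the first index i with v[i] == i, or -1.
int firstFixedPoint(const std::vector<int>& v)
{
    const int n = static_cast<int>(v.size());
    for (int i = 0; i < n; ++i) {
        // C++20 hint: a match is assumed to be the rare case.
        if (v[i] == i) [[unlikely]] {
            return i;
        }
    }
    return -1;  // no index matches its value
}
```

Measure before keeping a hint like this; on a loop this simple the hardware's branch prediction may make it irrelevant.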

As others have pointed out, you could also benefit from not passing the vector by value (which forces a copy):

const std::vector<int>& myList

An obvious "improvement" would be to run threads on all the remaining cores. Simply divvy up the vector into (number of cores - 1) pieces and use a condition variable to signal the main thread when a match is found.
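A rough sketch of that idea follows. It is not a tuned implementation, and it swaps the condition variable for a shared `std::atomic<bool>` flag plus `join()`, which is simpler when the main thread has nothing else to do while waiting. `parallelHasFixedPoint` and its `workers` parameter are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: true if some index i satisfies v[i] == i.
// The scan is split into `workers` contiguous chunks, one thread each.
bool parallelHasFixedPoint(const std::vector<int>& v, unsigned workers)
{
    if (workers == 0)
        workers = 1;
    std::atomic<bool> found{false};
    const std::size_t n = v.size();
    const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t lo = std::min(n, w * chunk);
        const std::size_t hi = std::min(n, lo + chunk);
        pool.emplace_back([&v, &found, lo, hi] {
            for (std::size_t i = lo; i < hi; ++i) {
                if (found.load(std::memory_order_relaxed))
                    return;  // another thread already found a match
                if (v[i] == static_cast<int>(i)) {
                    found.store(true, std::memory_order_relaxed);
                    return;
                }
            }
        });
    }
    for (auto& t : pool)
        t.join();  // all workers have finished before the flag is read
    return found.load();
}
```

For the vector sizes typical of a class assignment, starting the threads is likely to cost more than the scan itself, so this only has a chance of paying off on very large inputs.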

If you need to find an element in an unsorted array such that A[i] == i, then the only way to do it is to go through every element until you find one.

The simplest way to do this is like so:

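bool find_index_matching_value(const std::vector<int>& v)
{
    // Compare as int so the signed values in v are not converted to unsigned.
    const int n = static_cast<int>(v.size());
    for (int i = 0; i < n; i++) {
        if (v[i] == i)
            return true;
    }
    return false; // no such element
}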

This is O(n), and for unsorted data you're not going to be able to do any better than that algorithmically. So we have to turn our attention to micro-optimisations.

I would be quite astonished if, on modern machines, your recursive solution were faster in general than the simple loop above. While the compiler will (possibly) be able to remove the extra function call overhead (effectively turning your recursive solution into an iterative one), running through the array in order allows for optimal use of the cache, whereas, for large arrays, your partitioned search will not.
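The O(n) bound applies to unsorted data. The question's edit says the list is sorted before the search; if, in addition, the values are strictly increasing integers (an assumption, since rand() can produce duplicates), then A[i] - i never decreases and a binary search can find a match in O(log n). A minimal sketch under that assumption, with `fixedPointSorted` as a hypothetical name:

```cpp
#include <vector>

// Hypothetical helper. ASSUMES v holds strictly increasing ints
// (sorted, no duplicates); the result is not meaningful otherwise.
// Returns some index i with v[i] == i, or -1 if there is none.
int fixedPointSorted(const std::vector<int>& v)
{
    int lo = 0;
    int hi = static_cast<int>(v.size()) - 1;
    while (lo <= hi) {
        const int mid = lo + (hi - lo) / 2;
        if (v[mid] == mid)
            return mid;
        if (v[mid] < mid)
            lo = mid + 1;  // values to the left are even further below their index
        else
            hi = mid - 1;  // values to the right are even further above their index
    }
    return -1;
}
```

If duplicates are possible, this shortcut is no longer valid and the linear scan remains the safe choice.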

