
Select algorithm to find an element occurring more than n/2 times in O(n)

Without using Moore's algorithm or a hash table, I need to find the element that occurs more than n/2 times in an array.

I know I have to use the select algorithm with median of medians. I am confused about what the select algorithm returns: if it returns the median, how can I be sure that element occurs more than n/2 times in the array?

For example:

a[] = {4, 1, 5, 7, 8}

5 is the median but it doesn't occur more than n/2 times.

Now:

a[] = {5, 5, 3, 4, 5, 5}

In this case the median is 5 and it occurs more than n/2 times.

I'd like to suggest another method. This problem is called "finding the leader" (at least in Polish literature). Let's call an element that occurs more than n/2 times the leader of a sequence. The following observation is crucial: if a sequence has a leader, then after removing two different elements, the new sequence has exactly the same leader as the original one. Why is that? The two removed elements are different, so at most one of them is the leader. The new sequence has n - 2 elements and more than (n / 2) - 1 = (n - 2) / 2 occurrences of the original leader, hence the original leader is still the leader. You repeat the deletion until all remaining elements are equal (or none are left). Then a linear pass checks whether the candidate really is a leader.

Sample code (based on this article, unfortunately unavailable in English):

// a is the input array and n is its length
int leader = 0;
int number = 0; /* unpaired occurrences of the current leader candidate */
for (int k = 0; k < n; k++)
{
    if (number == 0)
    {
        // no candidate left: take the current element as the new candidate
        leader = a[k];
        number = 1;
    }
    else if (leader == a[k])
        // another occurrence of the candidate
        number++;
    else
        // a[k] differs from the candidate: delete this pair of different elements
        number--;
}
//check if it really is the leader
number = 0;
for (int i = 0; i < n; i++)
    if (a[i] == leader)
        number++;
if (number > n / 2)
    System.out.println(leader);
else
    System.out.println("There is no leader");

Use the select algorithm to find the median element. If an element occurs more than n/2 times, it must be the median. Use median of medians to get a worst-case running time of O(n). Once you have the median, count its occurrences in the array, which again takes linear time; it is the answer only if that count exceeds n/2.
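The select-then-count idea can be sketched in Java roughly as follows. This is only an illustration under some assumptions: the class and method names (`MajorityBySelect`, `select`, `majority`) are made up for this example, the input is an `int[]`, and the partition step is a three-way partition, which is one common way to handle duplicates rather than something the answer prescribes. The element at sorted position n/2 is the only possible candidate, and the final count decides whether it really occurs more than n/2 times.

```java
import java.util.Arrays;
import java.util.OptionalInt;

class MajorityBySelect {

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Returns the value that would sit at index k (lo <= k <= hi) if a[lo..hi]
    // were sorted. Rearranges a in place. The pivot is the median of medians.
    static int select(int[] a, int lo, int hi, int k) {
        while (hi - lo >= 5) {
            // Sort each group of up to 5 and move its median to the front of the range.
            int groups = 0;
            for (int i = lo; i <= hi; i += 5) {
                int end = Math.min(i + 4, hi);
                Arrays.sort(a, i, end + 1);
                swap(a, lo + groups, i + (end - i) / 2);
                groups++;
            }
            // The pivot is the median of the group medians, found recursively.
            int pivot = select(a, lo, lo + groups - 1, lo + (groups - 1) / 2);
            // Three-way partition: [lo, lt) < pivot, [lt, gt] == pivot, (gt, hi] > pivot.
            int lt = lo, i = lo, gt = hi;
            while (i <= gt) {
                if (a[i] < pivot) swap(a, lt++, i++);
                else if (a[i] > pivot) swap(a, i, gt--);
                else i++;
            }
            if (k < lt) hi = lt - 1;
            else if (k > gt) lo = gt + 1;
            else return pivot;
        }
        // Small range: sorting at most 5 elements costs constant time.
        Arrays.sort(a, lo, hi + 1);
        return a[k];
    }

    // Returns the element occurring more than n/2 times, or empty if there is none.
    static OptionalInt majority(int[] input) {
        int n = input.length;
        if (n == 0) return OptionalInt.empty();
        int[] work = input.clone();              // keep the caller's array intact
        int candidate = select(work, 0, n - 1, n / 2);
        int count = 0;
        for (int x : input) {
            if (x == candidate) count++;         // linear verification pass
        }
        return count > n / 2 ? OptionalInt.of(candidate) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        int[][] samples = { {4, 1, 5, 7, 8}, {5, 5, 3, 4, 5, 5} };
        for (int[] a : samples) {
            OptionalInt m = majority(a);
            System.out.println(m.isPresent() ? "leader: " + m.getAsInt() : "no leader");
        }
    }
}
```

On the two arrays from the question, the median is 5 in both cases, but only the second array passes the counting step. That is the answer to the confusion in the question: select only supplies a candidate, and the count confirms or rejects it.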
