
Find majority elements that appear more than n/3 times

I am working on the algorithm puzzle below and debugging the solution that follows, which works for a few test cases. My question is: how can we guarantee that an element appearing more than n/3 times always ends up with a positive count? The other elements, up to 2n/3 of them, could seemingly drive its count down. But I tried it and it always works on my samples. If anyone could help clarify, that would be great.

Here are the problem statement and my code/test cases:

Given an integer array of size n, find all elements that appear more than ⌊ n/3 ⌋ times. The algorithm should run in linear time and in O(1) space.

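def majorityElement(nums):
    if not nums:
        return []
    # Start the candidates at None so that a real value such as 0 is not
    # mistaken for an initial candidate ([0, 0, 0] would otherwise give [0, 0]).
    count1, count2, candidate1, candidate2 = 0, 0, None, None
    for n in nums:
        if n == candidate1:
            count1 += 1
        elif n == candidate2:
            count2 += 1
        elif count1 == 0:
            candidate1, count1 = n, 1
        elif count2 == 0:
            candidate2, count2 = n, 1
        else:
            # n, candidate1 and candidate2 are pairwise distinct here:
            # drop one copy of each.
            count1, count2 = count1 - 1, count2 - 1
    # Verify each surviving candidate against the whole list.
    return [c for c in (candidate1, candidate2) if nums.count(c) > len(nums) // 3]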

if __name__ == "__main__":

    # print(majorityElement([1,2,1,3,1,5,6]))
    print(majorityElement([2,3,1,2,1,3,1,5,5,1,6]))

thanks in advance, Lin

Conceptually, we repeatedly apply a reduction operation to the list that deletes three pairwise distinct items. This particular code does the reductions online, so that the reduced list so far can be described by two different elements and their corresponding counts (because if there were a third element distinct from the other two, then we could reduce). The counts never go negative, because the code decrements only in the branch where both counts are positive. At the end, at most two elements remain as candidates, and each is checked against the full list to see whether it occurs more than n/3 times.
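To make the online reduction concrete, here is a minimal instrumented sketch of the same two-counter loop. The name `reduce_online`, the `reductions` counter, and the use of None as the initial candidate are assumptions introduced for illustration (None is assumed never to appear in the input); this is not the asker's code.

```python
def reduce_online(nums):
    """Run the two-counter loop and return the reduced list as a dict.

    Hypothetical instrumented variant for illustration only. Candidates
    start as None, assuming None never appears in nums.
    """
    cand1, cand2, count1, count2 = None, None, 0, 0
    reductions = 0
    for x in nums:
        if x == cand1:
            count1 += 1
        elif x == cand2:
            count2 += 1
        elif count1 == 0:
            cand1, count1 = x, 1
        elif count2 == 0:
            cand2, count2 = x, 1
        else:
            # x, cand1 and cand2 are pairwise distinct and both counts are
            # positive here, so this deletes one copy of each of the three.
            count1, count2 = count1 - 1, count2 - 1
            reductions += 1
        # A count is only decremented when it is positive.
        assert count1 >= 0 and count2 >= 0
    reduced = {c: k for c, k in ((cand1, count1), (cand2, count2)) if k > 0}
    # Every reduction removed exactly three items from the list.
    assert sum(reduced.values()) == len(nums) - 3 * reductions
    return reduced


if __name__ == "__main__":
    print(reduce_online([2, 3, 1, 2, 1, 3, 1, 5, 5, 1, 6]))
```

On the sample input from the question, three reductions happen and the reduced list is one copy of 5 and one copy of 1. The final `nums.count` check then keeps only 1.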

The interesting part of the correctness proof is a lemma that, whenever we perform this reduction operation, any element that occurred more than n/3 times in the old list occurs more than n'/3 times in the new list, where n is the length of the old list and n' = n-3 is the length of the new list. This ensures by induction that the final list contains all elements occurring more than n/3 times in the initial list (but of course the final list contains only two distinct elements).
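The lemma can also be checked empirically. The sketch below applies one random reduction to small random lists and confirms that every element above the n/3 threshold stays above the n'/3 threshold. It is a sanity check rather than a proof, and `check_lemma` with its parameters is a hypothetical helper named here for illustration.

```python
import random
from collections import Counter


def check_lemma(trials=1000, seed=0):
    """Empirically check the reduction lemma on random lists (not a proof)."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(3, 12)
        old = [rng.randint(0, 3) for _ in range(n)]
        counts = Counter(old)
        if len(counts) < 3:
            continue  # fewer than three distinct values: no reduction possible
        # Delete one copy each of three pairwise distinct values.
        new = list(old)
        for value in rng.sample(sorted(counts), 3):
            new.remove(value)
        # Compare 3*k with the length to avoid floating-point division.
        frequent_old = {v for v, k in counts.items() if 3 * k > n}
        frequent_new = {v for v, k in Counter(new).items() if 3 * k > len(new)}
        if not frequent_old <= frequent_new:
            return False
    return True


if __name__ == "__main__":
    print(check_lemma())
```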

The proof of the lemma is that, if an item occurs k times out of n in the old list, then at worst it occurs k-1 times out of n-3 in the new list, and if k/n > 1/3, then

              (k-1) n
(k-1)/(n-3) = -------
              (n-3) n

              k (n-3) + 3 k - n
            = -----------------
                   (n-3) n

                      (k/n - 1/3)
            = k/n + 3 -----------
                          n-3

            > k/n

            > 1/3,

where the first inequality holds because k/n > 1/3 makes the last term positive (for n > 3).
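As a check on the algebra, the following sketch verifies the identity and the resulting bound with exact rational arithmetic over a range of small k and n. The function names `identity_holds` and `bound_holds` are chosen here for illustration, and n > 3 is assumed so that the denominators are nonzero.

```python
from fractions import Fraction


def identity_holds(k, n):
    """Check (k-1)/(n-3) == k/n + 3*(k/n - 1/3)/(n-3) exactly, for n > 3."""
    lhs = Fraction(k - 1, n - 3)
    rhs = Fraction(k, n) + 3 * (Fraction(k, n) - Fraction(1, 3)) / (n - 3)
    return lhs == rhs


def bound_holds(k, n):
    """If k/n > 1/3 (and n > 3), check that (k-1)/(n-3) > k/n > 1/3."""
    if 3 * k <= n:
        return True  # hypothesis not satisfied, nothing to check
    return Fraction(k - 1, n - 3) > Fraction(k, n) > Fraction(1, 3)


if __name__ == "__main__":
    print(all(identity_holds(k, n) and bound_holds(k, n)
              for n in range(4, 50) for k in range(1, n + 1)))
```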


 