How can I speed up my algorithm for the majority element problem?

So, I have to write an algorithm for my Coursera assignment in Data Structures. I have used Java for the following problem set.

Problem: Consider a sequence of numbers in an array of, say, 5 elements. Number of elements: 5. Array elements: 2, 2, 3, 9, 2.

  • An element that appears more than n/2 times in an array of n elements is the majority element of that array. My program should output 1 if a majority element is found and 0 if no majority element is found.

In the example above, 2 appears 3 times, which is more than n/2 (5/2 = 2 with integer division, so at least 2 + 1 = 3 occurrences are needed).

I was asked to write an algorithm that solves this problem. One option was divide and conquer (i.e. splitting the array into two halves, looking for a majority element in each half, and then combining the answers). The other option was to scan the array with two nested for loops to find the majority element, which is what I tried. I ran it through the grader, but my program exceeds the time limit. Can anyone suggest how to speed it up? Thank you!

Java code:

import java.util.*;

import java.io.*;

public class MajorityElement {
    private static int getMajorityElement(int[] a, int left, int right) {

        int  count = 1;

        int num = a.length/2 + 1;
        Arrays.sort(a);
        if (left == right) {
            return -1;
        }
        if (left + 1 == right) {
            return a[left];
        }

        else
        {



            for(int i=0;i<a.length;i++)
            {
                for(int j=i+1;j<a.length;j++)
                {
                    if(a[i]==a[j])
                    {
                        count++;

                    }
                }

                if(count>1)
                {

                    if(count>=num)
                    {
                        return 1;

                    }
                    i = i + count-1;
                    count = 1;
                }

            }
            return -1;
        }
    }

    public static void main(String[] args) {
        FastScanner scanner = new FastScanner(System.in);
        int n = scanner.nextInt();
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = scanner.nextInt();
        }
        if (getMajorityElement(a, 0, a.length) != -1) {
            System.out.println(1);
        } else {
            System.out.println(0);
        }

    }
    static class FastScanner {
        BufferedReader br;
        StringTokenizer st;

        FastScanner(InputStream stream) {
            try {
                br = new BufferedReader(new InputStreamReader(stream));
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        String next() {
            while (st == null || !st.hasMoreTokens()) {
                try {
                    st = new StringTokenizer(br.readLine());
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            return st.nextToken();
        }

        int nextInt() {
            return Integer.parseInt(next());
        }
    }
}

The two-for-loop approach is pretty much just this:

for (int x: a) {
    int count = 0;
    for (int y: a) {
        if (x == y) {
            count++;
            if (count > a.length/2) {
                return true;
            }
         }
    }
}
return false;

That will certainly take too long in cases where there is no majority element, since it will require n^2 comparisons, where n is the number of elements in the list. Don't do that. You can sort first, like a commenter on your question said, which will allow you to break out a little early, but you still have the overhead of sorting, followed by some scanning. That would look something like (NOT TESTED, since it's for you to write):

Arrays.sort(a); // actually I hate this because it mutates your array (BAD!)
for (int i = 0; i < a.length; i++) {
    int count = 0;
    for (int j = i; j < a.length; j++) {
        if (a[j] == a[i]) {
            count++;
            if (count > a.length / 2) {
                return true;
            }
        } else if (a[j] > a[i]) {
            break; // sorted, so there are no more copies of a[i] to count
        }
    }
}
return false;
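
A variation on the sorted approach, as a minimal sketch rather than a tested submission: once the array is sorted, any element occurring more than n/2 times must occupy the middle index, so a single counting pass over that one candidate is enough. The class and method names here (SortedMajority, hasMajority) are made up for illustration, and the sketch sorts a copy so the caller's array is not mutated.

```java
import java.util.Arrays;

class SortedMajority {
    // Returns true if some value occurs more than a.length / 2 times.
    static boolean hasMajority(int[] a) {
        if (a.length == 0) {
            return false;
        }
        // Sort a copy so the caller's array is left untouched.
        int[] sorted = Arrays.copyOf(a, a.length);
        Arrays.sort(sorted);

        // A run longer than n/2 in a sorted array must cover index n/2.
        int candidate = sorted[sorted.length / 2];

        // One pass to count how often the candidate really occurs.
        int count = 0;
        for (int x : sorted) {
            if (x == candidate) {
                count++;
            }
        }
        return count > sorted.length / 2;
    }
}
```

The cost is dominated by the sort, so this is O(n log n) time plus O(n) extra space for the copy.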

You might instead want to go with the divide and conquer approach (n log n). There are also O(n) algorithms, including the Boyer-Moore majority vote algorithm (by Robert S. Boyer and J Strother Moore), which goes like this:

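int count = 0;
int candidate = 0; // overwritten on the first iteration, when count == 0
for (int x : a) {
    if (count == 0) {
        candidate = x;
    }
    if (x == candidate) {
        count += 1;
    } else {
        count -= 1;
    }
}
// Second pass: confirm that the candidate really is a majority element.
count = 0;
for (int x : a) if (x == candidate) count++;
return count > a.length / 2;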

Treat the above as pseudo code, as it is not tested.
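
For comparison, the divide and conquer approach mentioned above could be sketched roughly like this. It relies on the fact that a majority element of the whole range must also be a majority element of at least one half, so only the two candidates returned by the halves need to be counted. This is an untested sketch, not the course's reference solution; the names DivideAndConquerMajority, majorityIndex and count are made up for illustration. It returns an index rather than a value so that -1 can safely mean "no majority" even if -1 appears in the input.

```java
class DivideAndConquerMajority {
    // Returns the index of a majority element of a[left, right), or -1 if none.
    static int majorityIndex(int[] a, int left, int right) {
        if (left >= right) {
            return -1;              // empty range
        }
        if (left + 1 == right) {
            return left;            // a single element is its own majority
        }
        int mid = left + (right - left) / 2;
        int l = majorityIndex(a, left, mid);
        int r = majorityIndex(a, mid, right);
        int half = (right - left) / 2;

        // Only the candidates from the two halves can be a majority overall.
        if (l != -1 && count(a, a[l], left, right) > half) {
            return l;
        }
        if (r != -1 && count(a, a[r], left, right) > half) {
            return r;
        }
        return -1;
    }

    // Counts occurrences of value in a[left, right).
    static int count(int[] a, int value, int left, int right) {
        int c = 0;
        for (int i = left; i < right; i++) {
            if (a[i] == value) {
                c++;
            }
        }
        return c;
    }

    static boolean hasMajority(int[] a) {
        return majorityIndex(a, 0, a.length) != -1;
    }
}
```

Each level of recursion does O(n) counting work across O(log n) levels, which gives the n log n bound.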

More information on majority element here but it's all in Python so it might not help.

Neither of your options sounds good to me. The problem you describe is a standard problem in streaming algorithms, where you have a huge (potentially infinite) stream of data and have to compute some statistic from it in a single pass.

It can be solved using the Boyer–Moore majority vote algorithm. I do not know Java, but here is my explanation and a few lines of Python code, which you can surely convert to Java.


The majority element is the element that occurs in more than half of the array's positions. This means that the majority element occurs more often than all other elements combined: if you count the number of times the majority element appears and subtract the number of all other elements, you get a positive number.

So if you count the occurrences of some element, subtract the number of all other elements, and get 0, then that element can't be a majority element. This is the basis for a correct algorithm:

Keep two variables, a counter and a possible element. Iterate over the stream: if the counter is 0, overwrite the possible element with the current number and set the counter to 1; if the current number equals the possible element, increase the counter; otherwise decrease it. Python code:

def majority_element(arr):
    counter, possible_element = 0, None
    for i in arr:
        if counter == 0:
            possible_element, counter = i, 1
        elif i == possible_element:
            counter += 1
        else:
            counter -= 1

    return possible_element

It is clear to see that the algorithm is O(n) with a very small constant factor (like 3). The space complexity also looks like O(1), because only three variables are used. The catch is that one of these variables is a counter, which can grow up to n (when the array consists of identical numbers), and storing the number n takes O(log n) bits. So from a theoretical point of view it is O(n) time and O(log n) space. In practice, a 64-bit integer can count up to 2^63 - 1 (about 9.2 × 10^18) elements, which is far more than any array you will actually process.

Also note that the algorithm works only if there is a majority element. If no such element exists, it will still return some number, which will be wrong. (It is easy to modify the algorithm to tell whether a majority element exists.)
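
To make the streaming view concrete in Java, here is a rough sketch that keeps only the counter and the possible element as state and consumes numbers one at a time. The class name StreamingMajorityVote and its methods accept and candidate are invented for this illustration. Like the Python function above, it only yields a candidate; confirming that the candidate is a true majority needs a second pass over the data, which a pure one-pass stream does not allow.

```java
import java.util.OptionalInt;

class StreamingMajorityVote {
    private long counter = 0;        // long, so it cannot overflow for any realistic stream
    private int possibleElement = 0; // meaningful only once seenAny is true
    private boolean seenAny = false;

    // Feed the next number of the stream into the vote.
    void accept(int x) {
        seenAny = true;
        if (counter == 0) {
            possibleElement = x;
            counter = 1;
        } else if (x == possibleElement) {
            counter++;
        } else {
            counter--;
        }
    }

    // The only value that can be the majority element of what was seen so far;
    // empty if nothing has been fed in yet. It is NOT verified to be a majority.
    OptionalInt candidate() {
        return seenAny ? OptionalInt.of(possibleElement) : OptionalInt.empty();
    }
}
```

If the data is available as an array, as in the original assignment, calling accept for every element and then counting the candidate in one more loop gives the 1/0 answer the grader expects.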
