
My Java Sieve code is slow and not scaling at the expected time complexity

I have written the following 'segmented sieve' program in Java. It takes a range of numbers to sieve, crosses out composite numbers using the 'sieving primes' (the primes ArrayList parameter), then returns the prime numbers that have not been crossed out. Here is the code:

public ArrayList<Integer> sieveWorker(int start, int last, ArrayList<Integer> primes) {

    System.out.println("Thread started for range: " + start + "-" + last);
    ArrayList<Integer> nonPrimes = new ArrayList<Integer>();
    ArrayList<Integer> primeNumbers = new ArrayList<Integer>();
    ArrayList<Integer> numbers = new ArrayList<Integer>();

    //numbers to be sieved
    for (int i = start; i <= last; i += 2) {
        numbers.add(i);
    }

    //identifies composites of the sieving primes, then stores them in an arraylist
    for (int i = 0; i < primes.size(); i++) {

        int head = primes.get(i);

        if ((head * head) <= last) {
            if ((head * head) >= start) {
                for (int j = head * head; j <= last; j += head * 2) {
                    nonPrimes.add(j);
                }
            } else {
                int k = Math.round((start - head * head) / (2 * head));
                for (int j = (head * head) + (2 * k * head); j <= last; j += head * 2) {
                    nonPrimes.add(j);
                }
            }
        }

    }

    numbers.removeAll(nonPrimes);
    System.out.println("Primes: " + numbers);
    return numbers;
}

My problem is that it is very slow and performs at a time complexity of O(n^3) instead of the expected time complexity of O(n log log n). I need suggestions for optimising it and correcting its time complexity.

The culprit is the numbers.removeAll(nonPrimes) call. For each number in numbers (there are O(n) of them) it potentially searches through all of nonPrimes (there are O(n log log last) of them) to check membership, and nonPrimes is not sorted either. Here n is the size of the sieved range, n = last - start; numbers holds about half that many entries because only odd values are stored.

So instead of an O(1) marking per non-prime, you pay for an O(n log log last) membership search for each of the O(n) numbers. That is on the order of O(n^2) operations overall, rather than the near-linear cost you expected.

One way to overcome this is to use simple arrays and mark the non-primes in place. Removal destroys the direct addressing capability. If you use removal at all, the operations must be on-line, with close to O(1) work per number. This can be achieved by making the non-primes a sorted list and then, to remove them from numbers, iterating along both lists in a linear fashion. Both tasks are easiest done with arrays, again.
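The second idea, removing the composites in a single linear pass over two sorted sequences, might look like the sketch below. The names SortedRemovalSketch and removeSorted are made up for illustration. It assumes both arrays are sorted in ascending order; since the question's code collects composites prime by prime, they have to be sorted once first.

```java
import java.util.Arrays;

class SortedRemovalSketch {

    // Returns the elements of candidates that do not occur in composites.
    // Assumes both arrays are sorted ascending (hypothetical helper, not from the question).
    static int[] removeSorted(int[] candidates, int[] composites) {
        int[] kept = new int[candidates.length];
        int count = 0;
        int c = 0; // cursor into composites, only ever moves forward
        for (int value : candidates) {
            // Skip composites smaller than the current candidate (this also skips duplicates).
            while (c < composites.length && composites[c] < value) {
                c++;
            }
            // Keep the candidate only if it is not the composite under the cursor.
            if (c == composites.length || composites[c] != value) {
                kept[count++] = value;
            }
        }
        return Arrays.copyOf(kept, count);
    }

    public static void main(String[] args) {
        int[] candidates = {11, 13, 15, 17, 19, 21, 23, 25, 27, 29};
        int[] composites = {25, 15, 21, 27, 15}; // unsorted, with a duplicate
        Arrays.sort(composites);                 // sort once, then one linear pass
        System.out.println(Arrays.toString(removeSorted(candidates, composites)));
        // prints [11, 13, 17, 19, 23, 29]
    }
}
```

Each cursor only moves forward, so the pass costs O(|candidates| + |composites|) after the one-time sort.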

Explanation

numbers.removeAll(nonPrimes);

must find elements. That is basically a contains check, and contains on an ArrayList is slow: O(n).

For every element of numbers, it checks whether nonPrimes contains that element, scanning nonPrimes from left to right, and removes the elements that match. So you get a complexity of O(n * |nonPrimes|) just for the removeAll part.
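To make that cost visible, here is a rough, illustrative equivalent of what removeAll does when both collections are ArrayLists. It is not the JDK source, and the names RemoveAllCostSketch and removeAllCounting are hypothetical. It counts element comparisons as it goes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveAllCostSketch {

    // Illustrative stand-in for numbers.removeAll(nonPrimes) on two ArrayLists.
    // Returns how many element comparisons were needed.
    static long removeAllCounting(List<Integer> numbers, List<Integer> nonPrimes) {
        long comparisons = 0;
        List<Integer> kept = new ArrayList<>();
        for (Integer n : numbers) {          // O(|numbers|) iterations
            boolean found = false;
            for (Integer c : nonPrimes) {    // linear scan, like ArrayList.contains
                comparisons++;
                if (c.equals(n)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                kept.add(n);
            }
        }
        numbers.clear();
        numbers.addAll(kept);
        return comparisons;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(3, 5, 7, 9, 11, 13, 15));
        long comparisons = removeAllCounting(numbers, Arrays.asList(9, 15));
        System.out.println(numbers + " after " + comparisons + " comparisons");
    }
}
```

Every number that survives forces a full scan of nonPrimes, which is where the product of the two sizes comes from.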


Solution

There is an easy fix: exchange your data structure. Structures like HashSet were made for O(1) contains queries. Since you only need add and removeAll on numbers, consider using a HashSet instead, which runs both in O(1) (amortized) per element.

Only change in code:

Set<Integer> numbers = new HashSet<>();
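One caveat worth checking against the JDK documentation: HashSet inherits removeAll from AbstractSet, which iterates over the set itself and calls contains on the argument whenever the set is not the larger of the two. If nonPrimes stays an ArrayList, that path is again a linear scan per element. A variation that sidesteps this is to hash the lookup side instead. This is a sketch with hypothetical names (HashSetRemovalSketch, removeComposites), not the answer's exact change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HashSetRemovalSketch {

    // Removes every composite from numbers using expected O(1) lookups.
    // The result stays a list, so the ascending order of numbers is preserved.
    static List<Integer> removeComposites(List<Integer> numbers, List<Integer> nonPrimes) {
        Set<Integer> composites = new HashSet<>(nonPrimes); // one pass over nonPrimes
        List<Integer> result = new ArrayList<>(numbers);
        // ArrayList.removeAll asks composites.contains(...) once per element of result.
        result.removeAll(composites);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(11, 13, 15, 17, 19, 21);
        List<Integer> nonPrimes = Arrays.asList(15, 21, 15);
        System.out.println(removeComposites(numbers, nonPrimes)); // prints [11, 13, 17, 19]
    }
}
```

Keeping numbers as a list also keeps the printed primes in order, which a HashSet does not guarantee.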

Another possibility is to make some algorithmic changes. You can avoid the removeAll at the end by marking the elements while you collect them. The advantage is that you can then use arrays. The big advantage of that is that you avoid the boxed Integer class and operate directly on primitive ints, which are faster and consume less space. See the answer by @Will_Ness for details on this approach.
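A minimal sketch of the marking approach, using a primitive boolean array indexed by offset into the segment, could look like this. The names SegmentMarkingSketch and sieveSegment are hypothetical. It assumes start is odd and at least 3, and that sievingPrimes contains every prime up to sqrt(last), as in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SegmentMarkingSketch {

    // Returns the primes in [start, last]. Assumes start is odd and >= 3, and that
    // sievingPrimes holds all primes p with p * p <= last (assumptions of this sketch).
    static List<Integer> sieveSegment(int start, int last, List<Integer> sievingPrimes) {
        List<Integer> result = new ArrayList<>();
        if (last < start) {
            return result;
        }
        // composite[i] describes the odd number start + 2 * i
        boolean[] composite = new boolean[(last - start) / 2 + 1];

        for (int p : sievingPrimes) {
            long square = (long) p * p;
            if (p == 2 || square > last) {
                continue; // even numbers are not stored; larger primes cannot matter
            }
            long first = square;
            if (first < start) {
                first = ((start + p - 1L) / p) * p; // smallest multiple of p >= start
                if (first % 2 == 0) {
                    first += p;                     // step to the next odd multiple
                }
            }
            // O(1) marking per multiple: direct addressing instead of searching
            for (long j = first; j <= last; j += 2L * p) {
                composite[(int) ((j - start) / 2)] = true;
            }
        }

        for (int i = 0; i < composite.length; i++) {
            if (!composite[i]) {
                result.add(start + 2 * i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sieveSegment(101, 131, Arrays.asList(2, 3, 5, 7, 11)));
        // prints [101, 103, 107, 109, 113, 127, 131]
    }
}
```

The total work is the number of markings plus one pass over the array, with no searching and no boxed Integer values until the final result list is built.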


Note

Your primeNumbers variable is never used in your method. Consider removing it.
