
Making a Parallel IntStream more efficient/faster?

I've looked for a while for this answer, but couldn't find anything.

I'm trying to create an IntStream that can very quickly find primes (many, many primes, very fast -- millions in a few seconds).

I am currently using this parallel stream:

import java.util.stream.*;
import java.math.BigInteger;

public class Primes {
    public static IntStream stream() {
        return IntStream.iterate( 3, i -> i + 2 ).parallel()
                .filter( i -> i % 3 != 0 ).mapToObj( BigInteger::valueOf )
                .filter( i -> i.isProbablePrime( 1 ) == true )
                .flatMapToInt( i -> IntStream.of( i.intValue() ) );
    }
}

but it takes too long to generate numbers. (7546ms to generate 1,000,000 primes).

Is there any obvious way to make this more efficient/faster?

There are two general problems with your code when it comes to efficient parallel processing. First, it uses iterate, which unavoidably requires the previous element to calculate the next one; that is a poor starting point for parallel processing. Second, it uses an infinite stream. Efficient workload splitting requires at least an estimate of the number of elements to process.

Since you are processing ascending integer numbers, there is an obvious limit when reaching Integer.MAX_VALUE, but the stream implementation doesn't know that the numbers are ascending, so it treats your formally infinite stream as truly infinite.
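The difference between the two sources can be observed on their spliterators. The following is a minimal sketch; the class and helper names (SplitDemo, isSized, estimate) are made up for illustration, and it only inspects the unmodified source stream, not a full pipeline.

```java
import java.util.Spliterator;
import java.util.stream.IntStream;

public class SplitDemo {
    /** True if the stream's source reports an exact element count up front. */
    public static boolean isSized(IntStream s) {
        return s.spliterator().hasCharacteristics(Spliterator.SIZED);
    }

    /** The size estimate the source hands to the stream framework. */
    public static long estimate(IntStream s) {
        return s.spliterator().estimateSize();
    }

    public static void main(String[] args) {
        // iterate: unknown size, reported as Long.MAX_VALUE
        System.out.println(isSized(IntStream.iterate(3, i -> i + 2)));
        System.out.println(estimate(IntStream.iterate(3, i -> i + 2)) == Long.MAX_VALUE);
        // rangeClosed: exact size known, so the work can be split evenly
        System.out.println(isSized(IntStream.rangeClosed(1, 1_000)));
        System.out.println(estimate(IntStream.rangeClosed(1, 1_000)));
    }
}
```

A source that is SIZED can be cut into balanced halves right away, whereas an unsized source has to be buffered into batches before it can be handed to other threads.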

A solution that fixes these issues is

public static IntStream stream() {
    return IntStream.rangeClosed(1, Integer.MAX_VALUE/2).parallel()
            .map(i -> i*2+1)
            .filter(i -> i % 3 != 0).mapToObj(BigInteger::valueOf)
            .filter(i -> i.isProbablePrime(1))
            .mapToInt(BigInteger::intValue);
}

but it must be emphasized that, in this form, the solution is only useful if you truly want to process all or most of the prime numbers in the full integer range. As soon as you apply skip or limit to the stream, the parallel performance will drop significantly, as specified by the documentation of these methods. Likewise, using filter with a predicate that accepts only values in a smaller numeric range means a lot of unnecessary work, which is better avoided altogether than done in parallel.

To solve this, you could change the method to take a value range as parameters and restrict the source IntStream to that range.
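One way such a ranged variant might look is sketched below. The class name, the three-parameter signature and the certainty parameter are assumptions added for illustration, not part of the answer above. The sketch also filters multiples of 2, since the source range is no longer restricted to odd numbers.

```java
import java.math.BigInteger;
import java.util.stream.IntStream;

public class RangedPrimes {
    /**
     * Probable primes in [from, to], tested in parallel.
     * Hypothetical signature; a higher certainty lowers the chance
     * of a composite slipping through the probabilistic test.
     */
    public static IntStream stream(int from, int to, int certainty) {
        // Values below 2 are never prime, so start at 2 at the earliest.
        return IntStream.rangeClosed(Math.max(from, 2), to).parallel()
                // Cheap pre-filter: keep 2 and 3, drop other multiples of 2 or 3.
                .filter(i -> i < 4 || (i % 2 != 0 && i % 3 != 0))
                .filter(i -> BigInteger.valueOf(i).isProbablePrime(certainty));
    }

    public static void main(String[] args) {
        stream(1_000_000, 1_000_100, 50).forEach(System.out::println);
    }
}
```

Because the source is a bounded range, the stream knows its size up front and only does work for the numbers the caller actually asked for.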

This is the time to emphasize the importance of algorithms over parallel processing. Consider the Sieve of Eratosthenes. The following implementation

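import java.util.BitSet;
import java.util.stream.IntStream;

public static IntStream primes(int max) {
    // Bit k stands for the odd number 2k+1; start with every odd number from 3 up marked prime.
    BitSet prime = new BitSet(max>>1);
    prime.set(1, max>>1);
    // Sieving with i up to sqrt(max) suffices and keeps i*3 within the int range.
    for(int i = 3; (long)i*i<max; i += 2)
        if(prime.get((i-1)>>1))
            // Clear the odd multiples of i; b>0 stops the loop if b overflows.
            for(int b = i*3; b>0 && b<max; b += i*2) prime.clear((b-1)>>1);
    return IntStream.concat(IntStream.of(2), prime.stream().map(i -> i+i+1));
}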

turned out to be faster by an order of magnitude compared to the other approaches despite not using parallel processing, even when using Integer.MAX_VALUE as upper bound (measured using a terminal operation of .reduce((a,b) -> b) instead of toArray or forEach(System.out::println), to ensure complete processing of all values without adding storage or printing costs).
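The measurement described above could be reproduced roughly as follows. This is a sketch under assumptions: the class name SieveTiming, the early return for max <= 2, and the variant of the sieve that starts clearing at i*i with long arithmetic are my own choices, and a single System.nanoTime measurement is only a rough indication, not a proper benchmark.

```java
import java.util.BitSet;
import java.util.stream.IntStream;

public class SieveTiming {
    /** Primes below max; bit k of the set stands for the odd number 2k+1. */
    public static IntStream primes(int max) {
        if (max <= 2) return IntStream.empty();   // assumption: nothing below 2
        BitSet prime = new BitSet(max >> 1);
        prime.set(1, max >> 1);
        // Only i up to sqrt(max) needs sieving; long arithmetic avoids overflow.
        for (int i = 3; (long) i * i < max; i += 2)
            if (prime.get((i - 1) >> 1))
                for (long b = (long) i * i; b < max; b += 2L * i)
                    prime.clear((int) ((b - 1) >> 1));
        return IntStream.concat(IntStream.of(2), prime.stream().map(k -> k + k + 1));
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // reduce((a, b) -> b) visits every element but keeps only the last one,
        // so neither array storage nor printing skews the timing.
        int last = primes(10_000_000).reduce((a, b) -> b).orElse(-1);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println("largest prime found: " + last + " (" + ms + " ms)");
    }
}
```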

The takeaway is, isProbablePrime is great when you have a particular candidate or want to process a small range of numbers (or when the number is way outside the int or even long range)¹, but for processing a large ascending sequence of prime numbers there are better approaches, and parallel processing is not the ultimate answer to performance questions.


¹ Consider, e.g.,

Stream.iterate(new BigInteger("1000000000000"), BigInteger::nextProbablePrime)
      .filter(b -> b.isProbablePrime(1))
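To make the footnote's snippet runnable, it needs a bound and a terminal operation. The sketch below adds both; the class and method names (BigPrimes, firstFrom) are hypothetical. Note that the seed passed to Stream.iterate is itself the first element, so the filter is what drops it when it is not prime.

```java
import java.math.BigInteger;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class BigPrimes {
    /** The first n probable primes at or above start (start must not be negative). */
    public static List<BigInteger> firstFrom(BigInteger start, int n) {
        return Stream.iterate(start, BigInteger::nextProbablePrime)
                .filter(b -> b.isProbablePrime(1))   // drops a non-prime seed
                .limit(n)                            // bound the infinite stream
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        firstFrom(new BigInteger("1000000000000"), 5).forEach(System.out::println);
    }
}
```

This stays sequential on purpose: each element depends on the previous one, which is exactly the situation in which iterate fits and parallel splitting does not.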

It seems that I can do about 1/2 better than what you have in place with a few modifications (adding unordered() and replacing flatMapToInt with mapToInt):

return IntStream.iterate(3, i -> i + 2)
            .parallel()
            .unordered()
            .filter(i -> i % 3 != 0)
            .mapToObj(BigInteger::valueOf)
            .filter(i -> i.isProbablePrime(1))
            .mapToInt(BigInteger::intValue);
