
How can I best optimize a loop that will loop over a billion times?

Say I want to go through a loop a billion times; how could I optimize the loop to get my results faster?

As an example:

double randompoint;
long var = 0;  // counts how many random points fall at or below 0.75
for (long count = 0; count < 1000000000L; count++) {
    randompoint = (Math.random() * 1) + 0;  // generate a random point in [0, 1)
    if (randompoint <= .75) {
        var++;
    }
}

I was reading up on vectorization, but I'm not quite sure how to go about it. Any ideas?

Since Java is cross-platform, you pretty much have to rely on the JIT to vectorize. In your case it can't, since each iteration depends heavily on the previous one (due to how the RNG works).

However, there are two other major ways to improve your computation.

The first is that this work is very amenable to parallelization; the technical term is embarrassingly parallel. This means that multithreading can give a near-linear speedup with the number of cores.

The second is that Math.random() is written to be thread-safe, which also means that it's slow, because it has to use atomic operations. That thread safety isn't useful here, so we can skip the overhead by using a non-thread-safe RNG.
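Even in a single thread, just swapping Math.random() for ThreadLocalRandom skips those atomic operations; a minimal sketch of that swap (the class name and counter are mine):

import java.util.concurrent.ThreadLocalRandom;

class SingleThreadedFoo {
  public static void main(String[] args) {
    ThreadLocalRandom rand = ThreadLocalRandom.current();  // per-thread generator, no atomics
    long hits = 0;
    for (long i = 0; i < 1000000000L; i++) {
      if (rand.nextDouble() <= 0.75)  // same test as the original loop
        hits++;
    }
    System.out.println(hits);
  }
}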

I haven't written much Java since 1.5, but here's a dumb multithreaded implementation that combines both ideas:

import java.util.*;
import java.util.concurrent.*;

class Foo implements Runnable {
  private long count;
  private double threshold;
  private long result;

  public Foo(long count, double threshold) {
    this.count = count;
    this.threshold = threshold;
  }

  public void run() {
    // Each worker uses its own thread-local generator, so there is no cross-thread contention.
    ThreadLocalRandom rand = ThreadLocalRandom.current();
    for(long l=0; l<count; l++) {
      if(rand.nextDouble() < threshold)
        result++;
    }
  }

  public static void main(String[] args) throws Exception {
    long count = 1000000000;
    double threshold = 0.75;
    int cores = Runtime.getRuntime().availableProcessors();
    long sum = 0;

    List<Foo> list = new ArrayList<Foo>();
    List<Thread> threads = new ArrayList<Thread>();
    for(int i=0; i<cores; i++) {
      // TODO: account for count%cores!=0
      Foo t = new Foo(count/cores, threshold);
      list.add(t);
      Thread thread = new Thread(t);
      thread.start();
      threads.add(thread);
    }
    for(Thread t : threads) t.join();   // wait for all workers
    for(Foo f : list) sum += f.result;  // join() makes each worker's result visible here

    System.out.println(sum);
  }
}
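As an aside, if you're on Java 8 or newer, you could express the same split as a parallel stream and let the framework handle the thread management and the count%cores remainder; here's a rough sketch of that alternative (untimed):

import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.LongStream;

class ParallelStreamFoo {
  public static void main(String[] args) {
    long count = 1000000000L;
    double threshold = 0.75;

    // Each element draws one random double on whichever worker thread runs it;
    // ThreadLocalRandom.current() hands every thread its own generator.
    long sum = LongStream.range(0, count)
        .parallel()
        .filter(i -> ThreadLocalRandom.current().nextDouble() < threshold)
        .count();

    System.out.println(sum);
  }
}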

You can also optimize and inline the random generator to avoid going via doubles. Here it is with the linear congruential generator described in the java.util.Random docs:

  public void run() {
    long seed = new Random().nextLong();
    // Compare the 48-bit LCG state directly against a scaled threshold
    // instead of converting to a double on every iteration.
    long limit = (long) ((1L << 48) * threshold);

    for(long i=0; i<count; i++) {
      seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);  // java.util.Random's LCG step
      if (seed < limit) ++result;  // seed is always in [0, 2^48), so this is the same test
    }
  }

However, the best approach is to work smarter, not harder. As the number of trials increases, the binomial distribution of the count of successes tends towards a normal distribution with mean n·p and standard deviation √(n·p·(1−p)). This means that for your huge number of iterations, you can draw a single normally distributed number and scale it accordingly:

import java.util.Random;

class StayInSchool {
  public static void main(String[] args) {
    System.out.println(coinToss(1000000000, 0.75));
  }
  static long coinToss(long iterations, double threshold) {
    // Normal approximation to the binomial: mean n*p, standard deviation sqrt(n*p*(1-p)).
    double mean = threshold * iterations;
    double stdDev = Math.sqrt(threshold * (1-threshold) * iterations);

    double p = new Random().nextGaussian();  // standard normal draw: mean 0, stdDev 1
    return (long) (p*stdDev + mean);         // scale and shift to the target distribution
  }
}

Here are the timings on my 4-core system (including VM startup) for these approaches:

  • Your baseline: 20.9s
  • Single threaded ThreadLocalRandom: 6.51s
  • Single threaded optimized random: 1.75s
  • Multithreaded ThreadLocalRandom: 1.67s
  • Multithreaded optimized random: 0.89s
  • Generating a gaussian: 0.14s
