Why is my concurrent program slower than the sequential version?

Question

I was trying to analyze calling start and join together in the same loop,

    //Starting and Joining 
    for (Thread thread : threadArray) {
        thread.start();
        thread.join();
    }

compared with calling start on all threads first and then join on all of them.

    //Starting Them
    for (Thread thread : threadArray) {
        thread.start();
    }
    //Joining Them
    for (Thread thread : threadArray) {
        thread.join();
    }

What would be the performance difference between the two cases above?

In the first scenario, I am pretty much guaranteeing that the threads execute sequentially. So if I have n threads and each thread takes T_i to complete its task, my total execution time should be the sum of T_i for i from 1 to n.

In the second scenario, I start all the threads and then join them. This is the part that confuses me. Shouldn't the time be almost the same as above? What I am seeing on my machine is almost double.

The entire code sample I am using is given below.

    public class ThreadJoin implements Runnable {

        public void run() {
            for (int i = 0; i < 10000000; i++) {
                // Random mathematical stuff independent of i.
                int ran = (int) (Math.random() * 1000 - 34) % 47;
            }
        }

        public static void main(String[] args) throws Exception {
            Thread[] threadArray = new Thread[10];
            // Creating threads and feeding them with the job
            for (int i = 0; i < 10; i++) {
                threadArray[i] = new Thread(new ThreadJoin());
            }
            long currentTimeMillis = System.currentTimeMillis();
            System.out.println("Started at " + currentTimeMillis);
            // Starting Them
            for (Thread thread : threadArray) {
                thread.start();
            }
            // Joining Them
            for (Thread thread : threadArray) {
                thread.join();
            }
            long currentTimeMillis2 = System.currentTimeMillis();
            System.out.println("Ended at " + currentTimeMillis2);
            System.out.println("Diff : " + (currentTimeMillis2 - currentTimeMillis));
        }

    }

Answer 1

In theory, starting all the threads first and then joining them all should allow your 10 threads to execute concurrently (i.e. at the same time), while starting and joining each thread in one loop will make them run sequentially, one after another.
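To see that theoretical difference in isolation, here is a minimal sketch in which each thread only sleeps, so the threads share no lock and no data. The class name StartJoinTiming, the thread count of 4 and the 100 ms sleep are assumptions made for illustration and do not come from the question.

```java
// Hypothetical demo class: the name, thread count and sleep time are assumptions.
class StartJoinTiming {

    static final int THREADS = 4;
    static final long SLEEP_MILLIS = 100;

    // Each thread only sleeps, so the threads do not contend for anything.
    static Thread[] newSleepers() {
        Thread[] threads = new Thread[THREADS];
        for (int i = 0; i < THREADS; i++) {
            threads[i] = new Thread(() -> {
                try {
                    Thread.sleep(SLEEP_MILLIS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        return threads;
    }

    static void joinQuietly(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Start and join in the same loop: the next thread starts only after
    // the previous one has finished.
    static long timeOneLoopMillis() {
        Thread[] threads = newSleepers();
        long begin = System.nanoTime();
        for (Thread thread : threads) {
            thread.start();
            joinQuietly(thread);
        }
        return (System.nanoTime() - begin) / 1_000_000;
    }

    // Start all threads, then join all of them: the sleeps overlap.
    static long timeTwoLoopsMillis() {
        Thread[] threads = newSleepers();
        long begin = System.nanoTime();
        for (Thread thread : threads) {
            thread.start();
        }
        for (Thread thread : threads) {
            joinQuietly(thread);
        }
        return (System.nanoTime() - begin) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("one loop : " + timeOneLoopMillis() + " ms");
        System.out.println("two loops: " + timeTwoLoopsMillis() + " ms");
    }
}
```

With these assumed values, the one-loop variant should need roughly THREADS times SLEEP_MILLIS, while the two-loop variant should need roughly one SLEEP_MILLIS plus thread start-up overhead.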

So, in theory, the two-loop variant should be faster. Why is it actually slower (if I understand you correctly)?

You are using Math.random() quite heavily in your loop. In fact, I suppose most of the work occurs in this method. Math.random() is a synchronized method: all callers share its single underlying generator, so only one thread at a time can make progress in it, and the others have to wait until the previous one has finished.

So you can't really get faster than sequential execution here. It actually gets slower, since you have lots of context switches between your many threads, most of which then find that they can't continue because another thread already holds the lock.

To make your program faster, give each thread its own java.util.Random object and call its nextDouble() method instead. (You might want to make sure that they are initialized with different seeds, though.)
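A minimal sketch of the per-thread generator idea follows. It uses nextDouble() on java.util.Random, which returns a value in [0, 1) just like Math.random(). The class name PerThreadRandomJob, the seeds 42 + i and the checksum field are assumptions added for illustration; the checksum exists only so that the loop has an observable result.

```java
import java.util.Random;

// Hypothetical variant of the question's ThreadJoin: every worker owns its generator.
class PerThreadRandomJob implements Runnable {

    private final Random random;   // used by exactly one thread
    private final int iterations;
    private long checksum;         // written by the worker, read after join()

    PerThreadRandomJob(long seed, int iterations) {
        this.random = new Random(seed);
        this.iterations = iterations;
    }

    @Override
    public void run() {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            // Same arithmetic as in the question, but on the private generator.
            sum += (int) (random.nextDouble() * 1000 - 34) % 47;
        }
        checksum = sum;
    }

    // Starts all workers, then joins them, and returns one checksum per worker.
    static long[] runAll(int threadCount, int iterations) {
        PerThreadRandomJob[] jobs = new PerThreadRandomJob[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            // Assumed seeds: any values work as long as they differ per thread.
            jobs[i] = new PerThreadRandomJob(42L + i, iterations);
            threads[i] = new Thread(jobs[i]);
            threads[i].start();
        }
        long[] checksums = new long[threadCount];
        for (int i = 0; i < threadCount; i++) {
            try {
                // join() makes the worker's write to checksum visible here.
                threads[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            checksums[i] = jobs[i].checksum;
        }
        return checksums;
    }

    public static void main(String[] args) {
        long begin = System.currentTimeMillis();
        long[] checksums = runAll(10, 10_000_000);
        System.out.println("Diff : " + (System.currentTimeMillis() - begin));
        System.out.println("Checksum of first worker: " + checksums[0]);
    }
}
```

Because no generator is shared, the workers never wait for each other inside the loop. Fixed seeds also make each worker's result reproducible, which is convenient when comparing runs.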

As Tomek mentioned in a comment, from Java 7 on there is the ThreadLocalRandom class, which manages one Random object per thread and exposes the one for the current thread through its current() method. (I have never used it, so I can't comment on its performance compared with doing this manually.)
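For comparison, a sketch of the same job using ThreadLocalRandom might look like this. The class name ThreadLocalRandomJob and the sum field are illustrative assumptions; only ThreadLocalRandom.current() and nextDouble() are taken from the JDK.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical variant of the question's job that uses ThreadLocalRandom.
class ThreadLocalRandomJob implements Runnable {

    private final int iterations;
    private long sum;   // written by the worker, read after join()

    ThreadLocalRandomJob(int iterations) {
        this.iterations = iterations;
    }

    @Override
    public void run() {
        // Call current() inside the worker thread so that the generator
        // belongs to this thread; do not share it through a field.
        ThreadLocalRandom random = ThreadLocalRandom.current();
        long local = 0;
        for (int i = 0; i < iterations; i++) {
            local += (int) (random.nextDouble() * 1000 - 34) % 47;
        }
        sum = local;
    }

    // Starts all workers, joins them, and returns the total of their sums.
    static long runAll(int threadCount, int iterations) {
        ThreadLocalRandomJob[] jobs = new ThreadLocalRandomJob[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            jobs[i] = new ThreadLocalRandomJob(iterations);
            threads[i] = new Thread(jobs[i]);
            threads[i].start();
        }
        long total = 0;
        for (int i = 0; i < threadCount; i++) {
            try {
                threads[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            total += jobs[i].sum;
        }
        return total;
    }

    public static void main(String[] args) {
        long begin = System.currentTimeMillis();
        long total = runAll(10, 10_000_000);
        System.out.println("Diff : " + (System.currentTimeMillis() - begin));
        System.out.println("Total: " + total);
    }
}
```

Compared with hand-managed Random objects, this removes the need to pick seeds, at the cost of not being able to reproduce a run with a fixed seed.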

Answer 2

The second version (start all threads, then join all threads) allows the threads to run in parallel, so it should be faster (shorter total time) on multi-processor or multicore machines. But Math.random() is synchronized, which can actually make it slower.

From the documentation of Math.random():

"This method is properly synchronized to allow correct use by more than one thread. However, if many threads need to generate pseudorandom numbers at a great rate, it may reduce contention for each thread to have its own pseudorandom-number generator."

Answer 3

When you have a loop which doesn't do anything, the server JIT can detect this and eliminate it. The first time you call the loop, there is a small delay before it detects that the loop doesn't do anything, but the second time you call it, it will be much faster.
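One common way to guard a hand-written timing test against that effect is to make the loop's result observable and to run an untimed warm-up pass first. The sketch below shows that idea under stated assumptions: the class name ObservableLoop, the seeds and the use of a local java.util.Random are illustrative choices, and whether the JIT would actually remove the loop in the question is not something this example proves.

```java
import java.util.Random;

// Hypothetical timing helper: the names and seeds are assumptions.
class ObservableLoop {

    // Returning the sum gives the loop a result that the caller can observe,
    // so the work cannot simply be treated as unused.
    static long work(long seed, int iterations) {
        Random random = new Random(seed);
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += (int) (random.nextDouble() * 1000 - 34) % 47;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Untimed pass: gives the JIT a chance to compile work() first.
        long warmUp = work(1L, 10_000_000);

        long begin = System.nanoTime();
        long timed = work(2L, 10_000_000);
        long elapsedMillis = (System.nanoTime() - begin) / 1_000_000;

        // Printing both sums keeps them in use.
        System.out.println("warm-up sum: " + warmUp);
        System.out.println("timed sum  : " + timed);
        System.out.println("Diff : " + elapsedMillis);
    }
}
```

For anything beyond a quick check, a dedicated benchmarking harness is usually a safer choice than a hand-rolled loop like this.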

My advice is to use a thread pool if you care how long it takes to start and stop threads. It almost eliminates the need to do so, and you won't get faster than that.
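A minimal sketch of the thread pool suggestion, using the JDK's ExecutorService, could look like the following. The class name PooledJobs, the pool size (one thread per available processor) and the use of ThreadLocalRandom inside the tasks are assumptions made for illustration, not part of the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical pooled version of the question's workload.
class PooledJobs {

    // Runs the given number of jobs on a fixed-size pool and returns
    // the total of all per-job sums.
    static long runJobs(int jobs, int iterations) {
        // Assumed pool size: one worker thread per available processor.
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int j = 0; j < jobs; j++) {
                tasks.add(() -> {
                    ThreadLocalRandom random = ThreadLocalRandom.current();
                    long sum = 0;
                    for (int i = 0; i < iterations; i++) {
                        sum += (int) (random.nextDouble() * 1000 - 34) % 47;
                    }
                    return sum;
                });
            }
            long total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long begin = System.currentTimeMillis();
        long total = runJobs(10, 10_000_000);
        System.out.println("Diff : " + (System.currentTimeMillis() - begin));
        System.out.println("Total: " + total);
    }
}
```

In a longer-running program the pool would normally be created once and reused, which is where the saving in thread start-up cost comes from; this sketch creates and shuts it down per call only to stay self-contained.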

3 answers

Answer 1: 7, accepted, 2011-08-13 19:16:19

Answer 2: 5, 2011-08-13 19:13:07

Answer 3: 3, 2011-08-13 18:35:44
