
A proper benchmark?

I wanted to measure how long two different programs need to perform the same task. One program used threads, the other didn't. The task was to count up to 2000000.

Class with threads:

public class Main {
    private int res1 = 0;
    private int res2 = 0;

    public static void main(String[] args) {
        Main m = new Main();

        long startTime = System.nanoTime();
        m.func();
        long endTime = System.nanoTime();

        long duration = endTime - startTime;
        System.out.println("duration: " + duration);
    }

    public void func() {
        Thread t1 = new Thread(new Runnable() {

            @Override
            public void run() {
                for (int i = 0; i < 1000000; i++) {
                    res1++;
                }
            }
        });

        Thread t2 = new Thread(new Runnable() {

            @Override
            public void run() {
                for (int i = 1000000; i < 2000000; i++) {
                    res2++;
                }
            }
        });

        t1.start();
        t2.start();

        System.out.println(res1 + res2);
    }
}

Class without threads:

public class Main {

    private int res = 0;

    public static void main(String[] args) {
        Main m = new Main();

        long startTime = System.nanoTime();
        m.func();
        long endTime = System.nanoTime();

        long duration = endTime - startTime;
        System.out.println("duration: " + duration);

    }

    public void func() {

        for (int i = 0; i < 2000000; i++) {
            res++;
        }
        System.out.println(res);
    }
}

After 10 measurements, the average results (in nanoseconds) were:

With threads:    1952358
Without threads: 7941479

Am I doing it right?
How come it's 4x faster with 2 threads, and not only 2x?

In the lines

    t1.start();
    t2.start();

you are starting the threads, but you aren't actually waiting for them to finish before you take your time measurement. To wait until the threads are finished, call

   t1.join();
   t2.join();

The join method blocks until the thread has finished. Then measure the execution time.
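Put together, the threaded version from the question might look like the sketch below. The class name JoinedCount and the method name countWithTwoThreads are made up for illustration. Note that join() throws the checked InterruptedException, so the methods declare it.

```java
public class JoinedCount {
    private int res1 = 0;
    private int res2 = 0;

    // Counts to 2,000,000 with two threads and returns only after both have finished.
    public int countWithTwoThreads() throws InterruptedException {
        res1 = 0;
        res2 = 0;

        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                res1++;
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 1_000_000; i < 2_000_000; i++) {
                res2++;
            }
        });

        t1.start();
        t2.start();

        // Block until both loops are done; only then are res1 and res2 final.
        t1.join();
        t2.join();

        return res1 + res2;
    }

    public static void main(String[] args) throws InterruptedException {
        JoinedCount m = new JoinedCount();

        long startTime = System.nanoTime();
        int total = m.countWithTwoThreads();
        long endTime = System.nanoTime();

        System.out.println(total);
        System.out.println("duration: " + (endTime - startTime));
    }
}
```

Each thread writes only its own field, and join() ensures the main thread sees the finished values, so the printed sum is the full count rather than whatever the threads had reached so far.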

In the parallel version you are measuring how long the main thread takes to create and start the other two threads. You are not measuring their execution times. That is why you are getting a super-linear speedup. To include their execution times, you have to join them with the main thread.

Add these lines after t2.start();

     t1.join();  // wait until thread t1 terminates
     t2.join(); // wait until thread t2 terminates

The main reason the multi-threaded version is faster is that you don't wait for the loops to finish. You only wait for the threads to start.

After the start() calls you need to add

    t1.join();
    t2.join();

Once you do this, you will notice that starting the threads takes so long that the threaded version is quite a bit slower. If you make your test 100x longer, the cost of starting the threads is not so important.

The single-threaded example takes longer to be JIT-compiled properly. You need to make sure you run the test repeatedly, for at least 2 seconds.

My multi-threaded version is

public class Main {
    private long res1 = 0;
    public long p0, p1, p2, p3, p4, p5, p6, p7;
    private long res2 = 0;

    public static void main(String[] args) throws InterruptedException {
        Main m = new Main();

        for (int i = 0; i < 10; i++) {
            long startTime = System.nanoTime();
            m.func();
            long endTime = System.nanoTime();

            long duration = endTime - startTime;
            System.out.println("duration: " + duration);
        }
        assert m.p0 + m.p1 + m.p2 + m.p3 + m.p4 + m.p5 + m.p6 + m.p7 == 0;
    }

    public void func() throws InterruptedException {
        Thread t1 = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < 1000000000; i++) {
                    res1++;
                }
            }
        });

        Thread t2 = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int i = 1000000000; i < 2000000000; i++) {
                    res2++;
                }
            }
        });

        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(res1 + res2);
    }
}

It prints the following for the multi-threaded test.

2000000000
duration: 179014396
4000000000
duration: 148814805
.. deleted ..
18000000000
duration: 61767861
20000000000
duration: 72396259

For the single-threaded version I comment out one thread and get

2000000000
duration: 266228421
4000000000
duration: 255203050
... deleted ...
18000000000
duration: 125434383
20000000000
duration: 125230354

As expected, when run long enough, two threads are almost twice as fast as one.

In short,

  • Multi-threaded code can have smaller delays for the current thread if you don't wait for those operations to complete, e.g. asynchronous logging and messaging.

  • Single-threaded code can be much faster (and simpler) than multi-threaded code unless you have significant CPU-bound tasks to perform (or you can do concurrent IO).

  • Running the test repeatedly in the same JVM can give different results.
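The first point can be illustrated with a small sketch that separates the time the calling thread spends handing off work from the time until that work is actually done. The class name SubmitVersusComplete and its methods are hypothetical; ExecutorService, Callable and Future are the standard java.util.concurrent types.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitVersusComplete {

    // The workload: count up to n.
    public static long countTo(int n) {
        long res = 0;
        for (int i = 0; i < n; i++) {
            res++;
        }
        return res;
    }

    // Hands the work to another thread and reports both the hand-off time
    // and the time until the result is available.
    public static long runAsync(int n) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Long> task = () -> countTo(n);

            long start = System.nanoTime();
            Future<Long> pending = pool.submit(task);
            long afterSubmit = System.nanoTime();

            long result = pending.get(); // blocks until the task has finished
            long afterGet = System.nanoTime();

            System.out.println("time to submit:   " + (afterSubmit - start));
            System.out.println("time to complete: " + (afterGet - start));
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAsync(100_000_000));
    }
}
```

Timing only up to the submit call corresponds to what the original threaded benchmark measured; timing up to get() corresponds to the version with join().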

There are a couple of tricks you need to remember when benchmarking in Java.

The first is the same when benchmarking anything: one run may just happen to be slower than another, for no meaningful reason. To avoid this, run multiple times and take an average (and I mean lots of times).

The second may not be unique to Java, but might be surprising: Java VMs can take time to "warm up". If you run your code a hundred times, the compiled code can change according to which code paths are most common. To combat this, run the code many times before you start taking stats.

How long it takes to warm up depends on your JVM settings - I can't quite remember the details off the top of my head.
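A minimal sketch of both tricks, warm-up runs followed by an average over many timed runs, could look like this. The class WarmupBenchmark, its methods and the run counts of 1,000 are illustrative choices, not recommended values. For serious measurements a dedicated harness such as JMH is the usual choice.

```java
import java.util.function.LongSupplier;

public class WarmupBenchmark {

    // Written on every run so the result of the task is never unused.
    static volatile long sink;

    // Workload from the question: count up to n in a single thread.
    public static long countTo(int n) {
        long res = 0;
        for (int i = 0; i < n; i++) {
            res++;
        }
        return res;
    }

    // Runs the task warmupRuns times without timing it, then returns the
    // average duration in nanoseconds over measuredRuns timed runs.
    public static long averageNanos(LongSupplier task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            sink = task.getAsLong();
        }
        long total = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            sink = task.getAsLong();
            total += System.nanoTime() - start;
        }
        return total / measuredRuns;
    }

    public static void main(String[] args) {
        long avg = averageNanos(() -> countTo(2_000_000), 1_000, 1_000);
        System.out.println("average duration: " + avg);
    }
}
```

Even with warm-up, the JIT is free to simplify a loop this trivial, so the number says more about the compiler than about counting.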

This is, of course, quite apart from the problem the other answers have pointed out: you're not actually measuring the threaded program.

EDIT: Another thing to be careful of is the compiler realising that a particular variable, loop, or even the entire program is completely pointless. In these situations it's likely to just delete it completely. You might find that you need to use res1 and res2, or else your loops may be removed from the compiled code entirely.

EDIT: Just realized that you do actually use all of your counting variables. It's still a useful thing to know, though, so I'll leave it in.
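To make the point about unused results concrete, here is a small sketch contrasting a loop whose result is thrown away with one whose result is returned and printed. The class UseTheResult and its method names are hypothetical, and whether the JIT really removes the first loop depends on the JVM and its settings.

```java
public class UseTheResult {

    // The result is never read, so the JIT is allowed to drop the whole loop.
    public static void discardedLoop() {
        int res = 0;
        for (int i = 0; i < 2_000_000; i++) {
            res++;
        }
    }

    // The result is returned to the caller, so the value has to be produced
    // (although the JIT may still simplify how it gets there).
    public static int usedLoop() {
        int res = 0;
        for (int i = 0; i < 2_000_000; i++) {
            res++;
        }
        return res;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        discardedLoop();
        long mid = System.nanoTime();
        int res = usedLoop();
        long end = System.nanoTime();

        System.out.println("discarded: " + (mid - start));
        System.out.println("used:      " + (end - mid));
        System.out.println(res); // printing the value keeps it observable
    }
}
```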
