Java 8: performance of Streams vs Collections

Question

I'm new to Java 8. I still don't know the API in depth, but I've made a small informal benchmark to compare the performance of the new Streams API vs the good old Collections.

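The test consists of filtering a list of Integer and, for each even number, calculating the square root and storing it in a result List of Double.

Here is the code: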

    public static void main(String[] args) {
        //Calculating square root of even numbers from 1 to N       
        int min = 1;
        int max = 1000000;

        List<Integer> sourceList = new ArrayList<>();
        for (int i = min; i < max; i++) {
            sourceList.add(i);
        }

        List<Double> result = new LinkedList<>();


        //Collections approach
        long t0 = System.nanoTime();
        long elapsed = 0;
        for (Integer i : sourceList) {
            if(i % 2 == 0){
                result.add(Math.sqrt(i));
            }
        }
        elapsed = System.nanoTime() - t0;       
        System.out.printf("Collections: Elapsed time:\t %d ns \t(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));


        //Stream approach
        Stream<Integer> stream = sourceList.stream();       
        t0 = System.nanoTime();
        result = stream.filter(i -> i%2 == 0).map(i -> Math.sqrt(i)).collect(Collectors.toList());
        elapsed = System.nanoTime() - t0;       
        System.out.printf("Streams: Elapsed time:\t\t %d ns \t(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));


        //Parallel stream approach
        stream = sourceList.stream().parallel();        
        t0 = System.nanoTime();
        result = stream.filter(i -> i%2 == 0).map(i -> Math.sqrt(i)).collect(Collectors.toList());
        elapsed = System.nanoTime() - t0;       
        System.out.printf("Parallel streams: Elapsed time:\t %d ns \t(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));      
    }

And here are the results on a dual-core machine:

    Collections: Elapsed time:        94338247 ns   (0,094338 seconds)
    Streams: Elapsed time:           201112924 ns   (0,201113 seconds)
    Parallel streams: Elapsed time:  357243629 ns   (0,357244 seconds)

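For this particular test, streams are about twice as slow as collections, and parallelism doesn't help (or am I using it the wrong way?).

Questions:

Is this test fair? Have I made any mistake?
Are streams slower than collections? Has anyone made a good formal benchmark on this?
Which approach should I strive for?

Updated results.

I ran the test 1k times after JVM warmup (1k iterations), as advised by @pveentjer: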

    Collections: Average time:      206884437,000000 ns     (0,206884 seconds)
    Streams: Average time:           98366725,000000 ns     (0,098367 seconds)
    Parallel streams: Average time: 167703705,000000 ns     (0,167704 seconds)

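In this case streams are more performant. I wonder what would be observed in an app where the filtering function is only called once or twice during runtime.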
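
The warm-up-then-measure procedure described above can be sketched roughly as follows. This is not the code that produced the numbers above; the class name `WarmupSketch`, the helper `averageNanos`, the `sink` field and the iteration counts are all assumptions made for illustration, and a hand-rolled harness like this is still much less reliable than a dedicated benchmarking tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class WarmupSketch {
    // Written by every task so the computed lists are not obviously dead code.
    static volatile int sink;

    // Runs the task `warmup` times unmeasured, then returns the mean
    // wall-clock time in nanoseconds over `runs` measured calls.
    static double averageNanos(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            task.run();
            total += System.nanoTime() - t0;
        }
        return (double) total / runs;
    }

    // Builds the boxed source list [min, max).
    static List<Integer> range(int min, int max) {
        List<Integer> list = new ArrayList<>(Math.max(0, max - min));
        for (int i = min; i < max; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> source = range(1, 100_000);

        double loopNs = averageNanos(() -> {
            List<Double> result = new ArrayList<>();
            for (Integer i : source) {
                if (i % 2 == 0) {
                    result.add(Math.sqrt(i));
                }
            }
            sink = result.size();
        }, 200, 200);

        double streamNs = averageNanos(() -> sink = source.stream()
                .filter(i -> i % 2 == 0)
                .map(i -> Math.sqrt(i))
                .collect(Collectors.toList())
                .size(), 200, 200);

        System.out.printf("Loop:   average %.0f ns%n", loopNs);
        System.out.printf("Stream: average %.0f ns%n", streamNs);
    }
}
```

Each variant gets its own warm-up phase here, so neither one is measured while the JIT is still compiling it.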

Answer 1

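Stop using LinkedList for anything but heavy removing from the middle of the list using an iterator.
Stop writing benchmarking code by hand, use JMH.

Proper benchmarks: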

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(StreamVsVanilla.N)
public class StreamVsVanilla {
    public static final int N = 10000;

    static List<Integer> sourceList = new ArrayList<>();
    static {
        for (int i = 0; i < N; i++) {
            sourceList.add(i);
        }
    }

    @Benchmark
    public List<Double> vanilla() {
        List<Double> result = new ArrayList<>(sourceList.size() / 2 + 1);
        for (Integer i : sourceList) {
            if (i % 2 == 0){
                result.add(Math.sqrt(i));
            }
        }
        return result;
    }

    @Benchmark
    public List<Double> stream() {
        return sourceList.stream()
                .filter(i -> i % 2 == 0)
                .map(Math::sqrt)
                .collect(Collectors.toCollection(
                    () -> new ArrayList<>(sourceList.size() / 2 + 1)));
    }
}

Results:

Benchmark                   Mode   Samples         Mean   Mean error    Units
StreamVsVanilla.stream      avgt        10       17.588        0.230    ns/op
StreamVsVanilla.vanilla     avgt        10       10.796        0.063    ns/op

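As I expected, the stream implementation is noticeably slower. The JIT is able to inline all the lambda stuff, but it doesn't produce code as perfectly concise as the vanilla version.

Generally, Java 8 streams are not magic. They can't speed up things that are already well implemented (with, probably, plain iterations or Java 5's for-each statements replaced by Iterable.forEach() and Collection.removeIf() calls). Streams are more about coding convenience and safety. There is a convenience-versus-speed tradeoff at work here.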
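
To make the parenthetical concrete, here is a small sketch of the kind of replacement meant: an explicit iterator loop next to its Collection.removeIf() equivalent, and Iterable.forEach() standing in for a for-each statement. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForEachRemoveIfSketch {
    // Pre-Java 8 style: explicit iterator that drops odd numbers from a copy.
    static List<Integer> keepEvenWithIterator(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        for (Iterator<Integer> it = copy.iterator(); it.hasNext(); ) {
            if (it.next() % 2 != 0) {
                it.remove();
            }
        }
        return copy;
    }

    // Java 8 style: the same filtering through Collection.removeIf().
    static List<Integer> keepEvenWithRemoveIf(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        copy.removeIf(i -> i % 2 != 0);
        return copy;
    }

    // Java 8 style: Iterable.forEach() instead of a for-each statement.
    static List<Double> roots(List<Integer> input) {
        List<Double> result = new ArrayList<>(input.size());
        input.forEach(i -> result.add(Math.sqrt(i)));
        return result;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(keepEvenWithIterator(numbers));
        System.out.println(keepEvenWithRemoveIf(numbers));
        System.out.println(roots(keepEvenWithRemoveIf(numbers)));
    }
}
```

Both filtering variants produce the same list; the Java 8 calls are shorter to write, not faster.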

Answer 2

1) With your benchmark, you see times of less than 1 second. That means side effects can strongly influence your results. So I increased your task 10 times

    int max = 10_000_000;

and ran your benchmark. My results:

Collections: Elapsed time:   8592999350 ns  (8.592999 seconds)
Streams: Elapsed time:       2068208058 ns  (2.068208 seconds)
Parallel streams: Elapsed time:  7186967071 ns  (7.186967 seconds)

Without the edit (int max = 1_000_000) the results were

Collections: Elapsed time:   113373057 ns   (0.113373 seconds)
Streams: Elapsed time:       135570440 ns   (0.135570 seconds)
Parallel streams: Elapsed time:  104091980 ns   (0.104092 seconds)

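This is like your results: the stream is slower than the collection. Conclusion: much of the time was spent on stream initialization and passing values.

2) After increasing the task size, the stream became faster (that's OK), but the parallel stream remained too slow. What's wrong? Note: you have collect(Collectors.toList()) in your command. Collecting into a single collection essentially introduces a performance bottleneck and overhead in the case of concurrent execution. The relative cost of that overhead can be estimated by replacing

collecting to collection -> counting the element count

For streams this can be done with collect(Collectors.counting()). I got these results: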

Collections: Elapsed time:   41856183 ns    (0.041856 seconds)
Streams: Elapsed time:       546590322 ns   (0.546590 seconds)
Parallel streams: Elapsed time:  1540051478 ns  (1.540051 seconds)

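That is for the big task! (int max = 10000000) Conclusion: collecting the items into a collection took the majority of the time. The slowest part is adding to the list. BTW, a simple ArrayList is used for Collectors.toList().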
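
For reference, the substitution described above looks roughly like this. It is only a sketch of the two terminal operations side by side, not the exact code behind the timings; the class and method names are assumptions.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CountingVsCollectingSketch {
    // Original terminal step: every square root is stored in a result list.
    static List<Double> collectRoots(List<Integer> source) {
        return source.stream()
                .filter(i -> i % 2 == 0)
                .map(i -> Math.sqrt(i))
                .collect(Collectors.toList());
    }

    // Same filter and map stages, but the terminal step only counts elements,
    // so no result collection has to be built.
    static long countRoots(List<Integer> source) {
        return source.stream()
                .filter(i -> i % 2 == 0)
                .map(i -> Math.sqrt(i))
                .collect(Collectors.counting());
    }

    public static void main(String[] args) {
        List<Integer> source = IntStream.range(1, 1_000_000)
                .boxed()
                .collect(Collectors.toList());
        System.out.println("collected: " + collectRoots(source).size());
        System.out.println("counted:   " + countRoots(source));
    }
}
```

Timing the two methods against each other gives an estimate of how much of the total cost is the collection step rather than the filter and map stages.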

Answer 3

public static void main(String[] args) {
    //Calculating square root of even numbers from 1 to N       
    int min = 1;
    int max = 10000000;

    List<Integer> sourceList = new ArrayList<>();
    for (int i = min; i < max; i++) {
        sourceList.add(i);
    }

    List<Double> result = new LinkedList<>();


    //Collections approach
    long t0 = System.nanoTime();
    long elapsed = 0;
    for (Integer i : sourceList) {
        if(i % 2 == 0){
            result.add( doSomeCalculate(i));
        }
    }
    elapsed = System.nanoTime() - t0;       
    System.out.printf("Collections: Elapsed time:\t %d ns \t(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));


    //Stream approach
    Stream<Integer> stream = sourceList.stream();       
    t0 = System.nanoTime();
    result = stream.filter(i -> i%2 == 0).map(i -> doSomeCalculate(i))
            .collect(Collectors.toList());
    elapsed = System.nanoTime() - t0;       
    System.out.printf("Streams: Elapsed time:\t\t %d ns \t(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));


    //Parallel stream approach
    stream = sourceList.stream().parallel();        
    t0 = System.nanoTime();
    result = stream.filter(i -> i%2 == 0).map(i ->  doSomeCalculate(i))
            .collect(Collectors.toList());
    elapsed = System.nanoTime() - t0;       
    System.out.printf("Parallel streams: Elapsed time:\t %d ns \t(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));      
}

static double doSomeCalculate(int input) {
    for(int i=0; i<100000; i++){
        Math.sqrt(i+input);
    }
    return Math.sqrt(input);
}

I changed the code a bit and ran it on my MacBook Pro, which has 8 cores. I got a reasonable result:

Collections: Elapsed time:      1522036826 ns   (1.522037 seconds)
Streams: Elapsed time:          4315833719 ns   (4.315834 seconds)
Parallel streams: Elapsed time:  261152901 ns   (0.261153 seconds)

Answer 4

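For what you are trying to do, I would not use the regular Java APIs anyway. There is a ton of boxing/unboxing going on, so there is a huge performance overhead.

Personally, I think a lot of the API designs are crap because they create a lot of object litter.

Try using primitive arrays of double/int, do it single-threaded, and see what the performance is.

PS: You might want to have a look at JMH to take care of the benchmarking. It handles some of the typical pitfalls, like warming up the JVM.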
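
A minimal sketch of what the primitive version could look like, assuming the same task as in the question (square roots of the even numbers in [min, max)). The class and method names are made up, and the IntStream variant is an extra added for comparison; it is not something suggested above.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class PrimitiveSketch {
    // Single-threaded loop over primitives: no Integer or Double objects.
    static double[] evenRootsLoop(int min, int max) {
        // Upper bound on how many even numbers [min, max) can contain.
        double[] result = new double[Math.max(0, (max - min) / 2 + 1)];
        int n = 0;
        for (int i = min; i < max; i++) {
            if (i % 2 == 0) {
                result[n++] = Math.sqrt(i);
            }
        }
        // Trim the unused tail of the buffer.
        return Arrays.copyOf(result, n);
    }

    // Stream variant that stays on primitive streams from start to finish.
    static double[] evenRootsStream(int min, int max) {
        return IntStream.range(min, max)
                .filter(i -> i % 2 == 0)
                .mapToDouble(Math::sqrt)
                .toArray();
    }

    public static void main(String[] args) {
        double[] loop = evenRootsLoop(1, 1_000_000);
        double[] stream = evenRootsStream(1, 1_000_000);
        System.out.println(loop.length + " roots, same result: "
                + Arrays.equals(loop, stream));
    }
}
```

Neither method allocates a wrapper object per element, which is the overhead this answer is pointing at.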

Answer 5

Interesting results for Java 8 and Java 11. I used the code provided by leventov with minor modifications:

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(BenchmarkMain.N)
public class BenchmarkMain {

    public static final int N = 10000;

    static List<Integer> sourceList = new ArrayList<>();
    static {
        for (int i = 0; i < N; i++) {
            sourceList.add(i);
        }
    }

    @Benchmark
    public List<Double> vanilla() {
        List<Double> result = new ArrayList<>(sourceList.size() / 2 + 1);
        for (Integer i : sourceList) {
            if (i % 2 == 0){
                result.add(Math.sqrt(i));
            }
        }
        return result;
    }

    @Benchmark
    public List<Double> stream() {
        return sourceList.stream()
                .filter(i -> i % 2 == 0)
                .map(Math::sqrt)
                .collect(Collectors.toCollection(
                    () -> new ArrayList<>(sourceList.size() / 2 + 1)));
    }

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws IOException {
        org.openjdk.jmh.Main.main(args);

    }

}

Java 8:

# JMH version: 1.31
# VM version: JDK 1.8.0_262, OpenJDK 64-Bit Server VM, 25.262-b19
# VM invoker: /opt/jdk1.8.0_262/jre/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
...
Benchmark              Mode  Cnt   Score   Error  Units
BenchmarkMain.stream   avgt   25  10.680 ± 0.744  ns/op
BenchmarkMain.vanilla  avgt   25   6.490 ± 0.159  ns/op

Java 11:

# JMH version: 1.31
# VM version: JDK 11.0.2, OpenJDK 64-Bit Server VM, 11.0.2+9
# VM invoker: /opt/jdk-11.0.2/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
...
Benchmark              Mode  Cnt  Score   Error  Units
BenchmarkMain.stream   avgt   25  5.521 ± 0.057  ns/op
BenchmarkMain.vanilla  avgt   25  7.359 ± 0.118  ns/op

Answer 6

Using Java 17, my results:

Collections: Elapsed time:109585000 ns  (0.109585 seconds)
Streams: Elapsed time:42179700 ns   (0.042180 seconds)
Parallel streams: Elapsed time:76177100 ns  (0.076177 seconds)

Using List.of instead of LinkedList, the results changed:

Collections: Elapsed time:49681300 ns   (0.049681 seconds)
Streams: Elapsed time:38930300 ns   (0.038930 seconds)
Parallel streams: Elapsed time:49190500 ns  (0.049191 seconds)

6 answers

Answer 1  212  2014-03-26 18:48:06
Answer 2  18  2014-03-26 11:43:45
Answer 3  5  2015-11-19 19:31:12
Answer 4  4  2014-03-26 10:41:59
Answer 5  1  2021-06-01 09:07:22
Answer 6  0  2022-08-23 02:27:51
