How should I sum something with streams?

Question

I have seen and tried different implementations of how to sum something in a stream. Here is my code:

List<Person> persons = new ArrayList<Person>();

for(int i=0; i < 10000000; i++){
    persons.add(new Person("random", 26));
}

Long start = System.currentTimeMillis();
int test = persons.stream().collect(Collectors.summingInt(p -> p.getAge()));
Long end = System.currentTimeMillis();
System.out.println("Sum of ages = " + test + " and it took : " + (end - start) + " ms with collectors");

Long start3 = System.currentTimeMillis();
int test3 = persons.parallelStream().collect(Collectors.summingInt(p -> p.getAge()));
Long end3 = System.currentTimeMillis();
System.out.println("Sum of ages = " + test3 + " and it took : " + (end3 - start3) + " ms with collectors and parallel stream");


Long start2 = System.currentTimeMillis();
int test2 = persons.stream().mapToInt(p -> p.getAge()).sum();
Long end2 = System.currentTimeMillis();
System.out.println("Sum of ages = " + test2 + " and it took : " + (end2 - start2) + " ms with map and sum");

Long start4 = System.currentTimeMillis();
int test4 = persons.parallelStream().mapToInt(p -> p.getAge()).sum();
Long end4 = System.currentTimeMillis();
System.out.println("Sum of ages = " + test4 + " and it took : " + (end4 - start4) + " ms with map and sum and parallel stream");

This gave me the following result:

Sum of ages = 220000000 and it took : 110 ms with collectors
Sum of ages = 220000000 and it took : 272 ms with collectors and parallel stream
Sum of ages = 220000000 and it took : 137 ms with map and sum
Sum of ages = 220000000 and it took : 134 ms with map and sum and parallel stream

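I tried it a few times and it gave me different results each time (most of the time the last solution was the best), so I would like to know:

1) What is the correct way to do it?

2) Why? (What is the difference from the other solutions?)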

Answer 1

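Before we get to the actual answer, there are a few things you should know:

Your test results can vary quite a lot, depending on many factors (for example, the computer you run it on). Here are the results from a run on my 8-core machine: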

Sum of ages = 260000000 and it took : 94 ms with collectors
Sum of ages = 260000000 and it took : 61 ms with collectors and parallel stream
Sum of ages = 260000000 and it took : 70 ms with map and sum
Sum of ages = 260000000 and it took : 94 ms with map and sum and parallel stream

And then on a later run:

Sum of ages = 260000000 and it took : 68 ms with collectors
Sum of ages = 260000000 and it took : 67 ms with collectors and parallel stream
Sum of ages = 260000000 and it took : 66 ms with map and sum
Sum of ages = 260000000 and it took : 109 ms with map and sum and parallel stream

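Microbenchmarking is not a simple topic. There are ways to do it (I will go into one later), but in most cases simply using System.currentTimeMillis() will not work reliably.
Just because Java 8 makes parallel operations easy does not mean they should be used everywhere. Parallel operations make sense in some cases and not in others.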
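
To see the run-to-run variation for yourself, a rough sketch like the following repeats the same measurement several times inside one JVM. The nested `Person` class is a minimal stand-in (the question does not show it), and `timeSum` and `makePersons` are hypothetical helpers invented for this example. Early rounds usually include class loading and JIT compilation, which is one reason a single timed run is misleading. This is only an illustration, not a replacement for a benchmark harness.

```java
import java.util.ArrayList;
import java.util.List;

class RepeatedTiming {

    // Minimal stand-in for the Person class, which the question does not show.
    static class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Hypothetical helper: builds a list of identical persons, as in the question.
    static List<Person> makePersons(int count) {
        List<Person> persons = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            persons.add(new Person("random", 26));
        }
        return persons;
    }

    // Hypothetical helper: sums the ages once and returns {sum, elapsed nanoseconds}.
    static long[] timeSum(List<Person> persons) {
        long start = System.nanoTime();
        int sum = persons.stream().mapToInt(Person::getAge).sum();
        long elapsed = System.nanoTime() - start;
        return new long[] { sum, elapsed };
    }

    public static void main(String[] args) {
        List<Person> persons = makePersons(1_000_000);
        // Repeat the identical measurement; the timings usually differ between rounds.
        for (int round = 1; round <= 10; round++) {
            long[] result = timeSum(persons);
            System.out.println("round " + round + ": sum = " + result[0]
                    + ", took " + (result[1] / 1_000) + " us");
        }
    }
}
```

The sum is the same in every round; only the elapsed time moves around, and how much it moves depends on the machine and JVM.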

OK, now let's look at the various approaches you are using.

Sequential collector: the summingInt collector you use has the following implementation:
```
public static <T> Collector<T, ?, Integer> summingInt(ToIntFunction<? super T> mapper) {
    return new CollectorImpl<>(
            () -> new int[1],
            (a, t) -> { a[0] += mapper.applyAsInt(t); },
            (a, b) -> { a[0] += b[0]; return a; },
            a -> a[0],
            Collections.emptySet());
}
```
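So first, a new array with one element is created. Then, for every Person element in the stream, the collect function calls the mapper (here Person#getAge()) and adds that age to the running total held in the one-element array. Finally, when the whole stream has been processed, it extracts the value from that array and returns it as an Integer (not an int!). So boxing and unboxing are involved here: the result is boxed, and if getAge() returns an Integer rather than an int, every single age has to be unboxed as well.
Parallel collector: this uses the ReferencePipeline#forEach(Consumer) function to accumulate the ages it gets from the mapping function. Again, the same boxing and unboxing applies.
Sequential map and sum: here you map the Stream<Person> to an IntStream. One consequence is that no autoboxing or unboxing is needed any more, which can save a lot of time in some cases. The resulting stream is then summed using the following implementation: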
```
@Override
public final int sum() {
    return reduce(0, Integer::sum);
}
```
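The reduce function here will call ReduceOps#ReduceOp#evaluateSequential(PipelineHelper<T> helper, Spliterator<P_IN> spliterator). In essence, this applies Integer::sum to all the numbers: it starts with 0 and the first number, then combines that result with the second number, and so on.
Parallel map and sum: this is where things get interesting. It uses the same sum() function, but in this case reduce will call ReduceOps#ReduceOp#evaluateParallel(PipelineHelper<T> helper, Spliterator<P_IN> spliterator) instead of the sequential option. This basically uses a divide-and-conquer approach to add up the values. The big advantage of divide and conquer is of course that it can easily be done in parallel. However, it does require splitting and rejoining the stream many times, which costs time. How fast it is therefore varies a lot, depending on the complexity of the actual task performed on each element. In the case of addition, it is probably not worth it in most cases; as you can see from my results, it was always one of the slower approaches.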
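
The mechanics of the four approaches can be reproduced by hand. The sketch below is only an illustration and not the JDK's code: it uses the three-argument `Stream#collect` with a supplier, accumulator and combiner of the same shape as `summingInt`, spells out `sum()` as `reduce(0, Integer::sum)`, and shows the divide-and-conquer idea with a hand-written `RecursiveTask`. The nested `Person` class is a minimal stand-in with an `int`-returning `getAge()`, and all class names, method names and the `THRESHOLD` value are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.Stream;

class SumMechanics {

    // Minimal stand-in for the Person class, which the question does not show.
    static class Person {
        private final String name;
        private final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    // Same shape as summingInt: a one-element int[] is the mutable container.
    static int sumWithArrayContainer(Stream<Person> stream) {
        int[] total = stream.collect(
                () -> new int[1],                       // supplier: fresh container
                (acc, p) -> acc[0] += p.getAge(),       // accumulator: add one age
                (left, right) -> left[0] += right[0]);  // combiner: merge partial sums
        return total[0];
    }

    // IntStream#sum() written out as the reduction it delegates to.
    static int sumWithReduce(Stream<Person> stream) {
        return stream.mapToInt(Person::getAge).reduce(0, Integer::sum);
    }

    // Hand-rolled divide and conquer: split the range, sum the halves, add the results.
    static class SumTask extends RecursiveTask<Integer> {
        private static final int THRESHOLD = 10_000; // arbitrary cut-off for this sketch
        private final List<Person> persons;
        private final int from;
        private final int to;

        SumTask(List<Person> persons, int from, int to) {
            this.persons = persons;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Integer compute() {
            if (to - from <= THRESHOLD) {
                int sum = 0;
                for (int i = from; i < to; i++) {
                    sum += persons.get(i).getAge();
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(persons, from, mid);
            SumTask right = new SumTask(persons, mid, to);
            left.fork();                           // run the left half asynchronously
            return right.compute() + left.join();  // compute the right half here, then combine
        }
    }

    static List<Person> makePersons(int count) {
        List<Person> persons = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            persons.add(new Person("random", 26));
        }
        return persons;
    }

    public static void main(String[] args) {
        List<Person> persons = makePersons(100_000);
        System.out.println(sumWithArrayContainer(persons.stream()));
        System.out.println(sumWithArrayContainer(persons.parallelStream()));
        System.out.println(sumWithReduce(persons.stream()));
        System.out.println(sumWithReduce(persons.parallelStream()));
        System.out.println(ForkJoinPool.commonPool().invoke(new SumTask(persons, 0, persons.size())));
    }
}
```

All five calls produce the same total. The forking and joining in `SumTask` is the overhead that has to be paid back by the work done per element, and a plain addition offers very little work to pay it back with.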

Now, to get a real idea of how long this takes, let's do a proper microbenchmark. I will use JMH with the following benchmark code:

package com.stackoverflow.user2352924;

import org.openjdk.jmh.annotations.*;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MINUTES)
@Warmup(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 10, time = 10, timeUnit = TimeUnit.SECONDS)
@State(Scope.Benchmark)
@Fork(1)
@Threads(2)
public class MicroBenchmark {

    private static List<Person> persons = new ArrayList<>();

    private int test;

    static {
        for(int i=0; i < 10000000; i++){
            persons.add(new Person("random", 26));
        }
    }

    @Benchmark
    public void sequentialCollectors() {
        test = 0;
        test += persons.stream().collect(Collectors.summingInt(p -> p.getAge()));
    }

    @Benchmark
    public void parallelCollectors() {
        test = 0;
        test += persons.parallelStream().collect(Collectors.summingInt(p -> p.getAge()));
    }

    @Benchmark
    public void sequentialMapSum() {
        test = 0;
        test += persons.stream().mapToInt(p -> p.getAge()).sum();
    }

    @Benchmark
    public void parallelMapSum() {
        test = 0;
        test += persons.parallelStream().mapToInt(p -> p.getAge()).sum();
    }

}

The pom.xml for this Maven project looks like this:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.stackoverflow.user2352924</groupId>
    <artifactId>StackOverflow</artifactId>
    <version>1.0</version>
    <packaging>jar</packaging>

    <name>Auto-generated JMH benchmark</name>

    <prerequisites>
        <maven>3.0</maven>
    </prerequisites>

    <dependencies>
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-core</artifactId>
            <version>${jmh.version}</version>
        </dependency>
        <dependency>
            <groupId>org.openjdk.jmh</groupId>
            <artifactId>jmh-generator-annprocess</artifactId>
            <version>${jmh.version}</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <jmh.version>0.9.5</jmh.version>
        <javac.target>1.8</javac.target>
        <uberjar.name>benchmarks</uberjar.name>
    </properties>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <compilerVersion>${javac.target}</compilerVersion>
                    <source>${javac.target}</source>
                    <target>${javac.target}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.2</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <finalName>microbenchmarks</finalName>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>org.openjdk.jmh.Main</mainClass>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
        <pluginManagement>
            <plugins>
                <plugin>
                    <artifactId>maven-clean-plugin</artifactId>
                    <version>2.5</version>
                </plugin>
                <plugin>
                    <artifactId>maven-deploy-plugin</artifactId>
                    <version>2.8.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-install-plugin</artifactId>
                    <version>2.5.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-jar-plugin</artifactId>
                    <version>2.4</version>
                </plugin>
                <plugin>
                    <artifactId>maven-javadoc-plugin</artifactId>
                    <version>2.9.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-resources-plugin</artifactId>
                    <version>2.6</version>
                </plugin>
                <plugin>
                    <artifactId>maven-site-plugin</artifactId>
                    <version>3.3</version>
                </plugin>
                <plugin>
                    <artifactId>maven-source-plugin</artifactId>
                    <version>2.2.1</version>
                </plugin>
                <plugin>
                    <artifactId>maven-surefire-plugin</artifactId>
                    <version>2.17</version>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>

</project>

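Make sure Maven is also running on Java 8, otherwise you will get ugly errors.

I won't go into detail here on how to use JMH (there are other places for that), but here are the results I got: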

# Run complete. Total time: 00:08:48

Benchmark                                     Mode  Samples     Score  Score error    Units
c.s.u.MicroBenchmark.parallelCollectors      thrpt       10  3658,949      775,115  ops/min
c.s.u.MicroBenchmark.parallelMapSum          thrpt       10  2616,905      221,109  ops/min
c.s.u.MicroBenchmark.sequentialCollectors    thrpt       10  5502,160      439,024  ops/min
c.s.u.MicroBenchmark.sequentialMapSum        thrpt       10  6120,162      609,232  ops/min

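So, on the system where I ran these tests, the sequential map-and-sum was considerably faster, managing over 6100 operations per minute, while the parallel map-and-sum (which uses the divide-and-conquer approach) managed just over 2600. In fact, both sequential approaches were considerably faster than the parallel ones.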

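Now, in a case that lends itself better to parallel execution, for example one where the Person#getAge() function is much more complex than a plain getter, the parallel approaches might well be the better solution. In the end, it all depends on how efficiently the case being tested can be run in parallel.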
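
As a rough sketch of such a case, the example below replaces the getter with a deliberately expensive mapping function. `costlyAge` is a hypothetical stand-in for real per-element work (it just spins through an arithmetic loop), and the nested `Person` class is again a minimal assumption. The printed timings are only indicative; whether the parallel version actually wins depends on the machine and should be checked with a proper benchmark.

```java
import java.util.ArrayList;
import java.util.List;

class CostlyMapperDemo {

    // Minimal stand-in for the Person class, which the question does not show.
    static class Person {
        private final String name;
        private final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    // Hypothetical expensive computation derived from the age; result is in 0..127.
    static int costlyAge(Person p) {
        long x = p.getAge();
        for (int i = 0; i < 5_000; i++) {
            x = x * 6364136223846793005L + 1442695040888963407L; // wraps around on overflow
        }
        return (int) (x & 0x7F);
    }

    static int sumSequential(List<Person> persons) {
        return persons.stream().mapToInt(CostlyMapperDemo::costlyAge).sum();
    }

    static int sumParallel(List<Person> persons) {
        return persons.parallelStream().mapToInt(CostlyMapperDemo::costlyAge).sum();
    }

    static List<Person> makePersons(int count) {
        List<Person> persons = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            persons.add(new Person("random", i % 90));
        }
        return persons;
    }

    public static void main(String[] args) {
        List<Person> persons = makePersons(200_000);

        long start = System.nanoTime();
        int sequential = sumSequential(persons);
        long sequentialMs = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        int parallel = sumParallel(persons);
        long parallelMs = (System.nanoTime() - start) / 1_000_000;

        // Both sums are identical; only the elapsed time may differ.
        System.out.println("sequential: " + sequential + " in " + sequentialMs + " ms");
        System.out.println("parallel:   " + parallel + " in " + parallelMs + " ms");
    }
}
```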

Another thing to remember: when in doubt, do a proper microbenchmark. ;-)

Answer 1
11 Accepted 2014-08-14 22:35:36
