Why is `parallelStream` faster than the `CompletableFuture` implementation?

Question

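I want to improve the performance of one operation in my backend REST API. The operation polls several different external APIs one after another, collects their responses, and flattens them all into a single list of responses.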

Having only recently learned about `CompletableFuture`s, I decided to give them a try and compare that solution with one that simply changes my `stream` into a `parallelStream`.

Here is the code used for the benchmark:

    package com.alithya.platon;

    import java.util.Arrays;
    import java.util.List;
    import java.util.Objects;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.TimeUnit;
    import java.util.stream.Collectors;
    import org.junit.jupiter.api.AfterEach;
    import org.junit.jupiter.api.BeforeEach;
    import org.junit.jupiter.api.Test;


    public class ConcurrentTest {

        static final List<String> REST_APIS =
                Arrays.asList("api1", "api2", "api3", "api4", "api5", "api6", "api7", "api8");
        MyTestUtil myTest = new MyTestUtil();
        long millisBefore; // used to benchmark

        @BeforeEach
        void setUp() {
            millisBefore = System.currentTimeMillis();
        }

        @AfterEach
        void tearDown() {
            System.out.printf("time taken : %.4fs\n",
                    (System.currentTimeMillis() - millisBefore) / 1000d);
        }

        @Test
        void parallelSolution() { // 4s
            var parallel = REST_APIS.parallelStream()
                    .map(api -> myTest.collectOneRestCall())
                    .flatMap(List::stream)
                    .collect(Collectors.toList());

            System.out.println("List of responses: " + parallel.toString());
        }

        @Test
        void futureSolution() throws Exception { // 8s
            var futures = myTest.collectAllResponsesAsync(REST_APIS);

            System.out.println("List of responses: " + futures.get()); // only blocks here
        }

        @Test
        void originalProblem() { // 32s
            var sequential = REST_APIS.stream()
                    .map(api -> myTest.collectOneRestCall())
                    .flatMap(List::stream)
                    .collect(Collectors.toList());

            System.out.println("List of responses: " + sequential.toString());
        }
    }


    class MyTestUtil {

        public static final List<String> RESULTS = Arrays.asList("1", "2", "3", "4");

        List<String> collectOneRestCall() {
            try {
                TimeUnit.SECONDS.sleep(4); // simulating the await of the response
            } catch (Exception io) {
                throw new RuntimeException(io);
            } finally {
                return MyTestUtil.RESULTS; // always return something, for this demonstration
            }
        }

        CompletableFuture<List<String>> collectAllResponsesAsync(List<String> restApiUrlList) {

            /* Collecting the list of all the async requests that build a List<String>. */
            List<CompletableFuture<List<String>>> completableFutures = restApiUrlList.stream()
                    .map(api -> nonBlockingRestCall())
                    .collect(Collectors.toList());

            /* Creating a single Future that contains all the Futures we just created ("flatmap"). */
            CompletableFuture<Void> allFutures = CompletableFuture.allOf(completableFutures
                    .toArray(new CompletableFuture[restApiUrlList.size()]));

            /* When all the Futures have completed, we join them to create merged List<String>. */
            CompletableFuture<List<String>> allCompletableFutures = allFutures
                    .thenApply(future -> completableFutures.stream()
                            .map(CompletableFuture::join)
                            .filter(Objects::nonNull) // we filter out the failed calls (their result is null)
                            .flatMap(List::stream) // creating a List<String> from List<List<String>>
                            .collect(Collectors.toList())
                    );

            return allCompletableFutures;
        }

        private CompletableFuture<List<String>> nonBlockingRestCall() {
            /* Manage the Exceptions here to ensure the wrapping Future returns the other calls. */
            return CompletableFuture.supplyAsync(() -> collectOneRestCall())
                    .exceptionally(ex -> {
                        return null; // gets managed in the wrapping Future
                    });
        }

    }

There is a list of 8 (fake) APIs. Each call takes 4 seconds to execute and returns a list of 4 entities (strings in our case, for simplicity).

Results:

`stream`: 32 seconds
`parallelStream`: 4 seconds
`CompletableFuture`: 8 seconds

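I was surprised, because I expected the last two to be almost identical. What exactly causes the difference? As far as I know, they both use `ForkJoinPool.commonPool()`.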

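My naive explanation is that `parallelStream`, being a blocking operation, uses the actual main thread for part of its workload and therefore has one more active thread to work with than `CompletableFuture`, which is asynchronous and so cannot use the main thread.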

Answer 1

`CompletableFuture.supplyAsync()` ends up using the common `ForkJoinPool`, which is initialized with a parallelism of `Runtime.getRuntime().availableProcessors() - 1` (JDK 11 source code).
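To check this on your own machine, a minimal sketch like the following prints both numbers. The class name `CommonPoolInfo` and its helper method are made up for this illustration; `ForkJoinPool.getCommonPoolParallelism()` and `Runtime.availableProcessors()` are standard JDK calls.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper class, not part of the question's code.
class CommonPoolInfo {

    /** Target parallelism of the common pool used by default by supplyAsync() and parallelStream(). */
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        int processors = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors    = " + processors);
        System.out.println("commonPool parallelism = " + commonPoolParallelism());
    }
}
```

Note that the value can differ from `availableProcessors() - 1` if the system property `java.util.concurrent.ForkJoinPool.common.parallelism` has been set for the JVM.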

So it looks like you have an 8-processor machine, which means there are 7 threads in the pool.

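There are 8 API calls, so only 7 of them can run on the common pool at a time. In the `CompletableFuture` test, all 8 tasks are submitted to the pool while the main thread simply blocks until they have all finished. 7 of them can start immediately; the 8th has to wait about 4 seconds for a worker to become free and then needs another 4 seconds itself, which gives the 8 seconds you measured.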
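The arithmetic behind that can be sketched as follows, under the simplifying assumption that every call blocks for exactly the same time, so the tasks run in "waves" of at most one task per available thread. `WaveEstimate` and `estimateSeconds` are names invented for this illustration.

```java
// Hypothetical helper, only meant to illustrate the reasoning above.
class WaveEstimate {

    /**
     * Rough wall-clock estimate when every task blocks for the same duration:
     * at most `workers` tasks run at once, so the work proceeds in waves.
     */
    static long estimateSeconds(int tasks, int workers, long secondsPerTask) {
        long waves = (tasks + workers - 1) / workers; // integer ceil(tasks / workers)
        return waves * secondsPerTask;
    }

    public static void main(String[] args) {
        // 8 calls, 7 common-pool workers, 4 s per call (the numbers from this thread)
        System.out.println(estimateSeconds(8, 7, 4) + " s"); // prints "8 s"
    }
}
```

This is only a back-of-the-envelope model; real calls with varying latency will not line up in clean waves.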

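`parallelStream()` uses the same thread pool, but the difference is that the first task is executed on the main thread, the one that invokes the stream's terminal operation, leaving 7 tasks to be distributed to the common pool. So in this case there are just enough threads to run everything in parallel. Try increasing the number of tasks to 9 and the test will take 8 seconds.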
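One way to observe this is to record which threads actually process the elements of a parallel stream. The sketch below is illustrative only: `StreamThreads` and `threadsUsed` are hypothetical names, and the 50 ms sleep merely stands in for the 4-second blocking call from the question.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the question's code.
class StreamThreads {

    /** Runs a parallel stream over `taskCount` elements and returns the names of the threads that processed them. */
    static Set<String> threadsUsed(int taskCount) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        List<Integer> tasks = IntStream.range(0, taskCount).boxed().collect(Collectors.toList());
        tasks.parallelStream().forEach(i -> {
            names.add(Thread.currentThread().getName());
            try {
                Thread.sleep(50); // stand-in for a blocking REST call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        return names;
    }

    public static void main(String[] args) {
        System.out.println("caller : " + Thread.currentThread().getName());
        System.out.println("threads: " + threadsUsed(8));
    }
}
```

When run from `main`, the printed set should include the calling thread next to the `ForkJoinPool.commonPool-worker-*` threads, which is the extra thread the `CompletableFuture` version does not get.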

Answer 1
12 accepted 2019-12-04 06:48:24
