
Concurrent Execution: Future vs parallelstream

I wrote a Callable that polls a remote client for information and returns that information as a List. I'm using a ThreadPoolExecutor, a for loop, and Future to execute the task in parallel against multiple remote clients. Then I combine all of the Future lists with addAll() and work with the giant combined list.

My question is, would using parallelStream() be more efficient here than using Future and a for loop? It's certainly easier to code! If I went that route, would I still need the ThreadPoolExecutor?

Thank you!

        for(SiteInfo site : active_sites) {
            TAG_SCANNER scanr = new TAG_SCANNER(site, loggr);
            Future<List<TagInfo>> result = threadmaker.submit(scanr);

            //SOUND THE ALARMS
            try {
                alarm_tags.addAll(result.get());
            } catch (InterruptedException | ExecutionException e) {
                e.printStackTrace();
            }
        }

Possible solution code? NetBeans is suggesting something along these lines:

active_sites.parallelStream().map((site) -> new TAG_SCANNER(site, loggr)).map((scanr) -> threadmaker.submit(scanr)).forEach((result) -> {
            //SOUND THE ALARMS
            try {
                alarm_tags.addAll(result.get());
            }
            catch (InterruptedException | ExecutionException e) {
                e.printStackTrace();
            }
        });

There are several misconceptions here. First, using an asynchronous task does not improve your resource utilization if you call Future.get right after submitting the task, because you then wait for its completion before submitting the next task.
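To make the first point concrete, here is a minimal sketch that times both patterns. The class name, the fakeScan helper, the site names, and the 200 ms delay are all made up for illustration; they stand in for the real remote poll.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SubmitThenGetDemo {
    // Hypothetical stand-in for a remote poll: blocks for delayMs, then returns one tag.
    static Callable<List<String>> fakeScan(String site, long delayMs) {
        return () -> {
            Thread.sleep(delayMs);
            return Collections.singletonList(site + ":tag");
        };
    }

    // Pattern from the question: get() right after submit(), so only one task runs at a time.
    static long submitAndWaitEach(ExecutorService pool, List<String> sites, long delayMs) throws Exception {
        long start = System.nanoTime();
        List<String> tags = new ArrayList<>();
        for (String site : sites) {
            Future<List<String>> result = pool.submit(fakeScan(site, delayMs));
            tags.addAll(result.get()); // blocks before the next task is even submitted
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Submit everything first, then collect, so the waits can overlap.
    static long submitAllThenWait(ExecutorService pool, List<String> sites, long delayMs) throws Exception {
        long start = System.nanoTime();
        List<Future<List<String>>> futures = new ArrayList<>();
        for (String site : sites) {
            futures.add(pool.submit(fakeScan(site, delayMs)));
        }
        List<String> tags = new ArrayList<>();
        for (Future<List<String>> result : futures) {
            tags.addAll(result.get());
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        List<String> sites = Arrays.asList("a", "b", "c", "d");
        ExecutorService pool = Executors.newFixedThreadPool(sites.size());
        try {
            System.out.println("get() after each submit: " + submitAndWaitEach(pool, sites, 200) + " ms");
            System.out.println("submit all, then get():  " + submitAllThenWait(pool, sites, 200) + " ms");
        } finally {
            pool.shutdown();
        }
    }
}
```

With one pool thread per site, the first variant should take roughly the sum of the delays, while the second should take roughly one delay.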

Second, the code transformation made by NetBeans produced mostly equivalent code that still submits tasks to an Executor, so it's not a matter of “Future vs parallelstream”: you are only performing the submission (and waiting) with the parallel stream and are still using the executor. Due to your first error, doing it in parallel might improve the throughput. But apart from the fact that it is never a good idea to combine two mistakes so that they cancel each other out, it's still a poor solution:

The standard implementation of the Stream API is optimized for CPU-bound tasks: it creates a number of threads matching the number of CPU cores and does not spawn new threads when these threads get blocked in a wait operation. So using parallel streams for I/O operations, or generally for operations which may wait, is not a good choice. You also have no control over the threads used by the implementation.
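If you want to see this on your own machine, a small sketch like the following may help. It assumes the usual setup, where parallel streams run on the common ForkJoinPool plus the calling thread. The class name, the task count, and the sleep that stands in for blocking I/O are illustrative choices.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

class CommonPoolDemo {
    // Runs `tasks` blocking tasks in a parallel stream and records which threads executed them.
    static Set<String> threadsUsed(int tasks, long sleepMs) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, tasks).parallel().forEach(i -> {
            names.add(Thread.currentThread().getName());
            try {
                Thread.sleep(sleepMs); // stand-in for a blocking remote call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        return names;
    }

    public static void main(String[] args) {
        System.out.println("available processors:    " + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        // Even with many blocked tasks, the stream keeps using the same small set of threads.
        System.out.println("threads used for 64 blocking tasks: " + threadsUsed(64, 50).size());
    }
}
```

On a typical machine the thread count stays close to the core count no matter how many tasks are waiting, which is the limitation described above.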

The better choice is to stay with the ExecutorService, which you can configure according to the expected I/O bandwidth to your remote clients. But you should fix the error of waiting immediately after submission: submit all tasks first and wait for the completion of all tasks afterwards. Note that you can use the Stream API for that, not for better parallelism, but to potentially improve readability:

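// first, submit all tasks, assuming "threadmaker" is an ExecutorService
// (invokeAll waits for the tasks to finish and throws InterruptedException,
// which the enclosing method has to catch or declare)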
List<Future<List<TagInfo>>> futures=threadmaker.invokeAll(
    active_sites.stream()
        .map(site -> new TAG_SCANNER(site, loggr))
        .collect(Collectors.toList())
);
// now fetch all results
for(Future<List<TagInfo>> result: futures) {
    //SOUND THE ALARMS
    try {
        alarm_tags.addAll(result.get());
    } catch (InterruptedException | ExecutionException e) {
        // not a recommended way of handling
        // but I keep your code here for simplicity
        e.printStackTrace();
    }
}

Note that the Stream API use here is sequential and serves only to convert your list of SiteInfo to a list of Callable<List<TagInfo>>; you could do the same using a loop.
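For completeness, the loop variant might look like the sketch below. It is self-contained so it can be run as is: scannerFor is a hypothetical stand-in for TAG_SCANNER, the sites are plain strings instead of SiteInfo, and the pool size of 8 is an arbitrary placeholder.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllLoopDemo {
    // Hypothetical stand-in for TAG_SCANNER: a Callable returning the tags of one site.
    static Callable<List<String>> scannerFor(String site) {
        return () -> Collections.singletonList(site + ":alarm");
    }

    static List<String> scanAll(ExecutorService pool, List<String> sites) throws InterruptedException {
        // Build the task list with a plain loop instead of a stream.
        List<Callable<List<String>>> tasks = new ArrayList<>();
        for (String site : sites) {
            tasks.add(scannerFor(site));
        }
        // invokeAll runs the tasks on the pool and returns once all of them have completed.
        List<Future<List<String>>> futures = pool.invokeAll(tasks);
        List<String> alarmTags = new ArrayList<>();
        for (Future<List<String>> result : futures) {
            try {
                alarmTags.addAll(result.get());
            } catch (ExecutionException e) {
                // One failed site should not hide the others; real code would log or collect this.
                System.err.println("scan failed: " + e.getCause());
            }
        }
        return alarmTags;
    }

    public static void main(String[] args) throws InterruptedException {
        // The pool size is a guess; tune it to the number of clients and the expected I/O bandwidth.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            System.out.println(scanAll(pool, Arrays.asList("a", "b", "c")));
        } finally {
            pool.shutdown();
        }
    }
}
```

invokeAll returns the futures in the same order as the task list, so the combined list keeps the site order.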

In general, parallelstream has been written by very smart programmers to do parallel processing very effectively.

As with all the other Java threading facilities, such as the concurrency package, unless you are an expert in the subject, if you write it yourself you are likely to:

  • Run slower
  • Introduce bugs
  • Have more complex/harder to follow/etc code

In other words: Yes, use parallelstream.
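A rough sketch of what that could look like without any executor is shown here. The scan method is a hypothetical placeholder for the remote poll, and the sites are plain strings. Collecting the results, rather than calling addAll on a shared list from inside forEach, avoids concurrent writes to a list that may not be thread-safe.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ParallelStreamScanDemo {
    // Hypothetical stand-in for polling one site; a real version would block on I/O.
    static List<String> scan(String site) {
        return Arrays.asList(site + ":tag1", site + ":tag2");
    }

    static List<String> scanAll(List<String> sites) {
        return sites.parallelStream()
                .map(ParallelStreamScanDemo::scan)   // one scan per site, possibly on different threads
                .flatMap(List::stream)               // flatten the per-site lists
                .collect(Collectors.toList());       // thread-safe combination, keeps site order
    }

    public static void main(String[] args) {
        System.out.println(scanAll(Arrays.asList("a", "b")));
    }
}
```

Whether this is actually faster for blocking remote calls depends on how many threads the stream implementation uses, which you cannot configure here the way you can with an ExecutorService.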
