
Java parallelization using lambda functions

I have an array of objects with a process() method that I want to run in parallel. I wanted to try lambdas to achieve the parallelization, so I tried this:

Arrays.asList(myArrayOfItems).forEach(item->{
    System.out.println("processing " + item.getId());
    item.process();
});

Each process() call takes about 2 seconds, and I have noticed that there is no speedup with this "parallelization" approach. Everything still seems to run serially: the IDs are printed in order, and there is a 2-second pause between prints.

I have probably misunderstood something. What is needed to execute this in parallel using lambdas (hopefully in a very condensed way)?

Lambdas themselves don't execute anything in parallel. Streams are capable of doing this, though.

Take a look at the method Collection#parallelStream (documentation):

Arrays.asList(myArrayOfItems).parallelStream().forEach(...);

However, note that there is no guarantee of, or control over, when it will actually run in parallel. From its documentation:

Returns a possibly parallel Stream with this collection as its source. It is allowable for this method to return a sequential stream.

The reason is simple. Parallelization only pays off when there is enough work to split up: either a very large number of elements (think millions) or expensive per-element processing. The overhead introduced by parallelization is considerable. Because of that, the method is allowed to return a sequential stream instead if the implementation considers that faster.

Before you think about using parallelism, you should set up some benchmarks to test whether it actually improves anything. There are many examples where people used it blindly without noticing that they had actually decreased performance. Also see "Should I always use a parallel stream when possible?".
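As a rough illustration of such a comparison, here is a minimal sketch that times the same workload on a sequential and a parallel stream. The class name ParallelTimingSketch and the helper simulateWork are made up for this example, and Thread.sleep merely stands in for an expensive process() call. This is a naive wall-clock measurement, not a proper benchmark; for real numbers a harness such as JMH is the safer choice.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelTimingSketch {

    // Hypothetical stand-in for an expensive item.process() call: just sleeps.
    static void simulateWork(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs the simulated work once per item and returns the elapsed time in milliseconds.
    static long timeRun(List<Integer> items, boolean parallel, long workMillis) {
        Stream<Integer> stream = parallel ? items.parallelStream() : items.stream();
        long start = System.nanoTime();
        stream.forEach(i -> simulateWork(workMillis));
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        List<Integer> items = IntStream.range(0, 8).boxed().collect(Collectors.toList());
        System.out.println("sequential: " + timeRun(items, false, 200) + " ms");
        System.out.println("parallel:   " + timeRun(items, true, 200) + " ms");
    }
}
```

On a multi-core machine the parallel run will usually finish sooner for this kind of slow per-element work, but the actual numbers depend on the core count and on what else is using the common fork/join pool.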


You can check whether a Stream is parallel by using Stream#isParallel (documentation).

If you use Stream#parallel (documentation) directly on a stream, you get a parallel version.
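A small sketch of both calls, assuming a throwaway list of strings (the class name IsParallelSketch and the method flags are made up for this example):

```java
import java.util.Arrays;
import java.util.List;

class IsParallelSketch {

    // Returns the isParallel() flag for three differently configured streams.
    static boolean[] flags() {
        List<String> items = Arrays.asList("a", "b", "c");
        return new boolean[] {
            items.stream().isParallel(),                      // plain stream: sequential
            items.stream().parallel().isParallel(),           // switched to parallel mode
            items.parallelStream().sequential().isParallel()  // switched back to sequential
        };
    }

    public static void main(String[] args) {
        boolean[] f = flags();
        System.out.println("stream():                      " + f[0]); // false
        System.out.println("stream().parallel():           " + f[1]); // true
        System.out.println("parallelStream().sequential(): " + f[2]); // false
    }
}
```

Note that isParallel() only reports the mode the stream is in. It says nothing about how many threads end up doing the work.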

The method Collection.forEach() just iterates over all the elements. It is called internal iteration because it leaves it up to the collection how to iterate, but it is still an iteration over all the elements.

If you want parallel processing, you have to:

  1. Get a parallel stream from the collection.
  2. Specify the operation(s) which will be done on the stream.
  3. Do something with the result if you need to.
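The three steps above could look roughly like this. It is only a sketch: the class name ThreeStepsSketch, the method lengths, and the word list are invented for illustration, and String::length stands in for whatever operation you actually need.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ThreeStepsSketch {

    static List<Integer> lengths(List<String> words) {
        return words.parallelStream()              // 1. get a parallel stream from the collection
                .map(String::length)               // 2. specify the operation done on the stream
                .collect(Collectors.toList());     // 3. do something with the result
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("alpha", "beta", "gamma");
        // collect() keeps the encounter order of the list, even on a parallel stream.
        System.out.println(lengths(words)); // prints [5, 4, 5]
    }
}
```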

You can read the first part of my explanation here: https://stackoverflow.com/a/22942829/2886891

To create a parallel stream, invoke .parallelStream() on a Collection.

See https://docs.oracle.com/javase/tutorial/collections/streams/parallelism.html

Arrays.asList(myArrayOfItems).parallelStream().forEach(item->{
    System.out.println("processing " + item.getId());
    item.process();
});
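For anyone who wants to try this out, here is a self-contained sketch of the same call. The Item class is a made-up stand-in for the asker's type (its process() just sleeps), and the thread name is printed only to make it visible which thread handled each item.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ParallelForEachSketch {

    // Hypothetical stand-in for the item type from the question.
    static class Item {
        private final int id;

        Item(int id) { this.id = id; }

        int getId() { return id; }

        // Simulates slow work; the real process() is unknown.
        void process() {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Processes all items through a parallel stream and returns the IDs that were handled.
    static Set<Integer> processAll(Item[] myArrayOfItems) {
        Set<Integer> processed = ConcurrentHashMap.newKeySet(); // thread-safe set
        Arrays.asList(myArrayOfItems).parallelStream().forEach(item -> {
            System.out.println("processing " + item.getId()
                    + " on " + Thread.currentThread().getName());
            item.process();
            processed.add(item.getId());
        });
        return processed;
    }

    public static void main(String[] args) {
        Item[] items = new Item[6];
        for (int i = 0; i < items.length; i++) {
            items[i] = new Item(i);
        }
        System.out.println("processed " + processAll(items).size() + " items");
    }
}
```

With a parallel stream the IDs are generally not printed in order any more, since forEach makes no ordering promise in that mode. Which threads show up, and how many, depends on the machine.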
