How does Stream.reduce(identity, accumulator, combiner) work?
Stream.reduce has 3 method overloads:
reduce(BinaryOperator<T> accumulator)
reduce(T identity, BinaryOperator<T> accumulator)
reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)
I'm having a hard time understanding how the third overload, Stream.reduce(identity, accumulator, combiner), works and what its use case is. So, how does it work, and why does it exist?
If I understand correctly, your question is about the third argument, combiner.
Firstly, one of the goals of Java was to have similar APIs for sequential and parallel streams. The 3-argument version of reduce is useful for parallel streams.
Suppose you are reducing the values of a Collection<T> to a result of type U, and you are using a parallel stream. The parallel stream splits the collection of T into smaller streams and generates a u' value for each one by using the second function (the accumulator). But now these different u' values have to be combined. How do they get combined? The third function is the one that provides that logic.
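To make the roles concrete, here is a minimal sketch, assuming T = String and U = Integer. The class and method names (CombinerSketch, totalLength) are made up for illustration. Each chunk of the parallel stream is folded with the accumulator, and the per-chunk results are merged with the combiner.

```java
import java.util.List;

class CombinerSketch {
    // Reduces a List<String> (T = String) to an Integer (U = Integer).
    static int totalLength(List<String> words) {
        return words.parallelStream().reduce(
                0,                                          // identity for U
                (partial, word) -> partial + word.length(), // accumulator: (U, T) -> U
                Integer::sum);                              // combiner: (U, U) -> U, merges per-chunk results
    }

    public static void main(String[] args) {
        List<String> words = List.of("alpha", "beta", "gamma", "delta");
        System.out.println(totalLength(words)); // 5 + 4 + 5 + 5 = 19
    }
}
```

Because addition is associative and 0 is its identity, the result is the same however the stream happens to be split.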
Note: Some of the examples are contrived for demonstration. In some instances a simple .sum() could have been used.
The big difference, imo, is that the third form has a BiFunction as its second argument instead of a BinaryOperator, so you can use the third form to change the result type. It also has a BinaryOperator as a combiner to combine the different results from parallel operations.
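As a rough illustration of changing the result type, the sketch below reduces a list of Integer into a small immutable summary object. The Stats record and the names TypeChangingReduce and summarize are hypothetical and exist only for this example.

```java
import java.util.List;

class TypeChangingReduce {
    // Hypothetical immutable result type; every step returns a new instance.
    record Stats(long count, long total) {
        Stats add(int value) { return new Stats(count + 1, total + value); }
        Stats merge(Stats other) { return new Stats(count + other.count, total + other.total); }
    }

    static Stats summarize(List<Integer> values) {
        return values.parallelStream().reduce(
                new Stats(0, 0),                 // identity: merging it into any Stats changes nothing
                (stats, v) -> stats.add(v),      // BiFunction<Stats, Integer, Stats>
                (a, b) -> a.merge(b));           // BinaryOperator<Stats>
    }

    public static void main(String[] args) {
        System.out.println(summarize(List.of(3, 5, 7))); // Stats[count=3, total=15]
    }
}
```

The two-argument form could not express this directly, because its BinaryOperator<T> would force the result to be an Integer as well.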
Generate some data:
record Data(String name, int value) {}
Random r = new Random();
List<Data> dataList = r.ints(1000, 1, 20).mapToObj(i->new Data("Item"+i, i)).toList();
No parallel operation, but different types. The third argument is still required. In practice a sequential stream does not invoke it, but pass a real combiner that adds two partial sums so the code stays correct if the stream is ever made parallel.
int sum = dataList.stream().reduce(0, (partial, data) -> partial + data.value(),
Integer::sum);
System.out.println(sum);
prints (the value varies from run to run because the data is random)
10162
The second form. Use map to get the value to be summed. A BinaryOperator is used here since the types are the same and there is no parallel operation.
sum = dataList.stream().map(Data::value).reduce(0, (sum1,val)->sum1+val);
System.out.println(sum); // print same as above
This does the same as above, but in parallel. The third argument combines the partial sums. Those sums are combined as each thread finishes, so there may not be a sensible order to the output.
sum = dataList.parallelStream().reduce(0, (sum1, data) -> sum1 + data.value(),
(finalSum, partialSum) -> {
System.out.println("Adding " + partialSum + " to " + finalSum);
finalSum += partialSum;
return finalSum;
});
System.out.println(sum);
prints something like the following
Adding 586 to 670
Adding 567 to 553
Adding 1256 to 1120
Adding 715 to 620
Adding 624 to 601
Adding 1335 to 1225
Adding 2560 to 2376
Adding 662 to 579
Adding 706 to 715
Adding 1421 to 1241
Adding 713 to 689
Adding 576 to 586
Adding 1402 to 1162
Adding 2662 to 2564
Adding 4936 to 5226
10162
One final note. None of the Collectors.reducing methods have a BiFunction to handle different types. To handle this, the second argument is a Function that acts as a mapper, so that the third argument, a BinaryOperator, can collect the mapped values.
sum = dataList.parallelStream().collect(
Collectors.reducing(0, Data::value, (finalSum, partialSum) -> {
System.out.println(
"Adding " + partialSum + " to " + finalSum);
finalSum += partialSum;
return finalSum;
}));
System.out.println(sum);
It is interesting to note that for parallel operation the Collector version generates more threads with smaller chunks than the inline version.
One use is to put things into a container. An example would be as follows:
identity = new ArrayList<>()
accumulator = (list, element) -> { list.add(element); return list; }
combiner = (listA, listB) -> { listA.addAll(listB); return listA; }
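One caveat with the three lines above: reduce is given a single identity object, so in a parallel stream the same ArrayList can end up being mutated from several chunks at once. The Stream Javadoc points mutable accumulation like this at collect instead, which takes a supplier and therefore creates a fresh container per chunk. A minimal sketch, with ContainerSketch and toList as made-up names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ContainerSketch {
    static List<String> toList(Stream<String> source) {
        return source.collect(
                ArrayList::new,      // supplier: a fresh container for each chunk
                ArrayList::add,      // accumulator: add one element to a container
                ArrayList::addAll);  // combiner: merge the second container into the first
    }

    public static void main(String[] args) {
        List<String> result = toList(Stream.of("a", "b", "c").parallel());
        System.out.println(result); // [a, b, c]
    }
}
```

The shape is the same as the identity/accumulator/combiner triple, except that the first argument is a factory rather than a shared value.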
Basically it combines a mapping function with a reduction. Most of the examples I've seen for this don't really demonstrate why it's preferable to calling map() and a normal reduce() in separate steps. The API Note comes in handy here:
Many reductions using this form can be represented more simply by an explicit combination of map and reduce operations. The accumulator function acts as a fused mapper and accumulator, which can sometimes be more efficient than separate mapping and reduction, such as when knowing the previously reduced value allows you to avoid some computation.
So let's say we have a Stream<String> numbers, and we want to parse them to BigDecimal and calculate their product. We could do something like this:
BigDecimal product = numbers.map(BigDecimal::new)
.reduce(BigDecimal.ONE, BigDecimal::multiply);
But this has an inefficiency. If one of the numbers is "0", we're wasting cycles converting the remaining strings to BigDecimal. We can use the 3-arg reduce() here to bypass the mapping logic:
BigDecimal product = numbers.reduce(BigDecimal.ONE,
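// signum() == 0 also matches zeros with a nonzero scale such as 0.0, which equals(BigDecimal.ZERO) would miss
(d, n) -> d.signum() == 0 ? BigDecimal.ZERO : new BigDecimal(n).multiply(d),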
BigDecimal::multiply);
Of course it would be even more efficient to short-circuit the stream entirely, but that's tricky to do in a stream, especially in parallel. And this is just an example to get the concept across.
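To see the saving rather than take it on faith, here is a small sketch that counts how many strings are actually converted. The counter, the parse helper and the class name FusedReduceSketch are assumptions added for the demo. It uses a sequential stream so the counts are deterministic, and it tests for zero with signum() so that zeros of any scale are caught.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class FusedReduceSketch {
    // Demo-only counter of String -> BigDecimal conversions.
    static final AtomicInteger parses = new AtomicInteger();

    static BigDecimal parse(String s) {
        parses.incrementAndGet();
        return new BigDecimal(s);
    }

    // Separate map() and reduce(): every element is converted.
    static BigDecimal mapThenReduce(List<String> numbers) {
        return numbers.stream().map(FusedReduceSketch::parse)
                .reduce(BigDecimal.ONE, BigDecimal::multiply);
    }

    // Fused form: once the running product is zero, later elements are not converted.
    static BigDecimal fused(List<String> numbers) {
        return numbers.stream().reduce(BigDecimal.ONE,
                (d, n) -> d.signum() == 0 ? d : parse(n).multiply(d),
                BigDecimal::multiply);
    }

    public static void main(String[] args) {
        List<String> numbers = List.of("2", "0", "3", "4");
        parses.set(0);
        mapThenReduce(numbers);
        System.out.println("map + reduce parses: " + parses.get()); // 4
        parses.set(0);
        fused(numbers);
        System.out.println("fused parses: " + parses.get()); // 2
    }
}
```

In a parallel stream each chunk starts again from the identity, so the fused form would only skip work within a chunk that has already hit a zero.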