
How does Stream.reduce(identity, accumulator, combiner) work?

Stream.reduce has 3 method overloads.

reduce(BinaryOperator<T> accumulator)
reduce(T identity, BinaryOperator<T> accumulator)
reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)
  • The 1st overload can be used, for example, to calculate the sum of a list of integers. It returns an Optional<T>, which is empty if the stream is empty.
  • The 2nd overload does the same, but if the list is empty it just returns the identity value.
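
To make the three signatures concrete, here is a minimal sketch that calls each overload once. The class and method names (ReduceOverloads, sumNoIdentity, and so on) are made up for illustration and are not part of the question.

```java
import java.util.List;
import java.util.Optional;

class ReduceOverloads {
    // 1st overload: no identity, so the result is wrapped in an Optional.
    static Optional<Integer> sumNoIdentity(List<Integer> xs) {
        return xs.stream().reduce(Integer::sum);
    }

    // 2nd overload: the identity (0) is the result for an empty stream.
    static int sumWithIdentity(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    // 3rd overload: the result type (Integer) differs from the element type (String).
    static int totalLength(List<String> words) {
        return words.stream().reduce(0, (len, w) -> len + w.length(), Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sumNoIdentity(List.of(1, 2, 3)));  // Optional[6]
        System.out.println(sumNoIdentity(List.of()));         // Optional.empty
        System.out.println(sumWithIdentity(List.of()));       // 0
        System.out.println(totalLength(List.of("a", "bc")));  // 3
    }
}
```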

I'm having a hard time understanding how the third overload ( Stream.reduce(identity, accumulator, combiner) ) works and what its use case is. So, how does it work, and why does it exist?

If I understand correctly, your question is about the third argument, combiner .

Firstly, one of the goals of Java was to have similar APIs for sequential and parallel streams. The 3-argument version of reduce is useful for parallel streams.

Suppose you are reducing a Collection<T> to a value of type U and you are using a parallel stream. The parallel stream splits the elements of the collection into smaller chunks and generates a partial value u' for each chunk by using the second function (the accumulator). These different u' values then have to be combined. How do they get combined? The third function, the combiner, is the one that provides that logic.
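
The split-then-combine idea can be imitated by hand. The sketch below is only an illustration under simplifying assumptions: it always splits into exactly two chunks, whereas a real parallel stream decides the splitting itself. The names CombinerSketch, foldChunk and reduceInTwoChunks are hypothetical helpers, not JDK APIs.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

class CombinerSketch {
    // Fold one chunk sequentially, starting from the identity (the accumulator's job).
    static <T, U> U foldChunk(List<T> chunk, U identity,
                              BiFunction<U, ? super T, U> accumulator) {
        U result = identity;
        for (T t : chunk) {
            result = accumulator.apply(result, t);
        }
        return result;
    }

    // Hand-rolled imitation of a two-way split: each half yields a partial value u',
    // and the combiner merges the two partial values.
    static <T, U> U reduceInTwoChunks(List<T> items, U identity,
                                      BiFunction<U, ? super T, U> accumulator,
                                      BinaryOperator<U> combiner) {
        int mid = items.size() / 2;
        U left = foldChunk(items.subList(0, mid), identity, accumulator);
        U right = foldChunk(items.subList(mid, items.size()), identity, accumulator);
        return combiner.apply(left, right);
    }

    public static void main(String[] args) {
        List<String> words = List.of("alpha", "beta", "gamma", "delta");
        BiFunction<Integer, String, Integer> addLength = (len, w) -> len + w.length();

        int byHand = reduceInTwoChunks(words, 0, addLength, Integer::sum);
        int byStream = words.parallelStream().reduce(0, addLength, Integer::sum);
        System.out.println(byHand + " " + byStream); // 19 19
    }
}
```

Note that the identity is used once per chunk, which is why it has to be a true identity for the combiner: with 0 and Integer::sum, starting each chunk from 0 does not change the total.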

Note: Some of the examples are contrived for demonstration. In some instances a simple .sum() could have been used.

The big difference, IMO, is that the third form takes a BiFunction as its second argument instead of a BinaryOperator . So you can use the third form to change the result type. It also takes a BinaryOperator as a combiner to merge the partial results from parallel operations.

Generate some data

record Data(String name, int value) {}

Random r = new Random();
List<Data> dataList = r.ints(1000, 1, 20).mapToObj(i->new Data("Item"+i, i)).toList();

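No parallel operation, but the types differ. The third argument is still required. A sequential stream typically never invokes the combiner, but it should still be a valid one, so pass a function that adds the partial sums.

int sum = dataList.stream().reduce(0, (item, data) -> item + data.value,
        Integer::sum); // combiner: adds two partial sums
System.out.println(sum);

prints something like the following (the data is random, so the total varies)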

10162

The second form. Use map to extract the value to be summed. A BinaryOperator is enough here because the element and result types are the same, and there is no parallel operation.

sum = dataList.stream().map(Data::value).reduce(0, (sum1,val)->sum1+val);
System.out.println(sum); // print same as above

This shows the same as above but in parallel. The third argument combines the partial sums. Those sums are combined as each thread finishes, so the output may not appear in any sensible order.

sum = dataList.parallelStream().reduce(0, (sum1, data) -> sum1 + data.value,
        (finalSum, partialSum) -> {
           
            System.out.println("Adding " + partialSum + " to " + finalSum);
            finalSum += partialSum;
            return finalSum;
        });
System.out.println(sum);

prints something like the following

Adding 586 to 670
Adding 567 to 553
Adding 1256 to 1120
Adding 715 to 620
Adding 624 to 601
Adding 1335 to 1225
Adding 2560 to 2376
Adding 662 to 579
Adding 706 to 715
Adding 1421 to 1241
Adding 713 to 689
Adding 576 to 586
Adding 1402 to 1162
Adding 2662 to 2564
Adding 4936 to 5226
10162

One final note. None of the Collectors.reducing methods takes a BiFunction to handle different types. Instead, the second argument is a Function that acts as a mapper, so that the third argument, a BinaryOperator , can combine the mapped values.

sum = dataList.parallelStream().collect(
       Collectors.reducing(0, Data::value, (finalSum, partialSum) -> {
           System.out.println(
                   "Adding " + partialSum + " to " + finalSum);
           finalSum += partialSum;
           return finalSum;
       }));

System.out.println(sum);

It is interesting to note that, for parallel operation, the Collector version seems to generate more threads with smaller chunks than the inline version.

One use is to put things into a container.

An example would be as follows:

  • identity = new ArrayList<>()
  • accumulator = (list, element) -> { list.add(element); return list; }
  • combiner = (listA, listB) -> { listA.addAll(listB); return listA; }
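
One caveat about the list example: reduce passes the same identity object to every chunk, so an accumulator that mutates it is only reasonable for sequential streams. For mutable containers the Stream API offers collect, whose first argument is a supplier that creates a fresh container per chunk. The sketch below shows both a collect version and a reduce version that copies instead of mutating. The class and method names are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ContainerSketch {
    // collect() takes a supplier, so each parallel chunk fills its own list.
    static <T> List<T> viaCollect(Stream<T> stream) {
        return stream.collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    // reduce() variant that never mutates its inputs. It copies on every step,
    // which keeps it valid in parallel at the cost of extra allocation.
    static <T> List<T> viaReduce(Stream<T> stream) {
        return stream.reduce(
                List.<T>of(),
                (list, element) -> {
                    List<T> copy = new ArrayList<>(list);
                    copy.add(element);
                    return copy;
                },
                (listA, listB) -> {
                    List<T> merged = new ArrayList<>(listA);
                    merged.addAll(listB);
                    return merged;
                });
    }

    public static void main(String[] args) {
        System.out.println(viaCollect(Stream.of(1, 2, 3, 4, 5).parallel())); // [1, 2, 3, 4, 5]
        System.out.println(viaReduce(Stream.of(1, 2, 3, 4, 5).parallel()));  // [1, 2, 3, 4, 5]
    }
}
```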

Basically it combines a mapping function with a reduction. Most of the examples I've seen for this don't really demonstrate why it's preferable to calling map() and a normal reduce() in separate steps. The API Note comes in handy here:

Many reductions using this form can be represented more simply by an explicit combination of map and reduce operations. The accumulator function acts as a fused mapper and accumulator, which can sometimes be more efficient than separate mapping and reduction, such as when knowing the previously reduced value allows you to avoid some computation.

So let's say we have a Stream<String> numbers , and we want to parse them to BigDecimal and calculate their product. We could do something like this:

BigDecimal product = numbers.map(BigDecimal::new)
        .reduce(BigDecimal.ONE, BigDecimal::multiply);

But this has an inefficiency. If one of the numbers is "0", we're wasting cycles converting the remaining strings to BigDecimal . We can use the 3-arg reduce() here to bypass the mapping logic:

BigDecimal product = numbers.reduce(BigDecimal.ONE,
        // signum() == 0 detects zero at any scale; equals(BigDecimal.ZERO) would miss e.g. 0.0
        (d, n) -> d.signum() == 0 ? BigDecimal.ZERO : new BigDecimal(n).multiply(d),
        BigDecimal::multiply);

Of course it would be even more efficient to short-circuit the stream entirely, but that's tricky to do in a stream, especially in parallel. And this is just an example to get the concept across.
