
Intermediate stream operations not evaluated on count

It seems I'm having trouble understanding how Java composes stream operations into a stream pipeline.

When executing the following code:

public static void main(String[] args) {
    StringBuilder sb = new StringBuilder();

    var count = Stream.of(new String[]{"1", "2", "3", "4"})
            .map(sb::append)
            .count();

    System.out.println(count);
    System.out.println(sb.toString());
}

The console only prints 4. The StringBuilder object still has the value "".

When I add the filter operation filter(s -> true):

public static void main(String[] args) {
    StringBuilder sb = new StringBuilder();

    var count = Stream.of(new String[]{"1", "2", "3", "4"})
            .filter(s -> true)
            .map(sb::append)
            .count();

    System.out.println(count);
    System.out.println(sb.toString());
}

The output changes to:

4
1234

How does this seemingly redundant filter operation change the behavior of the composed stream pipeline?

The count() terminal operation, in my version of the JDK, ends up executing the following code:

if (StreamOpFlag.SIZED.isKnown(helper.getStreamAndOpFlags()))
    return spliterator.getExactSizeIfKnown();
return super.evaluateSequential(helper, spliterator);

If there is a filter() operation in the pipeline, the size of the stream, which is known initially, can no longer be known (since filter could reject some elements of the stream). So the if block is not executed, the intermediate operations are executed, and the StringBuilder is thus modified.

On the other hand, if you only have map() in the pipeline, the number of elements in the stream is guaranteed to be the same as the initial number of elements. So the if block is executed, and the size is returned directly without evaluating the intermediate operations.
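The effect of the SIZED flag can be observed from outside the library. The following is a minimal sketch, not the JDK's own code: the class and method names are made up for illustration, and it relies on Spliterator.getExactSizeIfKnown(), which returns -1 when the size is not known. What it reports is implementation behavior rather than something the specification guarantees.

```java
import java.util.stream.Stream;

// Hypothetical demo class; the names are illustrative only.
class SizedFlagDemo {

    // map() is one-to-one, so the pipeline can still report an exact size.
    static long sizeAfterMap() {
        return Stream.of("1", "2", "3", "4")
                .map(String::length)
                .spliterator()
                .getExactSizeIfKnown();
    }

    // filter() may drop elements, so the exact size is no longer known.
    static long sizeAfterFilter() {
        return Stream.of("1", "2", "3", "4")
                .filter(s -> true)
                .spliterator()
                .getExactSizeIfKnown();
    }

    public static void main(String[] args) {
        System.out.println("after map:    " + sizeAfterMap());
        System.out.println("after filter: " + sizeAfterFilter());
    }
}
```

On the OpenJDK implementations this was written against, the first line should report 4 and the second -1, which is the same distinction the if block in count() relies on.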

Note that the lambda passed to map() violates the contract defined in the documentation: it's supposed to be a non-interfering, stateless operation, but it is not stateless. So getting a different result in the two cases can't be considered a bug.

In JDK 9 this was clearly documented in the Javadoc:

The eliding of side-effects may also be surprising. With the exception of terminal operations forEach and forEachOrdered, side-effects of behavioral parameters may not always be executed when the stream implementation can optimize away the execution of behavioral parameters without affecting the result of the computation. (For a specific example see the API note documented on the count operation.)

API Note:

An implementation may choose to not execute the stream pipeline (either sequentially or in parallel) if it is capable of computing the count directly from the stream source. In such cases no source elements will be traversed and no intermediate operations will be evaluated. Behavioral parameters with side-effects, which are strongly discouraged except for harmless cases such as debugging, may be affected. For example, consider the following stream:

 List<String> l = Arrays.asList("A", "B", "C", "D");
 long count = l.stream().peek(System.out::println).count();

The number of elements covered by the stream source, a List, is known and the intermediate operation, peek, does not inject into or remove elements from the stream (as may be the case for flatMap or filter operations). Thus the count is the size of the List and there is no need to execute the pipeline and, as a side-effect, print out the list elements.
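To contrast the API note with a terminal operation that has to visit every element, here is a small sketch. The class and method names are hypothetical, and the peek side effect (recording elements in a list) is only there to make the traversal visible. Whether the count() variant records anything depends on the JDK version and implementation, so no particular result is claimed for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class PeekDemo {

    // collect() needs the elements themselves, so the pipeline is traversed
    // and peek() sees every element.
    static List<String> seenWhenCollecting(List<String> source) {
        List<String> seen = new ArrayList<>();
        source.stream()
              .peek(seen::add)
              .collect(Collectors.toList());
        return seen;
    }

    // count() may be computed from the source size alone, in which case
    // peek() is never invoked and the returned list stays empty.
    static List<String> seenWhenCounting(List<String> source) {
        List<String> seen = new ArrayList<>();
        source.stream()
              .peek(seen::add)
              .count();
        return seen;
    }

    public static void main(String[] args) {
        List<String> l = Arrays.asList("A", "B", "C", "D");
        System.out.println("collect saw: " + seenWhenCollecting(l));
        System.out.println("count saw:   " + seenWhenCounting(l));
    }
}
```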

This is not what .map is for. It is supposed to be used to turn a stream of "Something" into a stream of "Something Else". In this case, you are using map to append a string to an external StringBuilder, after which you have a stream of StringBuilder references, each produced by the map operation appending one number to the original StringBuilder.

Your stream doesn't actually do anything with the mapped results, so it's perfectly reasonable to assume that the step can be skipped by the stream processor. You're counting on side effects to do the work, which breaks the functional model of map. You'd be better served by using forEach to do this. Do the count as a separate stream entirely, or put a counter using AtomicInteger in the forEach.
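A minimal sketch of that suggestion, assuming the goal is just to concatenate the elements and know how many there were. The class and method names are made up for illustration, and forEachOrdered is used instead of forEach so that the append order is guaranteed to follow the encounter order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the names are illustrative only.
class ForEachDemo {

    // Appends every element to a StringBuilder and bumps the counter once
    // per element. The side effects live in the terminal operation, which
    // is always executed.
    static String concatAndCount(List<String> items, AtomicInteger counter) {
        StringBuilder sb = new StringBuilder();
        items.stream().forEachOrdered(s -> {
            sb.append(s);
            counter.incrementAndGet();
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        String joined = concatAndCount(Arrays.asList("1", "2", "3", "4"), counter);
        System.out.println(counter.get()); // prints 4
        System.out.println(joined);        // prints 1234
    }
}
```

Because the work happens in the terminal operation, the result does not depend on whether an intermediate step can be optimized away.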

The filter forces the stream to process its contents, since it now has to do something notionally meaningful with each stream element.
