
Weird streams behavior with sorted() and concat()

Stream evaluation is usually lazy (by default), unless stateful operations exist as part of the pipeline. I encountered a case where laziness is violated because of a stateful operation, and I don't understand why it happens.

Consider the following code:

List<Integer> l1 = List.of(4, 5, 3, 1, 2);
List<Integer> l2 = List.of(6, 7, 8, 9, 10);

Stream
    .concat(
        l1.stream()
            .map(i -> {
                System.out.println("in the map for " + i);
                if (i % 3 != 0) {
                    return null;
                }
                return i;
            }),
        l2.stream())
    .filter(i -> {
        System.out.println("in the filter " + i);
        return i != null;
    })
    .findAny();

In detail:

I have two streams constructed from integer lists (l1 and l2). Both streams are concatenated to form a new stream.

The l1 stream goes through a mapping that converts every item not divisible by 3 to null; the l2 stream is taken as is. On the concatenated stream, I add a filter that lets only non-null values through (so from the first stream, only the items divisible by 3 continue down the pipeline), and finally the terminal operation findAny, which triggers the stream's pipeline (and effectively returns the first item divisible by 3 and stops the stream processing).

This code works as expected: all l1 items are consumed before any l2 item is reached. The output shows the l1 mapping function being called, followed by the concatenated stream's filter function, for the first two l1 items; the whole stream finishes when the third item of l1 is not converted to null and thus survives the filter:

in the map for 4
in the filter null
in the map for 5
in the filter null
in the map for 3
in the filter 3

The problem (or the thing I don't understand) starts when the l1 stream is modified with the .sorted() operation:

Stream
    .concat(
        l1.stream()
            .sorted()
            .map(i -> {
                System.out.println("in the map for " + i);
                if (i % 3 != 0) {
                    return null;
                }
                return i;
            }),
        l2.stream())
    .filter(i -> {
        System.out.println("in the filter " + i);
        return i != null;
    })
    .findAny();

... now things look different:

in the map for 1
in the map for 2
in the map for 3
in the map for 4
in the map for 5
in the filter null
in the filter null
in the filter 3

Since sorted is a stateful operation, I know that it first needs to consume the entire l1 stream to sort its values. What surprised me is that it also seems to affect the rest of the l1 pipeline: the map function is now called eagerly for every element before any invocation of the concatenated stream's filter, unlike before.

I read "Java Streams - Buffering huge streams" and "Why filter() after flatMap() is 'not completely' lazy in Java streams?", but I am already running on Java 17, I am working with Stream.concat(), and I am not using flatMap() (at least not explicitly).

Can you explain why? What am I missing here?

This is caused by JDK-8277306, "stream with sorted() and concat() causes some ops not to be lazy", which was closed as "Won't Fix" with the following comment:

Stream.concat takes the spliterator from each input stream and combines them to create a new spliterator from which a new stream is constructed. Thereby it binds to the sources of each stream to concatenate.

It is currently not possible to propagate the short circuiting property of the stream pipeline after the concatenation to pipeline before the concatenation. It comes down to resolving the push vs. pull differences across the spliterator boundary. That's a tricky problem, and one that likely requires significant effort that I find hard to justify given the scope of the problem.
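
The effect described in that comment can be reproduced without Stream.concat at all, by pulling a single element through the spliterator of the l1 pipeline by hand. The following is only a minimal sketch: the class name SortedSpliteratorDemo, the trace method and the log strings are made up for illustration, and the buffering it shows is an implementation detail of OpenJDK (consistent with the output in the question on Java 17), not something the Stream specification promises.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Stream;

public class SortedSpliteratorDemo {

    // Builds the l1 pipeline (optionally with sorted()), then pulls exactly
    // one element through its spliterator, which is roughly what the
    // concatenated stream does on behalf of findAny().
    static List<String> trace(boolean withSorted) {
        List<String> log = new ArrayList<>();
        Stream<Integer> source = List.of(4, 5, 3, 1, 2).stream();
        if (withSorted) {
            source = source.sorted();
        }
        Spliterator<Integer> spliterator = source
                .map(i -> {
                    log.add("map " + i);
                    return i;
                })
                .spliterator();
        // Pull a single element across the spliterator boundary.
        spliterator.tryAdvance(i -> log.add("pulled " + i));
        return log;
    }

    public static void main(String[] args) {
        System.out.println("without sorted(): " + trace(false));
        System.out.println("with sorted():    " + trace(true));
    }
}
```

Without sorted(), one pull should run map for one element only. With sorted(), the sorted stage cannot hand anything downstream until its input is exhausted, and once it does, it pushes all sorted elements through map into the spliterator's internal buffer before the first one is handed to the caller. That is the "push vs. pull" mismatch the comment refers to, and it is why every "in the map" line appears before the first "in the filter" line in the second output.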
