Are Java streams able to lazily reduce from map/filter conditions?

I am using a functional programming style to solve the Leetcode easy question, Count the Number of Consistent Strings. The premise of this question is simple: count the number of values for which the predicate "all values are in another set" holds.

I have two approaches: one that I am fairly certain behaves as I want it to, and another that I am less sure about. Both produce the correct output, but ideally they would stop evaluating other elements once the output is in a final state.


    public int countConsistentStrings(String allowed, String[] words) {
        final Set<Character> set = allowed.chars()
          .mapToObj(c -> (char)c)
          .collect(Collectors.toCollection(HashSet::new));
        return (int)Arrays.stream(words)
          .filter(word ->
                  word.chars()
                  .allMatch(c -> set.contains((char)c))
                 )
          .count();
    }

In this solution, to the best of my knowledge, the allMatch call will terminate and evaluate to false at the first c for which the predicate does not hold, skipping the remaining values in that stream.


    public int countConsistentStrings(String allowed, String[] words) {
        Set<Character> set = allowed.chars()
          .mapToObj(c -> (char)c)
          .collect(Collectors.toCollection(HashSet::new));
        return (int)Arrays.stream(words)
          .filter(word ->
                  word.chars()
                  .mapToObj(c -> set.contains((char)c))
                  .reduce((a,b) -> a&&b)
                  .orElse(false)
                 )
          .count();
    }

In this solution, the same logic is used, but instead of allMatch, I use map and then reduce. Logically, after a single false value comes from the map stage, reduce will always evaluate to false. I know Java streams are lazy, but I am unsure when they "know" just how lazy they can be. Will this be less efficient than using allMatch, or will laziness ensure the same operation?


Lastly, in this code, we can see that the value of x will always be 0: after filtering for only positive numbers, their sum will always be positive (assume no overflow), so taking the minimum of a positive number and a hardcoded 0 will give 0. Will the stream be lazy enough to always evaluate this to 0, or will it work to reduce every element after the filter anyway?

    List<Integer> list = new ArrayList<>();
    ...
    /*Some values added to list*/
    ...
    int x = list.stream()
            .filter(i -> i >= 0)
            .reduce((a,b) -> Math.min(a+b, 0))
            .orElse(0);

To summarize the above, how does one know when a Java stream will be lazy? There are lazy opportunities that I see in the code, but how can I guarantee that my code will be as lazy as possible?

The actual term you're asking for is short-circuiting:

Further, some operations are deemed short-circuiting operations. An intermediate operation is short-circuiting if, when presented with infinite input, it may produce a finite stream as a result. A terminal operation is short-circuiting if, when presented with infinite input, it may terminate in finite time. Having a short-circuiting operation in the pipeline is a necessary, but not sufficient, condition for the processing of an infinite stream to terminate normally in finite time.
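To make the quoted definition concrete, here is a minimal sketch that runs short-circuiting operations against an infinite stream. The class and method names are made up for illustration; `Stream.iterate`, `limit`, `anyMatch` and `allMatch` are standard `java.util.stream` calls.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InfiniteStreamDemo {

    // limit is a short-circuiting intermediate operation:
    // it turns the infinite stream into a finite one.
    static List<Integer> firstFive() {
        return Stream.iterate(1, i -> i + 1)
                .limit(5)
                .collect(Collectors.toList());
    }

    // anyMatch is a short-circuiting terminal operation:
    // it returns as soon as one element satisfies the predicate.
    static boolean hasMultipleOfSeven() {
        return Stream.iterate(1, i -> i + 1)
                .anyMatch(i -> i % 7 == 0);
    }

    // allMatch short-circuits only on the first failing element.
    // If the predicate held for every element of an infinite stream,
    // the call would never return: necessary, but not sufficient.
    static boolean allBelowTen() {
        return Stream.iterate(1, i -> i + 1)
                .allMatch(i -> i < 10);
    }

    public static void main(String[] args) {
        System.out.println(firstFive());          // [1, 2, 3, 4, 5]
        System.out.println(hasMultipleOfSeven()); // true
        System.out.println(allBelowTen());        // false
    }
}
```

All three calls return even though the source never ends, because each pipeline contains an operation that is allowed to stop pulling elements.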

The term "lazy" only applies to intermediate operations and means that they only perform work when requested by the terminal operation. This is always the case, so when you don't chain a terminal operation, no intermediate operation will ever process any element.
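A small sketch of that laziness, assuming a counter inside the filter predicate as the way to observe it (the class and method names are hypothetical):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class LazyIntermediateDemo {

    // Builds a pipeline but never attaches a terminal operation,
    // then reports how many elements the filter predicate saw.
    static int visitedWithoutTerminal() {
        AtomicInteger visited = new AtomicInteger();
        Stream<String> pipeline = Stream.of("a", "b", "c")
                .filter(s -> { visited.incrementAndGet(); return true; });
        // pipeline is intentionally never consumed
        return visited.get();
    }

    // Same pipeline, this time driven by the terminal operation forEach.
    static int visitedWithTerminal() {
        AtomicInteger visited = new AtomicInteger();
        Stream.of("a", "b", "c")
                .filter(s -> { visited.incrementAndGet(); return true; })
                .forEach(s -> { });
        return visited.get();
    }

    public static void main(String[] args) {
        System.out.println(visitedWithoutTerminal()); // 0
        System.out.println(visitedWithTerminal());    // 3
    }
}
```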

Finding out whether a terminal operation is short-circuiting is rather easy. Go to the Stream API documentation and check whether the particular terminal operation's documentation contains the sentence

This is a short-circuiting terminal operation.

allMatch has it, reduce does not.
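The difference can be observed by counting how often the set lookup runs in each variant from the question. This is only a sketch: the counter, the class name and the helper names are assumptions added for the measurement, not part of the original solutions.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class AllMatchVsReduceDemo {

    static Set<Character> toSet(String allowed) {
        return allowed.chars().mapToObj(c -> (char) c).collect(Collectors.toSet());
    }

    // Number of set lookups performed by the allMatch variant.
    static int lookupsWithAllMatch(String allowed, String word) {
        Set<Character> set = toSet(allowed);
        AtomicInteger lookups = new AtomicInteger();
        word.chars().allMatch(c -> {
            lookups.incrementAndGet();
            return set.contains((char) c);
        });
        return lookups.get();
    }

    // Number of set lookups performed by the map-then-reduce variant.
    static int lookupsWithReduce(String allowed, String word) {
        Set<Character> set = toSet(allowed);
        AtomicInteger lookups = new AtomicInteger();
        word.chars()
            .mapToObj(c -> {
                lookups.incrementAndGet();
                return set.contains((char) c);
            })
            .reduce((a, b) -> a && b)
            .orElse(false);
        return lookups.get();
    }

    public static void main(String[] args) {
        // 'x' at index 1 is the first character that is not allowed.
        System.out.println(lookupsWithAllMatch("ab", "axbab")); // 2
        System.out.println(lookupsWithReduce("ab", "axbab"));   // 5
    }
}
```

With allMatch the pipeline stops at the first disallowed character, while reduce still visits every character of the word.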

This does not mean that such optimizations based on logic or algebra are impossible. But the responsibility lies with the JVM's optimizer, which might do the same for loops. However, this requires inlining of all involved methods to be sure that this condition always applies and that there are no side effects which must be retained. This behavioral compatibility implies that even if the processing gets optimized away, a peek(System.out::println) would keep printing all elements as if they were processed. In practice, you should not expect such optimizations, as the Stream implementation code is too complex for the optimizer.
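Since the optimizer cannot be relied on, the short-circuit has to be requested explicitly in the pipeline. One possible rewrite of the map-and-reduce variant is sketched below. The `isConsistent` helper and the class name are hypothetical, and the `!word.isEmpty()` check is an assumption made to keep the result of `reduce(...).orElse(false)` for an empty word, because allMatch returns true on an empty stream.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

public class ExplicitShortCircuitDemo {

    // Hypothetical helper: short-circuits on the first disallowed character.
    // An empty word is rejected, matching reduce(...).orElse(false).
    static boolean isConsistent(Set<Character> set, String word) {
        return !word.isEmpty()
                && word.chars().allMatch(c -> set.contains((char) c));
    }

    static int countConsistentStrings(String allowed, String[] words) {
        Set<Character> set = allowed.chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.toSet());
        return (int) Arrays.stream(words)
                .filter(word -> isConsistent(set, word))
                .count();
    }

    public static void main(String[] args) {
        String[] words = {"ad", "bd", "aaab", "baa", "badab"};
        System.out.println(countConsistentStrings("ab", words)); // 2
    }
}
```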
