
Java 8 - Count after filter on stream

I hope this question has not been asked before. In Java 8, I have as input an array of String, myArray, and an integer, maxLength. I want to count the number of strings in my array that are shorter than maxLength. I WANT to use a stream to solve this.

For that, I thought of doing this:

// count() returns a long, so the result cannot be assigned to an int without a cast
long solution = Arrays.stream(myArray).filter(s -> s.length() <= maxLength).count();

However, I'm not sure if it is the right way to do this. It will need to go through the first array once and then go through the filtered array to count.

But if I don't use a stream, I could easily write an algorithm that loops over myArray only once.

My questions are very simple: Is there a way to solve this with the same time performance as a loop? Is it always a "good" solution to use a stream?

"However, I'm not sure if it is the right way to do this. It will need to go through the first array once and then go through the filtered array to count."

Your assumption that it will perform multiple passes is wrong. There is something called operation fusion, i.e. multiple operations can be executed in a single pass over the data.

In this case, Arrays.stream(myArray) will create a stream object (a cheap operation and a lightweight object), and filter(s -> s.length() <= maxLength).count() will be combined into a single pass over the data because there is no stateful operation in the pipeline. The stream does not first filter all the elements and then count the ones that passed the predicate in a second pass.
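One way to see the single pass for yourself is to record when each stage touches each element. The following is only a sketch: the class name FusionTrace and the event strings are made up for illustration, and peek is used purely for tracing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FusionTrace {
    // Returns the order in which the pipeline stages see each element.
    static List<String> trace(String[] myArray, int maxLength) {
        List<String> events = new ArrayList<>();
        long count = Arrays.stream(myArray)
                .peek(s -> events.add("visit " + s))   // element pulled from the source
                .filter(s -> s.length() <= maxLength)
                .peek(s -> events.add("count " + s))   // element passed the filter
                .count();
        events.add("total " + count);
        return events;
    }

    public static void main(String[] args) {
        // Prints: visit a, count a, visit bbbb, visit cc, count cc, total 2 (one per line)
        trace(new String[] {"a", "bbbb", "cc"}, 2).forEach(System.out::println);
    }
}
```

The "visit" and "count" events are interleaved element by element, which is what you would expect if filtering and counting happen in the same traversal.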

A quote from a post by Brian Goetz states:

"Stream pipelines, in contrast, fuse their operations into as few passes on the data as possible, often a single pass. (Stateful intermediate operations, such as sorting, can introduce barrier points that necessitate multipass execution.)"
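To illustrate the barrier that the quote mentions, here is a similar tracing sketch with sorted() in the pipeline. Again, the class name SortBarrierTrace and the event strings are hypothetical, chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortBarrierTrace {
    // Returns the order in which elements are seen before and after sorted().
    static List<String> trace(String[] myArray) {
        List<String> events = new ArrayList<>();
        Arrays.stream(myArray)
                .peek(s -> events.add("visit " + s))            // before the stateful stage
                .sorted()                                       // must see every element before emitting any
                .forEachOrdered(s -> events.add("emit " + s));  // after the stateful stage
        return events;
    }

    public static void main(String[] args) {
        // Prints: visit cc, visit a, visit bbbb, emit a, emit bbbb, emit cc (one per line)
        trace(new String[] {"cc", "a", "bbbb"}).forEach(System.out::println);
    }
}
```

Here all the "visit" events come before the first "emit", because sorting cannot produce its first element until it has consumed the whole input.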

As for:

"My questions are very simple: Is there a way to solve this with the same time performance as a loop?"

It depends on the amount of data and the cost per element. In any case, for a small number of elements the imperative for loop will almost always win, if not always.
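For comparison, the two approaches could look like the sketch below. The class and method names are invented for illustration; both methods traverse myArray once and return the same count.

```java
import java.util.Arrays;

class CountShortStrings {
    // Imperative version: one explicit loop over the array.
    static long countWithLoop(String[] myArray, int maxLength) {
        long count = 0;
        for (String s : myArray) {
            if (s.length() <= maxLength) {
                count++;
            }
        }
        return count;
    }

    // Stream version: filter and count in one pipeline.
    static long countWithStream(String[] myArray, int maxLength) {
        return Arrays.stream(myArray)
                .filter(s -> s.length() <= maxLength)
                .count();
    }

    public static void main(String[] args) {
        String[] myArray = {"a", "bbbb", "cc"};
        System.out.println(countWithLoop(myArray, 2));   // prints 2
        System.out.println(countWithStream(myArray, 2)); // prints 2
    }
}
```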

"Is it always a "good" solution to use a stream?"

No. If you really care about performance, then measure, measure and measure.

Use streams because they are declarative, for their abstraction and composition, and for the possibility of benefiting from parallelism (when you know you will actually benefit from it, that is).
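As a rough sketch of what opting into parallelism looks like for this pipeline (the class name ParallelCount is hypothetical): the result is the same as the sequential version, but whether it is any faster depends on the data size and the machine, so it should be measured rather than assumed.

```java
import java.util.Arrays;

class ParallelCount {
    // Same pipeline as before, switched to parallel execution.
    static long countParallel(String[] myArray, int maxLength) {
        return Arrays.stream(myArray)
                .parallel()                              // only the execution mode changes
                .filter(s -> s.length() <= maxLength)    // stateless, safe to run in parallel
                .count();
    }

    public static void main(String[] args) {
        System.out.println(countParallel(new String[] {"a", "bbbb", "cc"}, 2)); // prints 2
    }
}
```

For a tiny array like this one, the overhead of splitting the work is likely to outweigh any gain.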

You can use IntStream.range over the array indices instead of Arrays.stream, and filter on the indices.

// count() returns a long, so the result cannot be assigned to an int without a cast
long solution = IntStream.range(0, myArray.length)
                .filter(index -> myArray[index].length() <= maxLength)
                .count();


 