How to ensure order of processing in java8 streams?

I want to process lists inside an XML java object. I have to ensure that all elements are processed in the order I received them.

Should I therefore call sequential on each stream I use? list.stream().sequential().filter().forEach()

Or is it sufficient to just use the stream as long as I don't use parallelism? list.stream().filter().forEach()

You are asking the wrong question. You are asking about sequential vs. parallel whereas you want to process items in order, so you have to ask about ordering. If you have an ordered stream and perform operations which guarantee to maintain the order, it doesn't matter whether the stream is processed in parallel or sequential; the implementation will maintain the order.

The ordered property is distinct from parallel vs sequential. E.g. if you call stream() on a HashSet the stream will be unordered, while calling stream() on a List returns an ordered stream. Note that you can call unordered() to release the ordering contract and potentially increase performance. Once the stream has no ordering there is no way to reestablish the ordering. (The only way to turn an unordered stream into an ordered one is to call sorted; however, the resulting order is not necessarily the original order.)
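
One way to observe the ordered property directly is to look at the ORDERED characteristic of the stream's spliterator. The following is only a minimal sketch: the class name OrderedCharacteristicDemo and the helper isOrdered are made up for illustration, and what the spliterator reports after intermediate operations such as unordered() reflects the reference implementation's behavior rather than a documented guarantee.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;
import java.util.stream.Stream;

// Hypothetical demo class; not part of the original answer.
class OrderedCharacteristicDemo {

    // Reports whether the stream's spliterator carries the ORDERED characteristic.
    // spliterator() is a terminal operation, so the stream is consumed by this call.
    static boolean isOrdered(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(3, 1, 2);
        Set<Integer> set = new HashSet<>(list);

        System.out.println("List stream ordered:    " + isOrdered(list.stream()));
        System.out.println("HashSet stream ordered: " + isOrdered(set.stream()));
        System.out.println("List after unordered(): " + isOrdered(list.stream().unordered()));
        System.out.println("HashSet after sorted(): " + isOrdered(set.stream().sorted()));
    }
}
```

None of these checks involves parallel() or sequential(), which is the point: ordering is a property of the source and the operations, not of the execution mode.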

See also the “Ordering” section of the java.util.stream package documentation.

In order to ensure maintenance of ordering throughout an entire stream operation, you have to study the documentation of the stream's source, all intermediate operations and the terminal operation for whether they maintain the order or not (or whether the source has an ordering in the first place).

This can be very subtle, e.g. Stream.iterate(T,UnaryOperator) creates an ordered stream while Stream.generate(Supplier) creates an unordered stream. Note that you also made a common mistake in your question, as forEach does not maintain the ordering. You have to use forEachOrdered if you want to process the stream's elements in a guaranteed order.
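
The difference between the two factory methods can be checked with a small sketch like the one below. The class and method names are hypothetical; the check only inspects the source spliterator's ORDERED characteristic and does not consume any elements of the infinite streams.

```java
import java.util.Spliterator;
import java.util.stream.Stream;

// Hypothetical demo class; not part of the original answer.
class IterateVsGenerateDemo {

    // True if the stream's spliterator reports the ORDERED characteristic.
    static boolean isOrdered(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    // Stream.iterate is documented to return an ordered stream.
    static boolean iterateIsOrdered() {
        return isOrdered(Stream.iterate(1, n -> n + 1));
    }

    // Stream.generate is documented to return an unordered stream.
    static boolean generateIsOrdered() {
        return isOrdered(Stream.generate(() -> 1));
    }

    public static void main(String[] args) {
        System.out.println("Stream.iterate ordered:  " + iterateIsOrdered());
        System.out.println("Stream.generate ordered: " + generateIsOrdered());
    }
}
```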

So if your list in your question is indeed a java.util.List, its stream() method will return an ordered stream and filter will not change the ordering. So if you call list.stream().filter().forEachOrdered(), all elements will be processed sequentially in order, whereas for list.parallelStream().filter().forEachOrdered() the elements might be processed in parallel (e.g. by the filter) but the terminal action will still be called in order (which obviously will reduce the benefit of parallel execution).
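
To make the contrast between forEach and forEachOrdered concrete, here is a minimal sketch on a parallel stream. The class name, the method names and the even-number filter are assumptions chosen for illustration; the results are gathered into a synchronized list so that the concurrent forEach variant stays thread-safe.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; not part of the original answer.
class ForEachOrderedDemo {

    // The filter may run in parallel, but the action is invoked in encounter order.
    static List<Integer> evensWithForEachOrdered(List<Integer> input) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        input.parallelStream()
             .filter(n -> n % 2 == 0)
             .forEachOrdered(seen::add);
        return seen;
    }

    // forEach makes no ordering promise: the same elements arrive, in any order.
    static List<Integer> evensWithForEach(List<Integer> input) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        input.parallelStream()
             .filter(n -> n % 2 == 0)
             .forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.range(0, 20).boxed().collect(Collectors.toList());
        System.out.println("forEachOrdered: " + evensWithForEachOrdered(input));
        System.out.println("forEach:        " + evensWithForEach(input));
    }
}
```

On a sequential stream, forEach will usually appear to keep the order in practice, but only forEachOrdered promises it.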

If you, for example, use an operation like

List<…> result=inputList.parallelStream().map(…).filter(…).collect(Collectors.toList());

the entire operation might benefit from parallel execution but the resulting list will always be in the right order, regardless of whether you use a parallel or sequential stream.
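
A runnable variant of that pipeline might look like the sketch below. The mapping and filtering functions (squaring, keeping odd results) and all names are placeholders for illustration, since the original snippet leaves them elided.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical demo class; not part of the original answer.
class ParallelCollectDemo {

    // Runs the same map/filter/collect pipeline sequentially or in parallel.
    // The squaring and the odd-number filter are arbitrary example operations.
    static List<Integer> oddSquares(List<Integer> input, boolean parallel) {
        Stream<Integer> stream = parallel ? input.parallelStream() : input.stream();
        return stream.map(n -> n * n)
                     .filter(sq -> sq % 2 == 1)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.range(0, 10_000).boxed().collect(Collectors.toList());
        List<Integer> sequential = oddSquares(input, false);
        List<Integer> parallel = oddSquares(input, true);
        System.out.println("Same elements in the same order: " + sequential.equals(parallel));
    }
}
```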

In a nutshell:

Ordering depends on the source data structure and intermediate stream operations. Assuming you are using a List the processing should be ordered (since filter won't change the sequence here).

More details:

Sequential vs Parallel vs Unordered:

Javadocs

S sequential()
Returns an equivalent stream that is sequential. May return itself, either because the stream was already sequential, or because the underlying stream state was modified to be sequential.
This is an intermediate operation.
S parallel()
Returns an equivalent stream that is parallel. May return itself, either because the stream was already parallel, or because the underlying stream state was modified to be parallel.
This is an intermediate operation.
S unordered()
Returns an equivalent stream that is unordered. May return itself, either because the stream was already unordered, or because the underlying stream state was modified to be unordered.
This is an intermediate operation.
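
As a rough illustration of these mode switches, the sketch below uses isParallel() to inspect the execution mode. The class and method names are hypothetical. It relies on two documented points: Collection.stream() returns a sequential stream, and the pipeline runs in the mode of the stream on which the terminal operation is invoked, so the last sequential() or parallel() call decides for the whole pipeline.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of the original answer.
class ExecutionModeDemo {

    // Collection.stream() is sequential unless parallel() is requested.
    static boolean defaultIsParallel(List<Integer> list) {
        return list.stream().isParallel();
    }

    // A later sequential() overrides an earlier parallel() for the whole pipeline.
    static boolean parallelThenSequential(List<Integer> list) {
        return list.stream().parallel().filter(n -> n > 0).sequential().isParallel();
    }

    // A later parallel() overrides an earlier sequential() for the whole pipeline.
    static boolean sequentialThenParallel(List<Integer> list) {
        return list.stream().sequential().filter(n -> n > 0).parallel().isParallel();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3);
        System.out.println("stream():                    " + defaultIsParallel(list));
        System.out.println("parallel() then sequential(): " + parallelThenSequential(list));
        System.out.println("sequential() then parallel(): " + sequentialThenParallel(list));
    }
}
```

So calling sequential() on list.stream() is redundant, and neither call says anything about whether encounter order is respected.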

Stream Ordering:

Javadocs

Streams may or may not have a defined encounter order. Whether or not a stream has an encounter order depends on the source and the intermediate operations. Certain stream sources (such as List or arrays) are intrinsically ordered, whereas others (such as HashSet) are not. Some intermediate operations, such as sorted(), may impose an encounter order on an otherwise unordered stream, and others may render an ordered stream unordered, such as BaseStream.unordered(). Further, some terminal operations may ignore encounter order, such as forEach().

If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]. However, if the source has no defined encounter order, then any permutation of the values [2, 4, 6] would be a valid result.
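
The [1, 2, 3] example from the quoted documentation can be tried with a sketch like this one. The class name and the doubled helper are made up for illustration; for the HashSet source, only the set of resulting values is predictable, not their order.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of the quoted documentation.
class EncounterOrderDemo {

    // Doubles every element; the result order follows the source's encounter order, if any.
    static List<Integer> doubled(Collection<Integer> source) {
        return source.stream().map(x -> x * 2).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Ordered source: the result order is fixed by the List.
        System.out.println("From List:    " + doubled(Arrays.asList(1, 2, 3)));
        // Unordered source: any permutation of the doubled values would be valid.
        System.out.println("From HashSet: " + doubled(new HashSet<>(Arrays.asList(1, 2, 3))));
    }
}
```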

For sequential streams, the presence or absence of an encounter order does not affect performance, only determinism. If a stream is ordered, repeated execution of identical stream pipelines on an identical source will produce an identical result; if it is not ordered, repeated execution might produce different results.

For parallel streams, relaxing the ordering constraint can sometimes enable more efficient execution. Certain aggregate operations, such as filtering duplicates (distinct()) or grouped reductions (Collectors.groupingBy()) can be implemented more efficiently if ordering of elements is not relevant. Similarly, operations that are intrinsically tied to encounter order, such as limit(), may require buffering to ensure proper ordering, undermining the benefit of parallelism. In cases where the stream has an encounter order, but the user does not particularly care about that encounter order, explicitly de-ordering the stream with unordered() may improve parallel performance for some stateful or terminal operations. However, most stream pipelines, such as the "sum of weight of blocks" example above, still parallelize efficiently even under ordering constraints.
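
The effect of unordered() on limit() can be illustrated with the following sketch. The class name, the range size and the multiples-of-three filter are arbitrary choices for this example; it shows only the semantic difference (which elements may be returned) and makes no claim about measured performance.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; not part of the quoted documentation.
class UnorderedLimitDemo {

    // Picks ten multiples of three from a parallel range.
    // With relaxOrdering == false, limit() must return the first ten in encounter order.
    // With relaxOrdering == true, limit() may return any ten matching elements.
    static List<Integer> tenMultiplesOfThree(boolean relaxOrdering) {
        IntStream source = IntStream.range(0, 100_000).parallel();
        if (relaxOrdering) {
            source = source.unordered();
        }
        return source.filter(n -> n % 3 == 0)
                     .limit(10)
                     .boxed()
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("Ordered:   " + tenMultiplesOfThree(false));
        System.out.println("Unordered: " + tenMultiplesOfThree(true));
    }
}
```

If the first ten elements are what you need, keep the ordering; unordered() is only appropriate when any ten would do.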
