
What does the Java 8 Collector UNORDERED characteristic mean?

The official documentation says:

UNORDERED: Indicates that the collection operation does not commit to preserving the encounter order of input elements.

This is not very helpful without examples.

My question is: what exactly does the UNORDERED characteristic mean? Should I use it with reducing collectors like min or sum, or is it only applicable to collectors that build collections?

In OpenJDK, it looks like the reducing operations (min, sum, avg) have empty characteristics. I expected to find at least CONCURRENT and UNORDERED there.

In the absence of special pleading, stream operations must behave as if the elements are processed in the encounter order of the source. For some operations -- such as reduction with an associative operation -- one can obey this constraint and still get efficient parallel execution. For others, though, this constraint is very limiting. And, for some problems, this constraint isn't meaningful to the user. Consider the following stream pipeline:

people.stream()
      .collect(groupingBy(Person::getLastName,
                          mapping(Person::getFirstName, toList())));

Is it important that the list of first names associated with "Smith" appear in the map in the order they appeared in the initial stream? For some problems, yes, for some no -- we don't want the stream library guessing for us. An unordered collector says that it's OK to insert the first names into the list in an order inconsistent with the order in which Smith-surnamed people appear in the input source. By relaxing this constraint, sometimes (not always), the stream library can give a more efficient execution.

For example, if you didn't care about preserving this order, you could execute it as:

people.parallelStream()
      .collect(groupingByConcurrent(Person::getLastName,
                                    mapping(Person::getFirstName, toList())));

The concurrent collector is unordered, which permits the optimization of sharing an underlying ConcurrentMap, rather than having O(log n) map-merge steps. Relaxing the ordering constraint enables a real algorithmic advantage -- but we can't assume the constraint doesn't matter; we need the user to tell us this. Using an UNORDERED collector is one way to tell the stream library that these optimizations are fair game.
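The difference between the two pipelines above can be observed with a small program. This is only a sketch: the `Person` class, the sample names, and the `GroupingOrderDemo` class are hypothetical stand-ins, and `toList()` is assumed as the downstream collector.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

public class GroupingOrderDemo {
    // Hypothetical minimal Person type, only for illustration.
    public static final class Person {
        private final String firstName;
        private final String lastName;

        public Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
    }

    public static List<Person> samplePeople() {
        return Arrays.asList(
                new Person("Alice", "Smith"),
                new Person("Bob", "Jones"),
                new Person("Carol", "Smith"),
                new Person("Dave", "Smith"),
                new Person("Eve", "Jones"));
    }

    // Ordered collector: each list keeps the encounter order of the source,
    // even though the stream is parallel.
    public static Map<String, List<String>> ordered(List<Person> people) {
        return people.parallelStream()
                .collect(Collectors.groupingBy(Person::getLastName,
                        Collectors.mapping(Person::getFirstName, Collectors.toList())));
    }

    // Concurrent, unordered collector: all threads write into one shared
    // ConcurrentMap, so the order inside each list is not guaranteed.
    public static ConcurrentMap<String, List<String>> concurrent(List<Person> people) {
        return people.parallelStream()
                .collect(Collectors.groupingByConcurrent(Person::getLastName,
                        Collectors.mapping(Person::getFirstName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Person> people = samplePeople();
        System.out.println("groupingBy:           " + ordered(people));
        System.out.println("groupingByConcurrent: " + concurrent(people));
    }
}
```

With an input this small, both versions may well print the same order on a given run. The point is only that the second one makes no promise about it.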

UNORDERED essentially means that the collector is both associative (required by the spec) and commutative (not required).

Associativity allows splitting the computation into subparts and then combining them into the full result, but requires the combining step to be strictly ordered. Consider this snippet from the docs:

 A a2 = supplier.get();
 accumulator.accept(a2, t1);
 A a3 = supplier.get();
 accumulator.accept(a3, t2);
 R r2 = finisher.apply(combiner.apply(a2, a3));  // result with splitting

In the last step, combiner.apply(a2, a3), the arguments must appear in exactly this order, which means that the entire computation pipeline must track the order and respect it in the end.

Another way of saying this is that the tree we get from recursive splitting must be ordered.

On the other hand, if the combining operation is commutative, we can combine any subpart with any other, in no particular order, and always obtain the same result. Clearly this leads to many optimization opportunities in both space and time.
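To make the distinction concrete, the docs snippet can be run by hand against two JDK collectors. The sketch below is illustrative only: the class and helper names (`SplitCombineDemo`, `collectWhole`, `collectSplit`) are invented here, and it drives the collector's `supplier`, `accumulator`, `combiner`, and `finisher` functions directly rather than going through a stream.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class SplitCombineDemo {
    // Accumulate both elements into one container (no splitting).
    public static <T, A, R> R collectWhole(Collector<T, A, R> c, T t1, T t2) {
        A a1 = c.supplier().get();
        c.accumulator().accept(a1, t1);
        c.accumulator().accept(a1, t2);
        return c.finisher().apply(a1);
    }

    // Accumulate each element into its own container, then combine.
    // With swap == true the combiner arguments are deliberately reversed.
    public static <T, A, R> R collectSplit(Collector<T, A, R> c, T t1, T t2, boolean swap) {
        A a2 = c.supplier().get();
        c.accumulator().accept(a2, t1);
        A a3 = c.supplier().get();
        c.accumulator().accept(a3, t2);
        A merged = swap ? c.combiner().apply(a3, a2) : c.combiner().apply(a2, a3);
        return c.finisher().apply(merged);
    }

    public static void main(String[] args) {
        // toList(): combining is associative but not commutative,
        // so swapping the combiner arguments changes the result.
        List<String> whole = collectWhole(Collectors.<String>toList(), "a", "b");
        List<String> split = collectSplit(Collectors.<String>toList(), "a", "b", false);
        List<String> swapped = collectSplit(Collectors.<String>toList(), "a", "b", true);
        System.out.println(whole.equals(split));    // true
        System.out.println(whole.equals(swapped));  // false

        // toSet(): documented as unordered; swapping makes no difference.
        Set<String> wholeSet = collectWhole(Collectors.<String>toSet(), "a", "b");
        Set<String> swappedSet = collectSplit(Collectors.<String>toSet(), "a", "b", true);
        System.out.println(wholeSet.equals(swappedSet)); // true
    }
}
```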

It should be noted that there are UNORDERED collectors in the JDK which don't guarantee commutativity. The main category is the "higher-order" collectors, which are composed with downstream collectors but do not enforce the UNORDERED property on them.

The nested Collector.Characteristics enum itself is fairly terse in its description, but if you spend a few seconds exploring the context you will notice that the containing Collector interface provides additional information:

For collectors that do not have the UNORDERED characteristic, two accumulated results a1 and a2 are equivalent if finisher.apply(a1).equals(finisher.apply(a2)). For unordered collectors, equivalence is relaxed to allow for non-equality related to differences in order. (For example, an unordered collector that accumulated elements to a List would consider two lists equivalent if they contained the same elements, ignoring order.)
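The List example from that passage can be sketched as follows. Everything named here (`UnorderedEquivalenceDemo`, `unorderedList`, `sameElementsIgnoringOrder`) is hypothetical. The collector is built with the standard `Collector.of` factory and declares `UNORDERED` itself, and the comparison helper is just one way of expressing "same elements, ignoring order".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class UnorderedEquivalenceDemo {
    // A list collector that explicitly gives up encounter order.
    public static <T> Collector<T, List<T>, List<T>> unorderedList() {
        return Collector.<T, List<T>>of(
                ArrayList::new,
                List::add,
                (left, right) -> { left.addAll(right); return left; },
                Collector.Characteristics.UNORDERED);
    }

    // Relaxed equivalence: same elements with the same multiplicities,
    // regardless of position.
    public static <T> boolean sameElementsIgnoringOrder(List<T> a, List<T> b) {
        return counts(a).equals(counts(b));
    }

    private static <T> Map<T, Long> counts(List<T> list) {
        return list.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Integer> sequential = IntStream.rangeClosed(1, 1000).boxed()
                .collect(Collectors.toList());
        List<Integer> parallel = IntStream.rangeClosed(1, 1000).boxed().parallel()
                .collect(UnorderedEquivalenceDemo.<Integer>unorderedList());

        // The lists may or may not be equal element by element,
        // but they are equivalent in the relaxed sense.
        System.out.println(sameElementsIgnoringOrder(sequential, parallel));
    }
}
```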


In OpenJDK, it looks like the reducing operations (min, sum, avg) have empty characteristics. I expected to find at least CONCURRENT and UNORDERED there.

At least for doubles, summation and averaging are definitely ordered and not concurrent, because the summation logic merges subresults rather than using a thread-safe accumulator.
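The characteristics of any collector can be checked directly through `Collector.characteristics()`. The following is a small sketch with hypothetical names (`CharacteristicsDemo`, `characteristicsOf`). The Javadoc describes `toSet()` as unordered and `groupingByConcurrent` as concurrent and unordered. What gets printed for the reducing collectors is an implementation detail of whichever JDK runs the code, so no expected output is shown.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class CharacteristicsDemo {
    // Small helper so the wildcard-typed collectors can be inspected uniformly.
    public static Set<Collector.Characteristics> characteristicsOf(Collector<?, ?, ?> collector) {
        return collector.characteristics();
    }

    public static void main(String[] args) {
        // Collection-building collectors.
        System.out.println("toList:               "
                + characteristicsOf(Collectors.toList()));
        System.out.println("toSet:                "
                + characteristicsOf(Collectors.toSet()));
        System.out.println("groupingByConcurrent: "
                + characteristicsOf(Collectors.groupingByConcurrent((String s) -> s.length())));

        // Reducing collectors mentioned in the question.
        System.out.println("minBy:                "
                + characteristicsOf(Collectors.minBy(Comparator.<String>naturalOrder())));
        System.out.println("summingInt:           "
                + characteristicsOf(Collectors.summingInt((String s) -> s.length())));
        System.out.println("averagingDouble:      "
                + characteristicsOf(Collectors.averagingDouble((Double d) -> d)));
    }
}
```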
