
estimateSize() on sequential Spliterator

I'm implementing a Spliterator that explicitly restricts parallelization by having trySplit() return null. Would implementing estimateSize() offer any performance improvements for a stream produced by this spliterator? Or is the estimated size only useful for parallelization?

EDIT: To clarify, I'm specifically asking about an estimated size. In other words, my spliterator does not have the SIZED characteristic.
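For reference, a spliterator of the kind described might look roughly like the sketch below. The class names SequentialSpliterator and SequentialSpliteratorDemo, the wrapped Iterator, and the caller-supplied estimate are all assumptions made for illustration; they are not taken from the original code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Hypothetical example: wraps an Iterator, never splits, and reports only
// a caller-supplied size estimate (no SIZED characteristic).
final class SequentialSpliterator<T> implements Spliterator<T> {
    private final Iterator<T> source;
    private long estimate; // a hint only, not guaranteed to be exact

    SequentialSpliterator(Iterator<T> source, long estimate) {
        this.source = source;
        this.estimate = estimate;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        if (!source.hasNext()) {
            return false;
        }
        action.accept(source.next());
        if (estimate > 0 && estimate != Long.MAX_VALUE) {
            estimate--; // keep the hint roughly in step with traversal
        }
        return true;
    }

    @Override
    public Spliterator<T> trySplit() {
        return null; // refuse to split, so traversal stays sequential
    }

    @Override
    public long estimateSize() {
        return estimate;
    }

    @Override
    public int characteristics() {
        return ORDERED; // deliberately neither SIZED nor SUBSIZED
    }
}

final class SequentialSpliteratorDemo {
    // Builds a sequential stream from the spliterator and collects it.
    static List<String> collect(List<String> input, long estimate) {
        Spliterator<String> sp = new SequentialSpliterator<>(input.iterator(), estimate);
        return StreamSupport.stream(sp, false).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(collect(Arrays.asList("a", "b", "c"), 10));
    }
}
```

The estimate here is intentionally allowed to be wrong (10 for three elements); the stream still yields exactly the elements the iterator produces.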

Looking at the call hierarchy of the relevant spliterator characteristic (SIZED) reveals that it is at least relevant for stream.toArray() performance.

Additionally, there is an equivalent flag in the internal stream implementation that seems to be used for sorting.

So, aside from parallel stream operations, the size estimate seems to be used for those two operations.

I don't claim exhaustiveness for my search, so just take these as examples.
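To make the distinction between an exact size and a mere estimate concrete, here is a minimal sketch. The class SizedVersusEstimated and its helper estimatedOnly are hypothetical names introduced for illustration. It relies only on the default Spliterator.getExactSizeIfKnown() method, which returns -1 unless the spliterator reports SIZED.

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

final class SizedVersusEstimated {
    // Hypothetical helper: a spliterator over `values` that reports
    // `estimate` from estimateSize() but does not claim SIZED.
    static Spliterator<Integer> estimatedOnly(int[] values, long estimate) {
        return new Spliterators.AbstractSpliterator<Integer>(estimate, Spliterator.ORDERED) {
            private int index = 0;

            @Override
            public boolean tryAdvance(Consumer<? super Integer> action) {
                if (index >= values.length) {
                    return false;
                }
                action.accept(values[index++]);
                return true;
            }
        };
    }

    public static void main(String[] args) {
        Spliterator<Integer> sized = Arrays.asList(1, 2, 3).spliterator();
        Spliterator<Integer> estimated = estimatedOnly(new int[] {1, 2, 3}, 3);

        // A SIZED spliterator exposes its exact size to the pipeline.
        System.out.println(sized.hasCharacteristics(Spliterator.SIZED));
        System.out.println(sized.getExactSizeIfKnown());

        // Without SIZED, the estimate is not treated as an exact size.
        System.out.println(estimated.hasCharacteristics(Spliterator.SIZED));
        System.out.println(estimated.estimateSize());
        System.out.println(estimated.getExactSizeIfKnown());

        // toArray() produces the same elements either way.
        Object[] result = StreamSupport.stream(estimatedOnly(new int[] {1, 2, 3}, 3), false).toArray();
        System.out.println(Arrays.toString(result));
    }
}
```

Whether a particular terminal operation actually takes a faster path for SIZED sources is an implementation detail of the stream library; the sketch only shows what information each spliterator makes available.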


Without the SIZED characteristic, I can only find calls to estimateSize() that are relevant to parallel execution of the stream pipeline.

Of course, this might change in the future, or a Stream implementation other than the standard JDK one could behave differently.

A spliterator may traverse elements:

1. Individually (tryAdvance())

2. Sequentially in bulk (forEachRemaining())
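The two traversal modes can be sketched as follows. The class TraversalModes and its method traverse are hypothetical names used only for this example; the spliterator comes from an ordinary List.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

final class TraversalModes {
    // Consumes the first element individually, then the rest in bulk.
    static List<String> traverse(List<String> input) {
        Spliterator<String> sp = input.spliterator();
        List<String> seen = new ArrayList<>();
        sp.tryAdvance(seen::add);       // individually: at most one element
        sp.forEachRemaining(seen::add); // in bulk: everything that is left
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(traverse(Arrays.asList("a", "b", "c")));
    }
}
```

Both calls advance the same underlying cursor, so each element is visited exactly once regardless of how the two modes are mixed.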

According to the Javadoc, estimateSize() comes in handy during splitting:

Spliterators can provide an estimate of the number of remaining elements via the estimateSize() method. Ideally, as reflected in characteristic SIZED, this value corresponds exactly to the number of elements that would be encountered in a successful traversal. However, even when not exactly known, an estimated value may still be useful to operations being performed on the source, such as helping to determine whether it is preferable to split further or traverse the remaining elements sequentially.

Since your spliterator does not have the SIZED characteristic and never splits (so there is no parallelism), estimateSize() will not offer any performance benefit. Keep in mind, however, that the Javadoc of estimateSize() itself does not mention parallelism; all it states is:

Returns: the estimated size, or Long.MAX_VALUE if infinite, unknown, or too expensive to compute.
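As a small illustration of the "unknown" case, the JDK's own Spliterators.spliteratorUnknownSize() wraps an Iterator without any size information. The class UnknownSizeDemo and its method estimateOfUnknown below are hypothetical names for this sketch.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;

final class UnknownSizeDemo {
    // Wraps the iterator with no size information and asks for the estimate.
    static <T> long estimateOfUnknown(Iterator<T> it) {
        Spliterator<T> sp = Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
        return sp.estimateSize();
    }

    public static void main(String[] args) {
        long estimate = estimateOfUnknown(Arrays.asList(1, 2, 3).iterator());
        // With no size information, the estimate is Long.MAX_VALUE.
        System.out.println(estimate == Long.MAX_VALUE);
    }
}
```

A custom sequential spliterator that cannot cheaply estimate its size can return Long.MAX_VALUE in the same way.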

