
Order of intermediate operations

Does the order in which the intermediate operations are called affect the performance of streams?

Example 1)

List<Item> myList = someItems;

myList.stream().filter(Item::isGreen).sorted(Comparator.comparing(Item::getSomeValue)).....

in comparison to

myList.stream().sorted(Comparator.comparing(Item::getSomeValue)).filter(Item::isGreen).....

Example 2)

myList.stream().filter(Item::isGreen).distinct()...

in comparison to

myList.stream().distinct().filter(Item::isGreen)...

Example 3)

Stream.of(42,13,29,23,7,888).sorted().map(i -> new Item(i))...

in comparison to

Stream.of(42,13,29,23,7,888).map(i -> new Item(i)).sorted(Comparator.comparingInt(Item::getSomeInt))...

I suppose that in the first example it makes more sense to filter first, since the sort then only has to handle the (possibly much shorter) filtered list rather than the whole list.

For the second and third examples, though, I'm not sure what the best order is. I'm not after unnecessary or premature optimization; I'm just curious whether the operations are executed in the order they appear and, if so, whether that affects performance, or whether the compiler optimizes the calls internally.

Yes, it does matter.

Example 1

I suppose that in the first example it makes more sense to filter first, since the sort then only has to handle the (possibly much shorter) filtered list rather than the whole list.

Correct. Filtering elements out before sorting leaves fewer elements to sort, and this can make a big difference if many elements are filtered out.
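One way to see the effect is to count how often the comparator is invoked in each ordering. The following is only a sketch: the Item class is a hypothetical stand-in for the one in the question, and the data (1,000 pseudo-random values, every tenth item green) and the helper names are made up for illustration. Comparator calls are a rough proxy for sorting work, not a benchmark.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;

class FilterBeforeSortDemo {

    // Hypothetical stand-in for the Item class from the question.
    static final class Item {
        private final int someValue;
        private final boolean green;

        Item(int someValue, boolean green) {
            this.someValue = someValue;
            this.green = green;
        }

        int getSomeValue() { return someValue; }
        boolean isGreen() { return green; }
    }

    // 1,000 items with pseudo-random values (fixed seed); every tenth item is green.
    static List<Item> sampleItems() {
        Random random = new Random(42);
        List<Item> items = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            items.add(new Item(random.nextInt(1_000_000), i % 10 == 0));
        }
        return items;
    }

    // Wraps the comparator from the question so that every call is counted.
    static Comparator<Item> countingComparator(AtomicLong counter) {
        Comparator<Item> byValue = Comparator.comparing(Item::getSomeValue);
        return (a, b) -> {
            counter.incrementAndGet();
            return byValue.compare(a, b);
        };
    }

    // filter(...).sorted(...): only the green items reach the sort.
    static long comparisonsFilterFirst(List<Item> items) {
        AtomicLong counter = new AtomicLong();
        items.stream()
             .filter(Item::isGreen)
             .sorted(countingComparator(counter))
             .collect(Collectors.toList());
        return counter.get();
    }

    // sorted(...).filter(...): every item is sorted, then most are discarded.
    static long comparisonsSortFirst(List<Item> items) {
        AtomicLong counter = new AtomicLong();
        items.stream()
             .sorted(countingComparator(counter))
             .filter(Item::isGreen)
             .collect(Collectors.toList());
        return counter.get();
    }

    public static void main(String[] args) {
        List<Item> items = sampleItems();
        System.out.println("filter, then sort: " + comparisonsFilterFirst(items) + " comparisons");
        System.out.println("sort, then filter: " + comparisonsSortFirst(items) + " comparisons");
    }
}
```

With this data the filter-first pipeline sorts about a tenth of the elements, so it should report far fewer comparisons. The exact numbers depend on the input and on the JDK's sort implementation.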

Example 2

This depends on the implementation of Item and on what your data actually consists of. If filter() removes many elements, you should put it first. However, if isGreen were a costly operation and you had many duplicate elements, you should eliminate the duplicates first.
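To illustrate the second case, the sketch below counts how often the predicate runs when the data contains many duplicates. It assumes a hypothetical Item with value-based equals() and hashCode() (distinct() relies on them) and an invented data set of 1,000 items with only 10 distinct values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;

class DistinctVsFilterDemo {

    // Hypothetical Item; distinct() uses equals() and hashCode().
    static final class Item {
        private final int id;
        private final boolean green;

        Item(int id, boolean green) {
            this.id = id;
            this.green = green;
        }

        boolean isGreen() { return green; }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Item)) {
                return false;
            }
            Item that = (Item) other;
            return id == that.id && green == that.green;
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, green);
        }
    }

    // 1,000 items but only 10 distinct ones; items with an even id are green.
    static List<Item> sampleItems() {
        List<Item> items = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            int id = i % 10;
            items.add(new Item(id, id % 2 == 0));
        }
        return items;
    }

    // filter(...).distinct(): the predicate sees every element.
    static long predicateCallsFilterFirst(List<Item> items) {
        AtomicLong calls = new AtomicLong();
        items.stream()
             .filter(item -> { calls.incrementAndGet(); return item.isGreen(); })
             .distinct()
             .collect(Collectors.toList());
        return calls.get();
    }

    // distinct().filter(...): the predicate sees only the distinct elements.
    static long predicateCallsDistinctFirst(List<Item> items) {
        AtomicLong calls = new AtomicLong();
        items.stream()
             .distinct()
             .filter(item -> { calls.incrementAndGet(); return item.isGreen(); })
             .collect(Collectors.toList());
        return calls.get();
    }

    public static void main(String[] args) {
        List<Item> items = sampleItems();
        System.out.println("filter first:   " + predicateCallsFilterFirst(items) + " predicate calls");
        System.out.println("distinct first: " + predicateCallsDistinctFirst(items) + " predicate calls");
    }
}
```

On this data the predicate runs 1,000 times when filter() comes first and 10 times when distinct() comes first. Which ordering is cheaper overall still depends on how expensive isGreen is compared with the hashing that distinct() performs, so it is worth measuring with real data.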

Example 3

Sorting the Integer objects directly is likely faster than sorting the Item objects by their int values, as the latter has to go through the Comparator, whereas the Integer objects can be compared directly and the JIT could potentially apply more optimizations to them. However, you might want to use IntStream so that you can work with the ints directly instead of dealing with wrapper objects; sorting primitives could technically also use more sophisticated sorting algorithms such as radix sort.
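For reference, here is a sketch of the three variants side by side, including the IntStream one. The Item class is again a hypothetical stand-in with a single int field, and the method names are invented. The code only shows that all three pipelines produce the same order; it makes no claim about which is fastest.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class IntStreamSortDemo {

    // Hypothetical stand-in for the Item class from the question.
    static final class Item {
        private final int someInt;

        Item(int someInt) { this.someInt = someInt; }

        int getSomeInt() { return someInt; }
    }

    // Variant from the question: sort the boxed Integers, then map to Item.
    static List<Integer> sortBoxedThenMap() {
        return Stream.of(42, 13, 29, 23, 7, 888)
                     .sorted()
                     .map(Item::new)
                     .map(Item::getSomeInt)
                     .collect(Collectors.toList());
    }

    // Variant from the question: map to Item, then sort with a Comparator.
    static List<Integer> mapThenSortWithComparator() {
        return Stream.of(42, 13, 29, 23, 7, 888)
                     .map(Item::new)
                     .sorted(Comparator.comparingInt(Item::getSomeInt))
                     .map(Item::getSomeInt)
                     .collect(Collectors.toList());
    }

    // IntStream variant: sort primitive ints, then create the Items.
    static List<Integer> sortPrimitivesThenMap() {
        return IntStream.of(42, 13, 29, 23, 7, 888)
                        .sorted()
                        .mapToObj(Item::new)
                        .map(Item::getSomeInt)
                        .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sortBoxedThenMap());
        System.out.println(mapThenSortWithComparator());
        System.out.println(sortPrimitivesThenMap());
    }
}
```

Each call prints [7, 13, 23, 29, 42, 888]. The IntStream version avoids boxing during the sort; whether that matters for a given workload is something to measure.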

In general

It is likely better to filter out elements before doing intensive operations like sorting.

Keep in mind that you shouldn't optimize prematurely (you seem to know that, though) and always test performance gains.

