
What is the most computationally efficient way to flatmap a List of Lists?

My server has a bottleneck caused by one specific operation: resolving a List<List<SomeObject>> into a List<SomeObject>. The server's CPU usage spiked well above its normal level.

The data structure is:

Object:
List<SomeObject> childList;

I am trying to flatten a List<Object> into a List<SomeObject> in the most computationally efficient way. With parentList being the List<Object>, I tried:

parentList.stream().flatMap(child -> child.getChildList().stream()).collect(Collectors.toList())

Also tried:

List<SomeObject> all = new ArrayList<>();
parentList.forEach(child -> all.addAll(child.getChildList()));

Any other suggestions? These seem similar in cost, and both are fairly expensive because of the copying under the hood.

This may be more efficient since it avoids creating a separate stream for each inner list, as flatMap does. mapMulti was introduced in Java 16. It takes each streamed element and a consumer that puts values onto the resulting stream, in this case each list's objects.

List<List<Object>> lists =  new ArrayList<>(
                List.of(List.of("1", "2", "3"),
                List.of("4", "5", "6", "7"),
                List.of("8", "9")));


List<Object> list = lists.stream().mapMulti(
         (lst, consumer) -> lst.forEach(consumer))
         .toList();

System.out.print(list);

prints

[1, 2, 3, 4, 5, 6, 7, 8, 9]
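Applied to the structure from the question, the same idea might look like the sketch below. The Parent class is a hypothetical stand-in for the question's parent type, and String stands in for SomeObject; neither name comes from the original post. The explicit <String> type witness is there because the compiler cannot infer the element type from the lambda alone. Requires Java 16 or later.

```java
import java.util.List;

class MapMultiFlatten {
    // Hypothetical stand-in for the question's parent type; String plays the role of SomeObject.
    static class Parent {
        private final List<String> childList;

        Parent(List<String> childList) {
            this.childList = childList;
        }

        List<String> getChildList() {
            return childList;
        }
    }

    // Pushes every child directly into the downstream sink,
    // so no intermediate stream is created per parent.
    static List<String> flatten(List<Parent> parentList) {
        return parentList.stream()
                .<String>mapMulti((parent, sink) -> parent.getChildList().forEach(sink))
                .toList();
    }

    public static void main(String[] args) {
        List<Parent> parents = List.of(
                new Parent(List.of("a", "b")),
                new Parent(List.of()),
                new Parent(List.of("c")));
        System.out.println(flatten(parents)); // prints [a, b, c]
    }
}
```

Whether this is actually faster than flatMap for a given workload is something only a measurement on that workload can tell.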

In Java 8:

List<Object> listOne = new ArrayList<>();
List<Object> listTwo = new ArrayList<>();
List<Object> listThree = new ArrayList<>(); 
...  

Stream.of(...) concatenates many lists:

List<Object> newList = Stream.of(listOne,listTwo,listThree).flatMap(Collection::stream).collect(Collectors.toList());

In Java 16+

List<Object> newList = Stream.concat(Stream.concat(listOne.stream(), listTwo.stream()), listThree.stream()).toList();

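Like an ETL ("Extract, Transform, Load") process, a stream pushes a collection's data through a pipeline of processing stages. A stream is sequential by default; it uses multiple threads of execution only when it is made parallel, for example with parallelStream() or parallel().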
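The threading behavior can be checked directly. The sketch below (class and method names are made up for illustration) flattens the same input with a sequential and an explicitly parallel pipeline and reports each stream's mode via isParallel(). Because the source lists are ordered, collecting to a list keeps the encounter order in both cases.

```java
import java.util.List;
import java.util.stream.Collectors;

class StreamModes {
    // Collection.stream() yields a sequential stream.
    static List<String> flattenSequential(List<List<String>> lists) {
        return lists.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    // Same pipeline, with parallel execution requested explicitly.
    static List<String> flattenParallel(List<List<String>> lists) {
        return lists.stream()
                .parallel()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> lists = List.of(List.of("1", "2"), List.of("3"));
        System.out.println(lists.stream().isParallel());            // prints false
        System.out.println(lists.stream().parallel().isParallel()); // prints true
        // Both variants return the elements in the same order.
        System.out.println(flattenSequential(lists).equals(flattenParallel(lists))); // prints true
    }
}
```

For small lists the coordination cost of a parallel stream can outweigh any gain, so it is worth measuring before switching.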

Do we know more about which List implementation is used?

I would try to initialize the resulting list with the expected size. This avoids unnecessary copying as the list grows. It assumes that the size of each child list can be retrieved quickly.

int expectedSize = parentList.stream()
                             .mapToInt(entry -> entry.getChildList().size())
                             .sum();
List<SomeObject> result = new ArrayList<>(expectedSize);
for (var entry : parentList) {
   result.addAll(entry.getChildList());
}
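If the same pattern is needed for several parent types, it could be wrapped in a small generic helper. This is only a sketch: the class name, the flatten method, and the childrenOf parameter are made up for illustration and are not part of the original answer. It makes one pass to sum the sizes and a second pass to copy, so it assumes the extractor is cheap and returns the same list both times.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class PresizedFlatten {
    // Flattens the child lists of all parents into one pre-sized ArrayList.
    // childrenOf is a hypothetical extractor, e.g. Parent::getChildList.
    static <P, C> List<C> flatten(List<P> parents, Function<P, List<C>> childrenOf) {
        // First pass: total number of elements, so the backing array is allocated once.
        int expectedSize = 0;
        for (P parent : parents) {
            expectedSize += childrenOf.apply(parent).size();
        }
        // Second pass: bulk-copy each child list into the result.
        List<C> result = new ArrayList<>(expectedSize);
        for (P parent : parents) {
            result.addAll(childrenOf.apply(parent));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> lists = List.of(List.of("1", "2", "3"), List.of("4"));
        // Here each "parent" is itself the child list, so the extractor is the identity.
        List<String> flat = PresizedFlatten.<List<String>, String>flatten(lists, l -> l);
        System.out.println(flat); // prints [1, 2, 3, 4]
    }
}
```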

One way to make the flat mapping more computationally efficient is to use a for loop instead of the stream API or forEach method. The for loop would iterate over the parent list, and for each element, it would add the child list to the flat list. This avoids the overhead of creating streams and using the collect method. Additionally, using an ArrayList to store the flat list instead of a LinkedList can also improve performance since it has a more efficient implementation of the addAll method.

List<SomeObject> flatList = new ArrayList<>();
for (Object o : parentList) {
    flatList.addAll(o.getChildList());
}

Another way would be to use an explicit Iterator, the interface for traversing a collection. Note that the enhanced for loop already uses an Iterator internally, so this form is not expected to be faster; it is equivalent to the loop above.

List<SomeObject> flatList = new ArrayList<>();
Iterator<Object> iterator = parentList.iterator();
while (iterator.hasNext()) {
    Object o = iterator.next();
    flatList.addAll(o.getChildList());
}

Note that java.util.List has no concat method. To append one list to another, use addAll; Stream.concat exists, but it joins two streams rather than two lists.

List<SomeObject> flatList = new ArrayList<>();
for (Object o : parentList) {
    flatList.addAll(o.getChildList());
}

There are several resources that you can use for additional reading on this topic. Here are a few that I would recommend:

https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/util/List.html

https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/util/ArrayList.html

https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/util/Iterator.html

https://www.oreilly.com/library/view/java-performance-the/9781449358652/

https://www.tutorialspoint.com/java_data_structure_algorithms/index.htm
