
Should I update a shared mutable variable in Java 8 Streams?

I am iterating over the list below and adding each element to another shared mutable list using Java 8 streams.

List<String> list1 = Arrays.asList("A1","A2","A3","A4","A5","A6","A7","A8","B1","B2","B3");
List<String> list2 = new ArrayList<>();

Consumer<String> c = t -> list2.add(t.startsWith("A") ? t : "EMPTY");

list1.stream().forEach(c);
list1.parallelStream().forEach(c);
list1.forEach(c);

What is the difference between the three iterations above, and which one should we use? Are there any other considerations?

Regardless of whether you use a parallel or a sequential Stream, you shouldn't use forEach when your goal is to generate a List. Use map with collect:

List<String> list2 =
    list1.stream()
         .map(item -> item.startsWith("A") ? item : "EMPTY")
         .collect(Collectors.toList());
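
To make the map/collect suggestion concrete, here is a minimal, self-contained sketch. The class name MapCollectExample and the helper replaceNonA are hypothetical names chosen for illustration; they are not part of the question or of any library. It shows that the same pipeline can be run sequentially or in parallel without any shared mutable list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not from the original question.
class MapCollectExample {

    // Builds a new list from the source without touching any shared state.
    static List<String> replaceNonA(List<String> source, boolean parallel) {
        Stream<String> stream = parallel ? source.parallelStream() : source.stream();
        return stream
                .map(item -> item.startsWith("A") ? item : "EMPTY")
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("A1", "A2", "A3", "B1", "B2");
        // collect keeps the encounter order of the source list in both modes.
        System.out.println(replaceNonA(list1, false)); // [A1, A2, A3, EMPTY, EMPTY]
        System.out.println(replaceNonA(list1, true));  // [A1, A2, A3, EMPTY, EMPTY]
    }
}
```

Because the lambda passed to map has no side effects, switching between stream() and parallelStream() does not change the result.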

Functionally speaking, in simple cases they are almost the same, but there are some hidden differences:

  1. Let's start by quoting the Javadoc of forEach for Iterable, which states that it:

performs the given action for each element of the Iterable until all elements have been processed or the action throws an exception.

We can also iterate over a collection and perform a given action on each element by passing an object that implements the Consumer interface:

void forEach(Consumer<? super T> action)

https://docs.oracle.com/javase/8/docs/api/java/lang/Iterable.html#forEach-java.util.function.Consumer-


  2. The order of Stream.forEach is explicitly nondeterministic (on a parallel stream it does not have to respect the encounter order), while Iterable.forEach is executed in the iteration order of the Iterable, if one is specified.

  3. If Iterable.forEach is iterating over a synchronized collection (for example one created by Collections.synchronizedList), Iterable.forEach takes the collection's lock once and holds it across all the calls to the action method. The Stream.forEach call uses the collection's spliterator, which does not lock.

  4. The action specified in Stream.forEach is required to be non-interfering, while Iterable.forEach is allowed to set values in the underlying ArrayList without problems.

  5. In Java, Iterators returned by Collection classes, e.g. ArrayList, HashSet, Vector, etc., are fail-fast. This means that if you try to add() or remove() from the underlying data structure while iterating over it (other than through the iterator's own remove method), you get a ConcurrentModificationException, on a best-effort basis.

https://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html#fail-fast
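
Two of the points above (ordering and fail-fast behaviour) can be sketched in a few lines. The class ForEachDifferences and its methods are hypothetical names used only for this illustration. Note that fail-fast detection is specified as best-effort; the sketch relies on ArrayList.forEach checking the modification count, which is how the OpenJDK implementation behaves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class, not from the original answers.
class ForEachDifferences {

    // forEachOrdered respects the encounter order even on a parallel stream,
    // unlike forEach, which gives no ordering guarantee there.
    static List<String> orderedParallel(List<String> source) {
        List<String> out = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEachOrdered(out::add);
        return new ArrayList<>(out);
    }

    // Structurally modifying an ArrayList inside its own forEach is detected
    // and reported with a ConcurrentModificationException.
    static boolean failsFast() {
        List<String> list = new ArrayList<>(Arrays.asList("A1", "A2", "A3"));
        try {
            list.forEach(s -> list.add("X"));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(orderedParallel(Arrays.asList("A1", "A2", "B1")));
        System.out.println("fail-fast triggered: " + failsFast());
    }
}
```

If ordering matters and a side effect is unavoidable, forEachOrdered is the safer choice, at the cost of giving up most of the benefit of parallel execution for that step.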


More Info:

I personally believe that when working with streams, you should write your code so that switching to parallel streams does not break it (produce wrong results). Imagine that your code reads and writes the same shared memory (list2) and you distribute the work across several threads (using parallel streams). ArrayList is not thread-safe, so elements can be lost or an exception can be thrown, and then you are doomed. Therefore you have several options.

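Make your shared memory (list2) thread-safe, for example by using an AtomicReference. The reference must hold a list from the start, and the update function should return a new list instead of mutating the current one, because ArrayList itself is not thread-safe and the function may be re-applied when threads contend:

List<String> list2 = new ArrayList<>();
// list2 is only the initial value; read the current list with listSafe.get().
AtomicReference<List<String>> listSafe = new AtomicReference<>(list2);
// Return a fresh copy instead of mutating in place, since the function may be retried.
listSafe.getAndUpdate(strings -> { List<String> copy = new ArrayList<>(strings); copy.add("newvalue"); return copy; });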

Or you can go with the purely functional approach (code with no side effects), like @Eran's solution with map and collect.
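
As a rough comparison of the two options, the sketch below runs the question's transformation on a parallel stream, once with a thread-safe shared list and once with collect. Collections.synchronizedList is used here as a simpler stand-in for the AtomicReference variant; that substitution, the class name SharedListOptions, and its helper methods are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class, not from the original answers.
class SharedListOptions {

    // Generates test input such as A0, B1, A2, B3, ...
    static List<String> input(int n) {
        return IntStream.range(0, n)
                .mapToObj(i -> (i % 2 == 0 ? "A" : "B") + i)
                .collect(Collectors.toList());
    }

    // Option 1: a thread-safe shared list. No element is lost,
    // but the order of the result is not guaranteed.
    static List<String> viaSynchronizedList(List<String> source) {
        List<String> shared = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream()
              .forEach(t -> shared.add(t.startsWith("A") ? t : "EMPTY"));
        return shared;
    }

    // Option 2: no side effects. The result keeps the encounter order.
    static List<String> viaCollect(List<String> source) {
        return source.parallelStream()
                .map(t -> t.startsWith("A") ? t : "EMPTY")
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> source = input(10);
        System.out.println(viaSynchronizedList(source)); // same elements, order may vary
        System.out.println(viaCollect(source));          // same order as the source
    }
}
```

Both variants produce the same elements; only the collect version also guarantees their order, which is one more reason to prefer it.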
