Java 8 parallel stream confusion/issue
I am new to parallel streams and am trying to write a sample program that calculates value * 100 for the values 1 to 100 and stores each result in a map. When I run the code, I get a different count on each run. I may be going wrong somewhere, so please point me to the proper way to do this.
Code:
import java.util.*;
import java.lang.*;
import java.io.*;
import java.util.stream.Collectors;
public class Main{
static int l = 0;
public static void main (String[] args) throws java.lang.Exception {
letsGoParallel();
}
public static int makeSomeMagic(int data) {
l++;
return data * 100;
}
public static void letsGoParallel() {
List<Integer> dataList = new ArrayList<>();
for(int i = 1; i <= 100 ; i++) {
dataList.add(i);
}
Map<Integer, Integer> resultMap = new HashMap<>();
dataList.parallelStream().map(f -> {
Integer xx = 0;
{
xx = makeSomeMagic(f);
}
resultMap.put(f, xx);
return 0;
}).collect(Collectors.toList());
System.out.println("Input Size: " + dataList.size());
System.out.println("Size: " + resultMap.size());
System.out.println("Function Called: " + l);
}
}
Last output:
Input Size: 100
Size: 100
Function Called: 98
The output differs on each run. I want to use parallel streams in my own application, but this confusion/issue is stopping me. In my application I have 100-200 unique numbers, and the same operation needs to be performed on each of them. In short, there is a function that processes something.
Your accesses to the HashMap and to the l variable are both not thread safe, which is why the output differs on each run.
The correct way to do what you are trying to do is to collect the Stream elements into a Map:
// Requires: import java.util.function.Function;
Map<Integer, Integer> resultMap =
dataList.parallelStream()
.collect(Collectors.toMap(Function.identity(), Main::makeSomeMagic));
EDIT: With this code the l variable is still updated in a way that is not thread safe, so you will have to add your own thread safety if the final value of the variable matters to you.
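One possible way to add that thread safety is to replace the plain int counter with a java.util.concurrent.atomic.AtomicInteger. The sketch below is only an illustration: the class name CountedMagic and the field name calls are made up for this example, and the rest mirrors the question's code.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class CountedMagic {
    // Hypothetical thread-safe replacement for the plain "static int l" counter.
    static final AtomicInteger calls = new AtomicInteger();

    public static int makeSomeMagic(int data) {
        calls.incrementAndGet(); // atomic read-modify-write, unlike l++
        return data * 100;
    }

    public static Map<Integer, Integer> letsGoParallel() {
        List<Integer> dataList = IntStream.rangeClosed(1, 100)
                .boxed()
                .collect(Collectors.toList());
        // No shared map is mutated by the lambda; the collector builds the result.
        return dataList.parallelStream()
                .collect(Collectors.toMap(Function.identity(), CountedMagic::makeSomeMagic));
    }

    public static void main(String[] args) {
        Map<Integer, Integer> resultMap = letsGoParallel();
        System.out.println("Size: " + resultMap.size());
        System.out.println("Function Called: " + calls.get());
    }
}
```

Because the increment is atomic and the value mapper is applied once per element, the counter should match the input size after the stream completes.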
By putting values into resultMap you are relying on a side effect:
dataList.parallelStream().map(f -> {
Integer xx = 0;
{
xx = makeSomeMagic(f);
}
resultMap.put(f, xx);
return 0;
})
From the java.util.stream package documentation:
Stateless operations, such as filter and map, retain no state from previously seen element when processing a new element -- each element can be processed independently of operations on other elements.
Stream pipeline results may be nondeterministic or incorrect if the behavioral parameters to the stream operations are stateful. A stateful lambda (or other object implementing the appropriate functional interface) is one whose result depends on any state which might change during the execution of the stream pipeline.
An example similar to yours follows there, showing:
... if the mapping operation is performed in parallel, the results for the same input could vary from run to run, due to thread scheduling differences, whereas, with a stateless lambda expression the results would always be the same.
That explains your observation: the output differs on each run.
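For contrast, here is a minimal sketch of a stateless version of the same mapping. The class name StatelessDemo and the method name timesHundred are invented for this illustration. The lambda reads and writes nothing outside its own argument, so repeated parallel runs over the same input should produce equal results.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class StatelessDemo {
    // Stateless: the result for each element depends only on that element.
    public static List<Integer> timesHundred(List<Integer> input) {
        return input.parallelStream()
                .map(x -> x * 100)
                .collect(Collectors.toList()); // keeps encounter order for an ordered source
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 100)
                .boxed()
                .collect(Collectors.toList());
        List<Integer> first = timesHundred(input);
        // Re-run the pipeline several times and compare against the first result.
        for (int run = 0; run < 10; run++) {
            if (!first.equals(timesHundred(input))) {
                throw new AssertionError("results differed on run " + run);
            }
        }
        System.out.println("All runs produced the same " + first.size() + " results");
    }
}
```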
This should work: make the makeSomeMagic function synchronized, use the thread-safe data structure ConcurrentHashMap, and write one simple statement:
dataList.parallelStream().forEach(f -> resultMap.put(f, makeSomeMagic(f)));
The whole code is here:
import java.util.*;
import java.util.concurrent.ConcurrentHashMap; // not covered by java.util.*
import java.lang.*;
import java.io.*;
import java.util.stream.Collectors;
public class Main{
static int l = 0;
public static void main (String[] args) throws java.lang.Exception {
letsGoParallel();
}
public synchronized static int makeSomeMagic(int data) { // make it synchronized
l++;
return data * 100;
}
public static void letsGoParallel() {
List<Integer> dataList = new ArrayList<>();
for(int i = 1; i <= 100 ; i++) {
dataList.add(i);
}
Map<Integer, Integer> resultMap = new ConcurrentHashMap<>();// use ConcurrentHashMap
dataList.parallelStream().forEach(f -> resultMap.put(f, makeSomeMagic(f)));
System.out.println("Input Size: " + dataList.size());
System.out.println("Size: " + resultMap.size());
System.out.println("Function Called: " + l);
}
}
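Note that a synchronized makeSomeMagic lets only one thread execute it at a time, so when the method is the whole workload, much of the parallelism is likely lost. As an alternative sketch (not the answerer's code), the counter can be a java.util.concurrent.atomic.LongAdder and the map can be built with Collectors.toConcurrentMap. The class name ConcurrentMagic, the field name calls, and the parameter n are assumptions made for this example.

```java
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ConcurrentMagic {
    // Hypothetical counter: LongAdder is built for many threads incrementing at once.
    static final LongAdder calls = new LongAdder();

    // No synchronized keyword, so threads do not wait on each other here.
    public static int makeSomeMagic(int data) {
        calls.increment();
        return data * 100;
    }

    public static ConcurrentMap<Integer, Integer> letsGoParallel(int n) {
        return IntStream.rangeClosed(1, n)
                .boxed()
                .parallel()
                // The collector creates and fills the concurrent map itself.
                .collect(Collectors.toConcurrentMap(Function.identity(),
                        ConcurrentMagic::makeSomeMagic));
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, Integer> resultMap = letsGoParallel(100);
        System.out.println("Size: " + resultMap.size());
        System.out.println("Function Called: " + calls.sum());
    }
}
```

Whether this is actually faster than the synchronized version depends on how expensive the real operation is, so it is worth measuring in the target application.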
A Stream does the looping for you. Do not use non-thread-safe variables from multiple threads (parallelStream included). Write it like this instead:
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
public class ParallelStreamClient {
// static int l = 0; // no need to count the calls
public static void main(String[] args) throws java.lang.Exception {
letsGoParallel();
}
public static int makeSomeMagic(int data) {
// l++; // not thread-safe
return data * 100;
}
public static void letsGoParallel() {
List<Integer> dataList = new ArrayList<>();
for (int i = 1; i <= 100; i++) {
dataList.add(i);
}
Map<Integer, Integer> resultMap =
dataList.parallelStream().collect(Collectors.toMap(i -> i,ParallelStreamClient::makeSomeMagic));
System.out.println("Input Size: " + dataList.size());
System.out.println("Size: " + resultMap.size());
//System.out.println("Function Called: " + l);
}
}