Java: concurrent writes to a collection, then read: inconsistent results

Question

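I read here that there are several thread-safe Set options. In my application, 10 threads concurrently add items to one collection (it doesn't have to be a Set, but that is preferred). After all threads have finished, I need to iterate over the collection.

I read that both ConcurrentSkipListSet and Collections.newSetFromMap(new ConcurrentHashMap()) have bulk operations (addAll, removeAll, etc.) and iterators that are not guaranteed to be consistent. My experiments seem to confirm this. When I use ConcurrentSkipListSet, the reads after all threads have finished adding are somewhat random: the set has a different size from run to run.

Then I tried Collections.synchronizedSet(new HashSet<>()), which I believed should be thread-safe because it blocks concurrent write access. However, it seems to have the same inconsistent-read problem. I still get a different size for the resulting set from run to run.

What should I do to make sure the reads are consistent? As mentioned, I don't have to use a Set. I can use a List or something else, as long as there is a way to avoid adding duplicates.

Because the code is part of a very large package, it is hard to show all of it. But in general it looks like this: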

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.RecursiveTask;

public class MyRecursiveTask extends RecursiveTask<Integer> {
    private final List<String> tasks;
    protected final ConcurrentSkipListSet<String> dictionary;

    public MyRecursiveTask(ConcurrentSkipListSet<String> dictionary,
                           List<String> tasks) {
        this.dictionary = dictionary;
        this.tasks = tasks;
    }

    @Override
    protected Integer compute() {
        if (this.tasks.size() > 100) {
            // Large batch: split in two, fork both halves, then join them.
            List<MyRecursiveTask> subtasks = createSubtasks();
            int count = 0;
            for (MyRecursiveTask subtask : subtasks)
                subtask.fork();
            for (MyRecursiveTask subtask : subtasks)
                count += subtask.join();
            return count;
        } else {
            // Small batch: process directly and record each outcome.
            int count = 0;
            for (String task : tasks) {
                // code to process task
                String outcome = [method to do some task]
                dictionary.add(outcome);
                count++;
            }
            return count;
        }
    }

    private List<MyRecursiveTask> createSubtasks() {
        List<MyRecursiveTask> subtasks = new ArrayList<>();

        // Copy each half so the subtasks do not share a list view.
        int total = tasks.size() / 2;
        List<String> tasks1 = new ArrayList<>(tasks.subList(0, total));
        MyRecursiveTask subtask1 = new MyRecursiveTask(dictionary, tasks1);

        List<String> tasks2 = new ArrayList<>(tasks.subList(total, tasks.size()));
        MyRecursiveTask subtask2 = new MyRecursiveTask(dictionary, tasks2);

        subtasks.add(subtask1);
        subtasks.add(subtask2);

        return subtasks;
    }
}

Then this is the code that creates and runs these workers:

....
List<String> allTasks = new ArrayList<String>(100000);
....
//code to fill in "allTasks"
....

ConcurrentSkipListSet<String> dictionary = new ConcurrentSkipListSet<>();
// I also tried "dictionary = Collections.synchronizedSet(new HashSet<>())"
// and changed other bits of code accordingly.
ForkJoinPool forkJoinPool = new ForkJoinPool(10);
MyRecursiveTask mrt = new MyRecursiveTask(dictionary, allTasks);
int total = forkJoinPool.invoke(mrt);
System.out.println(dictionary.size()); // This value is a bit random. If the real
// size should be 999, one run may give 989, a second 999, a third 990, etc.

Thanks

Answer 1

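It is hard to tell what the problem is without seeing the actual code. My guess is that the thread reading the results runs too early, while some threads are still writing. Use Thread.join to wait for the writers. Collections.synchronizedSet is certainly thread-safe.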
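To illustrate the "wait for the writers" point, here is a minimal sketch. It does not reproduce the asker's ForkJoinPool setup; the class name JoinBeforeRead and the helper fillAndCount are made up for this example. Several threads add overlapping keys to a synchronized set, and the reader only looks at the set after joining every writer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class JoinBeforeRead {
    // Hypothetical helper: starts `writers` threads that each add the same
    // `perThread` keys, waits for all of them, then returns the set size.
    static int fillAndCount(int writers, int perThread) {
        Set<String> dictionary = Collections.synchronizedSet(new HashSet<>());
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < writers; w++) {
            Thread t = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    dictionary.add("item-" + i); // duplicates are ignored by the set
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // everything t wrote is visible after join returns
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return dictionary.size(); // read only after all writers are done
    }

    public static void main(String[] args) {
        System.out.println(fillAndCount(10, 999));
    }
}
```

With 10 writers that all add the same 999 keys, this prints 999 on every run. If the size still varies in the real code once all writers are known to be finished, the cause is more likely in what gets added than in the set itself.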

Consider this from the Javadoc:

It is imperative that the user manually synchronize on the returned set when iterating over it:

  Set s = Collections.synchronizedSet(new HashSet());
      ...
  synchronized (s) {
      Iterator i = s.iterator(); // Must be in the synchronized block
      while (i.hasNext())
          foo(i.next());
  }

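Failure to follow this advice may result in non-deterministic behavior. The returned set will be serializable if the specified set is serializable.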
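The same rule from the Javadoc, written with generics and a for-each loop, might look like the sketch below. The class SynchronizedIteration and the method totalLength are invented for illustration. The important part is that the whole iteration sits inside a block synchronized on the wrapper set itself, not on the underlying HashSet.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class SynchronizedIteration {
    // Hypothetical helper: sums the lengths of all strings in a set that was
    // created with Collections.synchronizedSet.
    static int totalLength(Set<String> syncSet) {
        int total = 0;
        synchronized (syncSet) { // lock the wrapper for the whole traversal
            for (String s : syncSet) { // for-each uses the iterator internally
                total += s.length();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Set<String> s = Collections.synchronizedSet(
                new HashSet<>(Arrays.asList("ab", "cde")));
        System.out.println(totalLength(s));
    }
}
```

Single calls such as add, contains, and size are already synchronized by the wrapper; only iteration needs the explicit block.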
