Is it safe in Java to read (not modify) objects which are not thread-safe (like LinkedList) from multiple threads?

There was already a question about whether threads can safely read/iterate a LinkedList simultaneously. It seems the answer is yes, as long as no one structurally changes the list (add/delete).

One answer, however, warned about "unflushed cache" and advised getting to know the "Java memory model". So I'm asking someone to elaborate on those "evil" caches. I'm a newbie, and so far I still naively believe that the following code is OK (at least according to my tests):

public static class workerThread implements Runnable {
    LinkedList<Integer> ll_only_for_read;
    PrintWriter writer;
    public workerThread(LinkedList<Integer> ll,int id2) throws Exception {
        ll_only_for_read = ll;
        writer = new PrintWriter("file."+id2, "UTF-8");
    }
    @Override
    public void run() {
        for(Integer i : ll_only_for_read) writer.println(" ll:"+i);
        writer.close();
    }
}

public static void main(String args[]) throws Exception{
    LinkedList<Integer> ll = new LinkedList<Integer>();
    for(int i=0;i<1e3;i++) ll.add(i);
    // do I need to call something special here? (in order to say:
    // "hey LinkedList, flush all your data from local cache;
    // you will now be a good boy and share that data among
    // a whole lot of interesting threads. Don't worry though, they will only read
    // you, no thread would dare to change you")
    new Thread(new workerThread(ll,1)).start();
    new Thread(new workerThread(ll,2)).start();
}

Yes, in your specific example code it's okay, since the act of starting the new thread establishes a happens-before relationship between populating the list and reading it from another thread. There are plenty of ways that a seemingly similar setup could be unsafe, however.
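
A minimal sketch of the pattern this answer describes: the list is fully populated before `Thread.start()` is called, so the started thread sees every element, and `join()` gives the same guarantee in the other direction for the result. The class and method names (`StartPublication`, `sumInNewThread`) are hypothetical and not taken from the question.

```java
import java.util.LinkedList;
import java.util.List;

public class StartPublication {
    // Sums the list from a freshly started thread and returns the result.
    static long sumInNewThread(List<Integer> list) {
        long[] result = new long[1];
        Thread reader = new Thread(() -> {
            long sum = 0;
            for (int value : list) sum += value;
            result[0] = sum;
        });
        reader.start(); // writes made before start() are visible to the reader
        try {
            reader.join(); // writes made by the reader are visible after join()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < 1000; i++) list.add(i); // populate first
        System.out.println(sumInNewThread(list));   // prints 499500
    }
}
```

If the list were instead handed to a thread that is already running (for example through a plain, non-volatile field), neither of these two guarantees would apply.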

I highly recommend reading "Java Concurrency in Practice" by Brian Goetz et al. for more details.

Although one answer was warning about "unflushed cache" and advising to know "java memory model".

I think you are referring to my answer to this question: Can Java LinkedList be read in multiple-threads safely?

So I'm asking to elaborate those "evil" caches.

They are not evil. They are just a fact of life ... and they affect the correctness (thread-safety) reasoning for multi-threaded applications.

The Java Memory Model is Java's answer to this fact of life. The memory model specifies with mathematical precision a bunch of rules that need to be obeyed to ensure that all possible executions of your application are "well-formed". (In simple terms: that your application is thread-safe.)

The Java Memory Model is ... difficult.

Someone recommended "Java Concurrency in Practice" by Brian Goetz et al. I concur. It is the best textbook on the topic of writing "classic" Java multi-threaded applications, and it has a good explanation of the Java Memory Model.

More importantly, Goetz et al. give you a simpler set of rules that are sufficient to give you thread-safety. These rules are still too detailed to condense into a StackOverflow answer ... but

  • one of the concepts is "safe publication", and
  • one of the principles is to use / re-use existing concurrency constructs rather than to roll your own concurrency mechanisms based on the Memory Model.
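
As an illustration of the second principle, here is a hedged sketch that leaves publication to an existing construct instead of reasoning about the Memory Model directly. It relies on the documented `java.util.concurrent` guarantee that actions before submitting a task to an executor happen-before the task runs, and that a task's actions happen-before `Future.get()` returns. The names `ExecutorPublication` and `sizesSeenByReaders` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorPublication {
    // Lets several pool threads iterate the same list and reports what each saw.
    static List<Integer> sizesSeenByReaders(List<Integer> shared, int readers) {
        ExecutorService pool = Executors.newFixedThreadPool(readers);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int r = 0; r < readers; r++) {
                // submit() publishes everything written to 'shared' so far
                futures.add(pool.submit(() -> {
                    int count = 0;
                    for (Integer ignored : shared) count++;
                    return count;
                }));
            }
            List<Integer> sizes = new ArrayList<>();
            for (Future<Integer> f : futures) sizes.add(f.get());
            return sizes;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) list.add(i);
        System.out.println(sizesSeenByReaders(list, 2)); // prints [1000, 1000]
    }
}
```

The list must still not be modified once the tasks have been submitted; the executor only takes care of the hand-over.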

I'm a newbie and so far I still naively believe that following code is ok.

It >>is<< correct. However ...

(at least from my tests)

... testing is NOT a guarantee of anything. The problem with non-thread-safe programs is that the faults are frequently not revealed by testing because they manifest randomly, with low probability, and often differently on different platforms.

You cannot rely on testing to tell you that your code is thread-safe. You need to reason [1] about the behaviour ... or follow a set of well-founded rules.

[1] And I mean real, well-founded reasoning ... not seat-of-the-pants intuitive stuff.

If your code creates and populates the list in a single thread, and only afterwards creates the other threads that concurrently access the list, there is no problem.

Problems can happen only when one thread can modify a value while other threads try to read the same value.

It can also be a problem if you change the objects you retrieve from the list (even if you don't change the list itself).

The way you're using it is fine, but only by coincidence.

Programs are rarely that trivial:

  • If the List contains references to other (mutable) data, then you'll get race conditions.
  • If someone modifies your 'reader' threads later in the code's lifecycle, then you'll get races.

Immutable data (and data structures) are by definition thread-safe. However, this is a mutable List, even though you're making the agreement with yourself that you won't modify it.

I'd recommend wrapping the List<> instance like this so the code fails immediately if someone tries to use any mutators on the List:

List<Integer> immutableList = Collections.unmodifiableList(ll); //...pass 'immutableList' to threads.

Link to unmodifiableList
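
A short sketch of what the wrapper does and does not protect against. `Collections.unmodifiableList` returns a read-only view, so mutators on the wrapper throw `UnsupportedOperationException`, but changes made through the original reference still show through. The class and helper names (`ReadOnlyView`, `rejectsMutation`) are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

public class ReadOnlyView {
    // Returns true if the list refuses add(), i.e. behaves as read-only.
    static boolean rejectsMutation(List<Integer> list) {
        try {
            list.add(42);
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        LinkedList<Integer> ll = new LinkedList<>();
        for (int i = 0; i < 5; i++) ll.add(i);

        List<Integer> readOnly = Collections.unmodifiableList(ll);
        System.out.println(rejectsMutation(readOnly)); // prints true

        // The wrapper is only a view: whoever still holds 'll' can change it.
        ll.add(99);
        System.out.println(readOnly.size()); // prints 6
    }
}
```

If no code should be able to change the data at all, `List.copyOf(ll)` (Java 10 and later) takes an independent unmodifiable copy instead of a view.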

It depends on how the object was created and made available to your thread. In general, no, it's not safe, even if the object isn't modified.

Following are some ways to make it safe.

First, create the object and perform any modification that is necessary; you can consider the object to be effectively immutable if no more modifications occur. Then, share the effectively immutable object with other threads by one of the following means:

  • Have other threads read the object from a field that is volatile.
  • Write a reference to the object inside a synchronized block, then have other threads read that reference while synchronized on the same lock.
  • Start the reading threads after the object is initialized, passing the object as a parameter. (This is what you are doing in your example, so you are safe.)
  • Pass the object between threads using a concurrent mechanism like a BlockingQueue implementation, or publish it in a concurrent collection, like a ConcurrentMap implementation.
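
The last option in the list above can be sketched as follows. The reader thread is started before the list even exists, so `Thread.start()` cannot be what publishes it; the hand-over relies on the `BlockingQueue` instead (placing an object into a concurrent collection happens-before its removal in another thread). `QueueHandoff` and `buildAfterStartAndHandOff` are names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueHandoff {
    // Starts the reader first, builds the list afterwards, then hands it over.
    static int buildAfterStartAndHandOff(int n) {
        BlockingQueue<List<Integer>> queue = new ArrayBlockingQueue<>(1);
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            try {
                List<Integer> received = queue.take(); // blocks until the list arrives
                seen[0] = received.size();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start(); // the list does not exist yet

        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) list.add(i); // all modifications happen here
        try {
            queue.put(list); // put() happens-before the matching take()
            reader.join();   // join() makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(buildAfterStartAndHandOff(1000)); // prints 1000
    }
}
```

After `put()` the producer must treat the list as read-only, otherwise the object is no longer effectively immutable.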

There might be others. Alternatively, you can make all of the fields of the shared object final (including all the fields of its Object members, and so on). Then it will be safe to share this object by any means across threads. That's one of the under-appreciated virtues of immutable types.
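
A minimal sketch of such a type, assuming Java 10 or later for `List.copyOf`. `Route` and its fields are hypothetical; the point is that every field is final, the list member is itself an unmodifiable copy, and `this` does not escape from the constructor.

```java
import java.util.List;

public final class Route {
    private final String name;
    private final List<Integer> stops;

    public Route(String name, List<Integer> stops) {
        this.name = name;
        // Defensive copy: later changes to the caller's list cannot leak in,
        // and the copy itself rejects mutation.
        this.stops = List.copyOf(stops);
    }

    public String name() {
        return name;
    }

    public List<Integer> stops() {
        return stops;
    }
}
```

An instance built this way can be handed to other threads without any of the publication mechanisms listed above, because final fields are guaranteed to be seen fully initialized once the constructor has finished.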

You need to guarantee a happens-before relationship between the writes to and the reads from your LinkedList, because they are done in separate threads.

The result of ll.add(i) will be visible to the new workerThread because Thread.start forms a happens-before relationship. So your example is thread-safe. See more about happens-before conditions.

However, be aware of the more complex situation where the LinkedList is iterated in worker threads while it is being modified by the main thread at the same time. Like this:

for(int i=0;i<1e3;i++) {
    ll.add(i);
    new Thread(new workerThread(ll,1)).start();
    new Thread(new workerThread(ll,2)).start();
}

This way a ConcurrentModificationException is possible.
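
The exception comes from the fail-fast check in the list's iterator. The sketch below triggers that same check deterministically in a single thread by adding to the list while iterating it; across threads the check is only best-effort, so a racy program may also fail silently instead of throwing. `FailFastDemo` and `addWhileIterating` are hypothetical names.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;

public class FailFastDemo {
    // Returns true if modifying the list during iteration was detected.
    static boolean addWhileIterating(List<Integer> list) {
        try {
            for (Integer value : list) {
                list.add(value); // structural change while an iterator is live
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> ll = new LinkedList<>(List.of(1, 2, 3));
        System.out.println(addWhileIterating(ll)); // prints true
    }
}
```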

There are several options:

  1. Clone your LinkedList inside of workerThread and iterate the copy instead.
  2. Use synchronization both for list modification and for list iteration (but it will lead to poor concurrency).
  3. Instead of LinkedList use CopyOnWriteArrayList.
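
Option 3 can be sketched like this. An iterator of `CopyOnWriteArrayList` works on a snapshot of the backing array taken when the iterator was created, so later writes neither disturb it nor show up in it. The example is single-threaded to keep the outcome deterministic; `SnapshotIteration` and `countDuringWrites` are made-up names.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class SnapshotIteration {
    // Counts how many elements an iterator sees when writes happen after
    // the iterator was created.
    static int countDuringWrites(int initial, int extra) {
        List<Integer> list = new CopyOnWriteArrayList<>();
        for (int i = 0; i < initial; i++) list.add(i);

        Iterator<Integer> it = list.iterator();       // snapshot taken here
        for (int i = 0; i < extra; i++) list.add(-1); // each write copies the array

        int seen = 0;
        while (it.hasNext()) {
            it.next();
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(countDuringWrites(3, 5)); // prints 3
    }
}
```

Every write copies the whole array, so this choice fits lists that are read far more often than they are modified.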

Sorry for answering my own question. But I was thinking about your reassuring answers, and I found it may not be as safe as it seems. I found and tested a case where it does not work: if the object used one of its own fields to store any data (which I wouldn't know about), then it would fail. (Then the only question is whether LinkedList, and other Java classes, might do this in some implementation...) See the failing example:

public class DummyLinkedList {
    public LinkedList<Integer> ll;
    public DummyLinkedList(){
        ll = new LinkedList<Integer>();
    }
    int lastGetIndex;
    int myDummyGet(int idx){
        lastGetIndex = idx;
        //return ll.get(idx); // this would work fine, as the parameter is on the stack and so unique for each call (at least if Java supports reentrant functions)
        return ll.get(lastGetIndex); // this is a problem even when only reading the object; the question is how many such issues java.* contains
    }
}
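
One case in the standard library where a "read" does write internal state is a `LinkedHashMap` created in access-order mode: its documentation states that merely calling `get` is a structural modification. The sketch below makes that visible in a single thread through the fail-fast iterator; it is only an illustration of the concern raised here, not a claim about `LinkedList`. `HiddenWriteOnRead` and `getMutatesDuringIteration` are hypothetical names.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

public class HiddenWriteOnRead {
    // Returns true if calling get() while iterating was detected as a modification.
    static boolean getMutatesDuringIteration(Map<String, Integer> map) {
        try {
            for (String key : map.keySet()) {
                map.get(key); // looks like a pure read
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Third constructor argument 'true' selects access order.
        Map<String, Integer> accessOrdered = new LinkedHashMap<>(16, 0.75f, true);
        accessOrdered.put("a", 1);
        accessOrdered.put("b", 2);
        accessOrdered.put("c", 3);
        System.out.println(getMutatesDuringIteration(accessOrdered)); // prints true

        Map<String, Integer> insertionOrdered = new LinkedHashMap<>(accessOrdered);
        System.out.println(getMutatesDuringIteration(insertionOrdered)); // prints false
    }
}
```

So "read-only use" has to be judged against the documented behaviour of each class rather than against method names.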

If you only access the list through 'read' methods (including iteration), then you are fine. Like in your code.
