
How is Copy-On-Write different from a direct lock / synchronized write method?

Copy-On-Write is considered one of the good practices in concurrency scenarios. However, I am not clear on how it differs from a simple lock / synchronized on the write method. Could anyone help explain this?

Copy-On-Write:

    public V put(K key, V value) {
        synchronized (this) {
            // Copy the current map, modify the copy, then publish it.
            Map<K, V> newMap = new HashMap<K, V>(internalMap);
            V val = newMap.put(key, value);
            internalMap = newMap;
            return val;
        }
    }

Direct lock / synchronized:

    public V put(K key, V value) {
        synchronized (this) {
            // Return the previous value, as the method signature requires.
            return internalMap.put(key, value);
        }
    }

For write threads, the two examples behave the same: writers are mutually excluded in both.

For read threads, with Copy-On-Write, reads that happen after "internalMap = newMap" has run will get the updated value. With the direct lock, reads that happen after "internalMap.put(key, value)" has run will get the updated value. So that looks roughly the same too.

So why are we promoting Copy-On-Write? Why do we have to "copy" on write?

One benefit in this example is that the copy-on-write version gives you snapshot semantics: a map obtained through internalMap is never modified after it has been published, so a reader holding that reference sees a stable, effectively immutable view. This can be beneficial when you have many concurrent read operations traversing internalMap and only occasional updates.
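The snapshot behaviour can be illustrated with a small sketch. The class name SnapshotDemo, the snapshot() accessor and the keysSeenWhileWriting() helper are hypothetical names made up for this illustration; only the put logic follows the question's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; names are illustrative, not from the question.
class SnapshotDemo {
    // Each map assigned here is never modified again after publication.
    private volatile Map<String, Integer> internalMap = new HashMap<>();

    public synchronized Integer put(String key, Integer value) {
        Map<String, Integer> newMap = new HashMap<>(internalMap);
        Integer old = newMap.put(key, value);
        internalMap = newMap;
        return old;
    }

    // Hands out the current snapshot; later puts replace it, never change it.
    public Map<String, Integer> snapshot() {
        return internalMap;
    }

    // Iterates over one snapshot while writing to the map in the same loop.
    static List<String> keysSeenWhileWriting() {
        SnapshotDemo map = new SnapshotDemo();
        map.put("a", 1);
        map.put("b", 2);

        Map<String, Integer> snap = map.snapshot();
        List<String> seen = new ArrayList<>();
        for (String key : snap.keySet()) {
            // On a single shared HashMap, adding keys mid-iteration would
            // normally fail fast with ConcurrentModificationException.
            map.put(key + "-copy", 0);
            seen.add(key);
        }
        return seen; // only the keys that were in the snapshot
    }
}
```

The loop only ever sees the two keys that were present when the snapshot was taken, even though the map gains entries while it runs.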

A lock and copy-on-write achieve (practically) the same functionality. Neither is inherently better than the other.

In general, copy-on-write performs better when there are lots of reads but very few writes. This is because, on average, reads are cheaper than with a lock, while writes are more expensive due to the copying. When you have a lot of writes, it is usually better to use a lock.
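For comparison, here is a minimal sketch of what the lock-based variant tends to look like once reads are included. It assumes that reads also take the lock, because a plain HashMap is not safe to read while another thread may be modifying it. The class name LockedMap is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name; a sketch of the "direct lock" approach.
class LockedMap<K, V> {
    // A single shared map that is mutated in place.
    private final Map<K, V> internalMap = new HashMap<>();

    // Writes are cheap: no copying, just one locked update.
    public synchronized V put(K key, V value) {
        return internalMap.put(key, value);
    }

    // Assumption: reads lock too, so every reader contends on the same
    // monitor as the writers and as every other reader.
    public synchronized V get(K key) {
        return internalMap.get(key);
    }
}
```

Here every get pays for lock acquisition, which is the cost that copy-on-write moves onto the (rarer) writes.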

Why the writes are more expensive is probably obvious: you have to copy the whole map on every write. The reason reads are cheaper is that internalMap only needs to be declared volatile, as follows:

    volatile Map<K, V> internalMap = new HashMap<>();

Reading internalMap does not require acquiring a lock (for more details, see "Difference between volatile and synchronized in Java"). Once a thread has obtained a reference to internalMap, it can just keep working on that copy (e.g. iterating through the entries) without coordinating with other threads, because that copy is guaranteed not to be mutated. As many threads as necessary can work off a single copy (snapshot) of the map.
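Putting the volatile field and the synchronized write together, a copy-on-write map might look roughly like the sketch below. The class name CopyOnWriteMap and the get and size methods are assumptions added for illustration; the put method follows the question's code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name; a sketch, not a production implementation.
class CopyOnWriteMap<K, V> {
    // volatile: a reader always sees the most recently published map,
    // fully constructed, without taking a lock.
    private volatile Map<K, V> internalMap = Collections.emptyMap();

    // Writers are serialized by the lock and pay for a full copy.
    public synchronized V put(K key, V value) {
        Map<K, V> newMap = new HashMap<>(internalMap);
        V previous = newMap.put(key, value);
        internalMap = newMap; // publish the new snapshot
        return previous;
    }

    // Readers take no lock: one volatile read, then a plain lookup
    // on a map that will never be modified again.
    public V get(K key) {
        return internalMap.get(key);
    }

    public int size() {
        return internalMap.size();
    }
}
```

Note that get is not synchronized at all; its safety relies on the published maps never being mutated after the assignment in put.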

To explain by analogy, imagine an author is drafting an article and has a few people working as fact checkers. With a lock, only one of them can work on the draft at a time. With copy-on-write, the author posts an immutable snapshot (copy) somewhere, which the fact checkers can grab and work from. While they work, they can reread the snapshot as needed, rather than interrupting the author every time they forget part of the article.

Java's locks have improved over the years, so the difference is often small, but under extreme conditions, not having to acquire locks or coordinate between threads can result in higher throughput.
