简体   繁体   中英

What is lock-free multithreaded programming?

I have seen people/articles/SO posts who say they have designed their own "lock-free" container for multithreaded usage. Assuming they haven't used a performance-hitting modulus trick (ie each thread can only insert based upon some modulo) how can data structures be multi-threaded but also lock-free???

This question is intended towards C and C++.

The key in lock-free programming is to use hardware-intrinsic atomic operations.

As a matter of fact, even locks themselves must use those atomic operations!

But the difference between locked and lock-free programming is that a lock-free program can never be stalled entirely by any single thread. By contrast, if in a locking program one thread acquires a lock and then gets suspended indefinitely, the entire program is blocked and cannot make progress. By contrast, a lock-free program can make progress even if individual threads are suspended indefinitely.

Here's a simple example: A concurrent counter increment. We present two versions which are both "thread-safe", ie which can be called multiple times concurrently. First the locked version:

int counter = 0;
std::mutex counter_mutex;

void increment_with_lock()
{
    std::lock_guard<std::mutex> _(counter_mutex);
    ++counter;
}

Now the lock-free version:

std::atomic<int> counter(0);

void increment_lockfree()
{
    ++counter;
}

Now imagine hundreds of thread all call the increment_* function concurrently. In the locked version, no thread can make progress until the lock-holding thread unlocks the mutex. By contrast, in the lock-free version, all threads can make progress . If a thread is held up, it just won't do its share of the work, but everyone else gets to get on with their work.

It is worth noting that in general lock-free programming trades throughput and mean latency throughput for predictable latency. That is, a lock-free program will usually get less done than a corresponding locking program if there is not too much contention (since atomic operations are slow and affect a lot of the rest of the system), but it guarantees to never produce unpredictably large latencies.

For locks, the idea is that you acquire a lock and then do your work knowing that nobody else can interfere, then release the lock.

For "lock-free", the idea is that you do your work somewhere else and then attempt to atomically commit this work to "visible state", and retry if you fail.

The problems with "lock-free" are that:

  • it's hard to design a lock-free algorithm for something that isn't trivial. This is because there's only so many ways to do the "atomically commit" part (often relying on an atomic "compare and swap" that replaces a pointer with a different pointer).
  • if there's contention, it performs worse than locks because you're repeatedly doing work that gets discarded/retried
  • it's virtually impossible to design a lock-free algorithm that is both correct and "fair". This means that (under contention) some tasks can be lucky (and repeatedly commit their work and make progress) and some can be very unlucky (and repeatedly fail and retry).

The combination of these things mean that it's only good for relatively simple things under low contention.

Researchers have designed things like lock-free linked lists (and FIFO/FILO queues) and some lock-free trees. I don't think there's anything more complex than those. For how these things work, because it's hard it's complicated. The most sane approach would be to determine what type of data structure you're interested in, then search the web for relevant research into lock-free algorithms for that data structure.

Also note that there is something called "block free", which is like lock-free except that you know you can always commit the work and never need to retry. It's even harder to design a block-free algorithm, but contention doesn't matter so the other 2 problems with lock-free disappear. Note: the "concurrent counter" example in Kerrek SB's answer is not lock free at all, but is actually block free.

The idea of "lock free" is not really not having any lock, the idea is to minimize the number of locks and/or critical sections, by using some techniques that allow us not to use locks for most operations.

It can be achieved using optimistic design or transactional memory , where you do not lock the data for all operations, but only on some certain points (when doing the transaction in transactional memory, or when you need to roll-back in optimistic design).

Other alternatives are based on atomic implementations of some commands, such as CAS (Compare And Swap), that even allows us to solve the consensus problem given an implementation of it. By doing swap on references (and no thread is working on the common data), the CAS mechanism allows us to easily implement a lock-free optimistic design (swapping to the new data if and only if no one have changed it already, and this is done atomically).

However, to implement the underlying mechanism to one of these - some locking will most likely be used, but the amount of time the data will be locked is (supposed) to be kept to minimum, if these techniques are used correctly.

The new C and C++ standards (C11 and C++11) introduced threads, and thread shared atomic data types and operations. An atomic operation gives guarantees for operations that run into a race between two threads. Once a thread returns from such an operation, it can be sure that the operation has gone through in its entirety.

Typical processor support for such atomic operations exists on modern processors for compare and swap (CAS) or atomic increments.

Additionally to being atomic, data type can have the "lock-free" property. This should perhaps have been coined "stateless", since this property implies that an operation on such a type will never leave the object in an intermediate state, even when it is interrupted by an interrupt handler or a read of another thread falls in the middle of an update.

Several atomic types may (or may not) be lock-free, there are macros to test for that property. There is always one type that is guaranteed to be lock free, namely atomic_flag .

you have the interlocked library in windows http://msdn.microsoft.com/en-us/library/windows/desktop/ms683590(v=vs.85).aspx this considered lock free.

* il just quote from msdn as im no big expert: but basicly if hardware premits it will provide you an atomic operation that that handles this "lockfree" otherwise it will provide a lock around it use memory barries technique *

here are several variations on _InterlockedExchange that vary based on the data types they involve and whether processor-specific acquire or release semantics is used. While the _InterlockedExchange function operates on 32-bit integer values, _InterlockedExchange64 operates on 64-bit integer values. The _InterlockedExchange_acq and _InterlockedExchange64_acq intrinsic functions are the same as the corresponding functions without the _acq suffix except that the operation is performed with acquire semantics, which is useful when entering a critical section. There is no version of this function that uses release semantics. In Visual C++ 2005, these functions behave as read-write memory barriers. For more information, see _ReadWriteBarrier. These routines are only available as intrinsics.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM