How do you get around the ABA problem when using mwaitx?

Question

AMD's mwaitx instruction allows you to wait for an address to change, but it has a limited duration. There's no way to tell if it woke up because the value changed or because of an interrupt.

You can always inspect the address to see if it's changed, but this leads to the ABA problem where it could've changed, then changed back.

This causes a problem in cases like the following: you want to send an update request for a data block whenever its associated lock was accessed. If the lock is acquired, used, and then released, the data structure has changed but the value of the lock doesn't APPEAR to have changed, so the thread using mwaitx isn't aware that the lock was accessed.
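
To make the scenario concrete, here is a minimal single-threaded sketch in C11. The names `lock_word`, `changed_since` and `acquire_use_release` are made up for illustration, and the second thread is only simulated by a function call. It shows that a value-only comparison cannot see an acquire/release cycle:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
static atomic_int lock_word = 0;

/* Value-only check: does the lock word differ from an earlier snapshot? */
static bool changed_since(int snapshot)
{
    return atomic_load(&lock_word) != snapshot;
}

/* Simulates another thread taking and releasing the lock (A -> B -> A). */
static void acquire_use_release(void)
{
    atomic_store(&lock_word, 1);  /* acquire */
    /* ... the protected data block would be modified here ... */
    atomic_store(&lock_word, 0);  /* release */
}
```

If a snapshot of `lock_word` is taken, `acquire_use_release()` runs, and `changed_since(snapshot)` is then called, it returns false even though the lock was held in between.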

Is there any workaround for this or am I stuck?

Answer 1

The idea is to use mwaitx as an optimisation, not as your sole synchronisation primitive. For this use case, it doesn't matter if it could sporadically fail.

Say, for example, you want to acquire a spin lock

        mutex   dd 0

by setting it to 1 if it held 0 before. The simple way to do this is to busy-wait for the lock to become zero, e.g. like this:

again:  mov     ebx, 1
        xchg    [mutex], ebx  ; try to claim mutex
        test    ebx, ebx      ; did we succeed?
        jnz     again         ; if not, try again

This loop is of course quite inefficient: if the lock is held by another thread, it spins very quickly, causing a lot of expensive RMW bus accesses. We can reduce the load using a pause instruction, but it would be even better if, once we know the lock is held, we only tried to claim it again after the other thread has released it.
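
As a rough illustration of the pause variant, here is a sketch in C11 rather than assembly. It assumes GCC or Clang (for `__builtin_ia32_pause`), and the names `cpu_relax`, `spin_trylock`, `spin_lock` and `spin_unlock` are my own, not from any particular library. While the lock is held, the inner loop only reads the lock word, and the expensive exchange is retried only once the lock looks free:

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#define cpu_relax() __builtin_ia32_pause()
#else
#define cpu_relax() ((void)0)
#endif

/* One attempt to claim the lock: true if it held 0 and is now ours. */
static bool spin_trylock(atomic_int *mutex)
{
    return atomic_exchange_explicit(mutex, 1, memory_order_acquire) == 0;
}

static void spin_lock(atomic_int *mutex)
{
    while (!spin_trylock(mutex)) {
        /* Wait with plain loads until the lock looks free, then retry. */
        while (atomic_load_explicit(mutex, memory_order_relaxed) != 0)
            cpu_relax();
    }
}

static void spin_unlock(atomic_int *mutex)
{
    atomic_store_explicit(mutex, 0, memory_order_release);
}
```

This still burns CPU time while waiting. It only avoids hammering the cache line with read-modify-write operations.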

The monitor / mwait instructions provide a facility to do this: you set up an address to be monitored and then get notified when something interesting happens. Only then do you try to claim the lock, saving you from spinning if you know that you won't get it. It doesn't hurt if mwait returns early: you'll just fail to claim the lock and then go right back to waiting for it.

again:  mov     ebx, 1
        xchg    [mutex], ebx  ; try to claim mutex
        test    ebx, ebx      ; did we succeed?
        jz      gotit

        lea     rax, [mutex]  ; address to monitor
        xor     ecx, ecx      ; no extensions
        xor     edx, edx      ; no hints
        monitor               ; start to monitor the mutex
        cmp     dword [mutex], 0  ; released before the monitor was armed?
        je      again         ; then retry right away instead of waiting
        xor     ecx, ecx      ; no extensions
        xor     eax, eax      ; no hints
        mwait                 ; wait for mutex to change
        jmp     again         ; once it changed, try to get the lock again

gotit:  ...

While this is good and all, there's a new problem: if the other thread doesn't release the lock for a while, sophisticated mutex implementations might want to switch to a different implementation, e.g. one where the kernel takes care of the lock. This permits waiting for the lock to become available without the thread having to actually run, freeing up resources for other users.

This is hard to achieve with monitor and mwait: the waiting period is indefinite and could be very long. AMD's mwaitx and monitorx instructions are very similar but fix this problem: they permit you to set a timeout after which mwaitx returns even if the memory region did not change. This way, you can use an algorithm like "try to claim the lock 10 times by spinning and waiting, then escalate to a kernel-based lock" and be reasonably sure of the time frame it takes to execute.
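
The shape of such an algorithm could look roughly like the following C11 sketch. It is not a real mwaitx wrapper: `wait_for_release` is a hypothetical stand-in that polls a bounded number of times in place of monitorx / mwaitx with a timeout, `lock_bounded` and `cpu_relax` are likewise names I made up, and `__builtin_ia32_pause` assumes GCC or Clang. The kernel-based fallback is left to the caller:

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#define cpu_relax() __builtin_ia32_pause()
#else
#define cpu_relax() ((void)0)
#endif

/* Hypothetical stand-in for monitorx/mwaitx: returns when *mutex reads 0
 * or after max_polls polls (the "timeout"), whichever comes first. */
static void wait_for_release(atomic_int *mutex, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (atomic_load_explicit(mutex, memory_order_relaxed) == 0)
            return;
        cpu_relax();
    }
}

/* Try to claim the lock up to `attempts` times.
 * Returns false if the caller should escalate to a kernel-based lock. */
static bool lock_bounded(atomic_int *mutex, int attempts, unsigned max_polls)
{
    for (int i = 0; i < attempts; i++) {
        if (atomic_exchange_explicit(mutex, 1, memory_order_acquire) == 0)
            return true;                    /* got the lock */
        wait_for_release(mutex, max_polls); /* may return early or time out */
    }
    return false;
}
```

Because the exchange is always retried after the wait returns, it does not matter whether the wait ended because of a change, a timeout, or something else. The total time spent is bounded by roughly `attempts` times the timeout.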

solution1
3 ACCEPTED 2023-01-11 02:22:10
