Why doesn't my ARM LDREX/STREX C function work?

Question

I wrote a claim_lock function in C, following the "Barrier Litmus Tests and Cookbook" document. I examined the generated code and it all looks correct, but the lock doesn't work.

// This code conforms to the section 7.2 of PRD03-GENC-007826:
// "Acquiring and Releasing a Lock"
static inline void claim_lock( uint32_t volatile *lock )
{
  uint32_t failed = 1;
  uint32_t value;

  while (failed) {
    asm volatile ( "ldrex %[value], [%[lock]]"
                   : [value] "=&r" (value)
                   : [lock] "r" (lock) );
    if (value == 0) {
      // The failed and lock registers are not allowed to be the same, so
      // pretend to gcc that the lock pointer may be written as well as read.

      asm volatile ( "strex %[failed], %[value], [%[lock]]"
                     : [failed] "=&r" (failed)
                     , [lock] "+r" (lock)
                     : [value] "r" (1) );
    }
    else {
      asm ( "clrex" );
    }
  }
  asm ( "dmb sy" );
}

Generated code (gcc):

1000:       e3a03001        mov     r3, #1
1004:       e1902f9f        ldrex   r2, [r0]
1008:       e3520000        cmp     r2, #0
100c:       1a000004        bne     1024 <claim_lock+0x24>
1010:       e1802f93        strex   r2, r3, [r0]
1014:       e3520000        cmp     r2, #0
1018:       1afffff9        bne     1004 <claim_lock+0x4>
101c:       f57ff05f        dmb     sy
1020:       e12fff1e        bx      lr
1024:       f57ff01f        clrex
1028:       eafffff5        b       1004 <claim_lock+0x4>

Corresponding release function:

static inline void release_lock( uint32_t volatile *lock )
{
  // Ensure that any changes made while holding the lock are
  // visible before the lock is seen to have been released
  asm ( "dmb sy" );
  *lock = 0;
}

It worked in QEMU, but either hung, or allowed all cores to "claim" the so-called "lock" on real hardware (Raspberry Pi 3 Cortex-A53).

Answer 1

This is what I found in the "Context switch" section of the ARMv7-M Architecture Reference Manual:

It is necessary to ensure that the local monitor is in the Open Access state after a context switch. In ARMv7-M, the local monitor is changed to Open Access automatically as part of an exception entry or exit sequence. The local monitor can also be forced to the Open Access state by a CLREX instruction.

Note: Context switching is not an application level operation. However, this information is included here to complete the description of the exclusive operations.

A context switch might cause a subsequent Store-Exclusive to fail, requiring a load … store sequence to be replayed. To minimize the possibility of this happening, ARM recommends that the Store-Exclusive instruction is kept as close as possible to the associated Load-Exclusive instruction, see Load-Exclusive and Store-Exclusive usage restrictions.
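
One way to keep the Load-Exclusive and Store-Exclusive close together is to let the compiler generate the whole sequence instead of splitting it across separate asm statements with C code in between. The sketch below assumes GCC or Clang and uses their __atomic builtins; on ARMv7-A and AArch32 targets these are typically lowered to an LDREX/STREX loop with the required barriers. The names claim_lock_atomic and release_lock_atomic are hypothetical, not from the question, and this does not replace any hardware setup the exclusive monitors need.

```c
#include <stdint.h>

// Hypothetical names; a sketch of the question's lock using compiler builtins.
// Convention taken from the question: 0 means free, 1 means held.
static inline void claim_lock_atomic( uint32_t volatile *lock )
{
  uint32_t expected;

  do {
    // On failure the builtin overwrites expected with the value it saw,
    // so reset it before every attempt.
    expected = 0;
  } while (!__atomic_compare_exchange_n( lock, &expected, 1u,
                                         0 /* strong compare-exchange */,
                                         __ATOMIC_ACQUIRE,
                                         __ATOMIC_RELAXED ));
}

static inline void release_lock_atomic( uint32_t volatile *lock )
{
  // Release ordering: writes made while holding the lock become visible
  // before the lock is seen to be free.
  __atomic_store_n( lock, 0u, __ATOMIC_RELEASE );
}
```

Usage mirrors the original functions: call claim_lock_atomic( &lock ) before the critical section and release_lock_atomic( &lock ) after it.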

Answer 2

The LDREX instruction will hang the core (unless my test failed to report an exception) if:

The MMU is not enabled
The virtual memory area containing the lock is not cached

The cores will appear to ignore each other's claims if:

Symmetric Multi-processing has not been enabled

The SMP enable mechanism seems to vary from device to device. Check the TRM for the particular core; it's outside the scope of the ARM ARM.

For the Cortex-A53, the bit to set is SMPEN, bit 6 of the CPU Extended Control Register, CPUECTLR.

Earlier devices use bit 5 of the Auxiliary Control Register instead (the ARM11 MPCore, for example), where there's also the SCU to consider. I don't have such a device, but that documentation is where I first noticed an SMP/nAMP bit.
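
Setting SMPEN on the Cortex-A53 in AArch32 state might look like the sketch below. It assumes the code runs at a privilege level that is allowed to write CPUECTLR and that it is executed on every core early in start-up; check the Cortex-A53 TRM for the exact access conditions and ordering requirements. The helper names cpuectlr_with_smpen and enable_smp are hypothetical. The register access is only compiled for 32-bit ARM targets, so on other targets enable_smp does nothing.

```c
#include <stdint.h>

// SMPEN is bit 6 of CPUECTLR on the Cortex-A53.
#define CPUECTLR_SMPEN (UINT64_C(1) << 6)

// Pure helper (hypothetical name): returns the register value with SMPEN
// set and all other bits unchanged.
static inline uint64_t cpuectlr_with_smpen( uint64_t cpuectlr )
{
  return cpuectlr | CPUECTLR_SMPEN;
}

// Hypothetical helper: read-modify-write CPUECTLR on the current core.
static inline void enable_smp( void )
{
#if defined(__arm__)
  uint32_t lo, hi;

  // CPUECTLR is 64 bits wide; in AArch32 state it is accessed as a
  // register pair with MRRC/MCRR p15, 1, <Rt>, <Rt2>, c15.
  asm volatile ( "mrrc p15, 1, %0, %1, c15" : "=r" (lo), "=r" (hi) );

  uint64_t value = cpuectlr_with_smpen( ((uint64_t) hi << 32) | lo );
  lo = (uint32_t) value;
  hi = (uint32_t) (value >> 32);

  // Write the value back, then synchronize so the change takes effect
  // before any following instructions.
  asm volatile ( "mcrr p15, 1, %0, %1, c15\n\t"
                 "isb"
                 :
                 : "r" (lo), "r" (hi)
                 : "memory" );
#endif
}
```

The read-modify-write keeps the other CPUECTLR fields at whatever the firmware left them. In AArch64 state the same register is named CPUECTLR_EL1 and is accessed with MRS/MSR instead.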

solution1
0 2022-05-15 04:29:28

solution2
-1 ACCEPTED 2021-07-05 23:01:25
