
What makes the difference between direct and indirect access to volatile objects in C?

I am dealing with hardware registers on an STM32 controller.

I have defined a bunch of structures like the following:

#define PACKED __attribute__ ((packed))
#define ASSERT(cond) _Static_assert(cond, #cond)

typedef union {
    struct PACKED {
        uint16_t value;
        uint16_t reserved;
    };
    uint32_t bits;
} LOL_LPTIM_16_BIT_VALUE;

ASSERT(sizeof(LOL_LPTIM_16_BIT_VALUE) == sizeof(uint32_t)); // OK

Then, I have a structure like this:

typedef struct PACKED {
    // ...
    volatile LOL_LPTIM_16_BIT_VALUE autoreload;
    // ...
} LOL_LPTIM;

The offset of the autoreload field in the structure agrees with the documentation. I also have the following objects available (following the documentation and header files provided by ST):

#define LOL_LPTIM1_BASE (LOL_APB1PERIPH_BASE + 0x7C00UL)
#define LOL_LPTIM2_BASE (LOL_APB1PERIPH_BASE + 0x9400UL)

#define LOL_LPTIM1 ((volatile LOL_LPTIM *) LOL_LPTIM1_BASE)
#define LOL_LPTIM2 ((volatile LOL_LPTIM *) LOL_LPTIM2_BASE)

I have a static const structure that stores these pointers:

static const struct {
    volatile LOL_LPTIM *lptim;
} timer[2] = {
    { .lptim = LOL_LPTIM1 },
    { .lptim = LOL_LPTIM2 }
};

Now, when I write

*(uint32_t *) &(timer[0].lptim->autoreload.bits) = 0xffff;

or

*(uint16_t *) &(timer[0].lptim->autoreload.value) = 0xffff;

the code works correctly, but when I write

timer[0].lptim->autoreload.bits = 0xffff;

(which should be exactly equivalent) or

timer[0].lptim->autoreload.value = 0xffff;

then it does not work as expected: it behaves differently from the variant with the cast, and the value of the autoreload register does not seem to be set properly (the peripherals behave differently).

What could be the reason for this discrepancy?

Godbolt shows that the compiler generates a very different set of instructions for these two cases: https://godbolt.org/z/cno9yf

They get more similar when the indirect version is changed to

*(volatile uint32_t *) &(timer[0].lptim->autoreload.bits) = 0xffff;

(there are many more instructions in the output)

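The problem is that the structures are packed when they do not need to be. A packed type has an alignment of 1, so the compiler can no longer assume that the register is word-aligned and may split a volatile access into several byte accesses, which a peripheral register will not necessarily accept. You also overuse volatile.

Be very careful when packing structures and unions, as it prevents many code optimisations. Do not pack them "just in case".

Here is a corrected version: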

https://godbolt.org/z/GNjmUX
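As a rough illustration of the same idea (a sketch, not the code behind the link above), the layout can be declared without the packed attribute and checked at compile time. LOL_LPTIM_EXAMPLE, its placeholder field and lol_set_autoreload are hypothetical names introduced only for this example; the real LPTIM register block has more fields.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Same union as in the question, but without the packed attribute. */
typedef union {
    struct {
        uint16_t value;
        uint16_t reserved;
    };
    uint32_t bits;
} LOL_LPTIM_16_BIT_VALUE;

/* Hypothetical two-register excerpt of the peripheral layout. */
typedef struct {
    volatile uint32_t placeholder;               /* stand-in for earlier registers */
    volatile LOL_LPTIM_16_BIT_VALUE autoreload;
} LOL_LPTIM_EXAMPLE;

/* Without packing, size and offset still match, and the alignment stays at
 * that of uint32_t, so the compiler can use a single word access. */
static_assert(sizeof(LOL_LPTIM_16_BIT_VALUE) == sizeof(uint32_t), "size");
static_assert(_Alignof(LOL_LPTIM_16_BIT_VALUE) == _Alignof(uint32_t), "alignment");
static_assert(offsetof(LOL_LPTIM_EXAMPLE, autoreload) == 4, "offset");

/* Direct member access; the store is volatile because the field is. */
static void lol_set_autoreload(LOL_LPTIM_EXAMPLE *t, uint32_t v)
{
    t->autoreload.bits = v;
}
```

If the static assertions hold for the full structure, the packed attribute adds nothing to the layout and only lowers the alignment the compiler may assume.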

When you do:

*(uint32_t *) &(timer[0].lptim->autoreload.bits) = 0xffff;

you cast away the volatile qualifier on the address, so this becomes a non-volatile write to the field (which might be optimized away if there are later, similar writes to the same address). With

timer[0].lptim->autoreload.bits = 0xffff;

the access is volatile, so it cannot be optimized in that way.
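The two kinds of store can be put side by side in a small sketch. REG_EXAMPLE and both function names are hypothetical and exist only for this illustration; the struct member is deliberately left unqualified, so that only the pointer carries volatile and the cast version stays well-defined when run against ordinary memory.

```c
#include <stdint.h>

/* Hypothetical one-register layout. */
typedef struct {
    uint32_t bits;
} REG_EXAMPLE;

/* The cast drops volatile: this is an ordinary store, which the optimizer
 * is free to merge with, or drop in favour of, a later store. */
static void write_cast_away(volatile REG_EXAMPLE *r, uint32_t v)
{
    *(uint32_t *)&r->bits = v;
}

/* volatile is preserved: every call has to perform the store. */
static void write_volatile(volatile REG_EXAMPLE *r, uint32_t v)
{
    r->bits = v;
}
```

On ordinary memory both functions leave the same value behind; the difference only shows in the generated code, where the compiler must keep every store made by write_volatile but may eliminate redundant stores made by write_cast_away.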
