
Avoid volatile bit-field assignment expression reading or writing memory several times

I want to use a volatile bit-field struct to set a hardware register, as in the following code:

#include <stdint.h>

union foo {
    uint32_t value;
    struct {
        uint32_t x : 1;
        uint32_t y : 3;
        uint32_t z : 28;
    };
};
union foo f = {0};
int main()
{
    volatile union foo *f_ptr = &f;
    //union foo tmp;
    *f_ptr =  (union foo) {
        .x = 1,
        .y = 7,
        .z = 10,
    };
    //*f_ptr = tmp;
    return 0;
}

However, the compiler emits several LDR and STR instructions to the hardware register. This is a serious problem, because each write to the register immediately triggers the hardware to do work.

main:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    movw    r3, #:lower16:.LANCHOR0
    movs    r0, #0
    movt    r3, #:upper16:.LANCHOR0
    ldr r2, [r3]
    orr r2, r2, #1
    str r2, [r3]
    ldr r2, [r3]
    orr r2, r2, #14
    str r2, [r3]
    ldr r2, [r3]
    and r2, r2, #15
    orr r2, r2, #160
    str r2, [r3]
    bx  lr
    .size   main, .-main
    .global f
    .bss
    .align  2

My GCC version is arm-linux-gnueabi-gcc (Linaro GCC 4.9-2017.01) 4.9.4, and I build with -O2 optimization.


I have tried using a local variable to work around this problem:

#include <stdint.h>

union foo {
    uint32_t value;
    struct {
        uint32_t x : 1;
        uint32_t y : 3;
        uint32_t z : 28;
    };
};
union foo f = {0};
int main()
{
    volatile union foo *f_ptr = &f;
    union foo tmp;
    tmp =  (union foo) {
        .x = 1,
        .y = 7,
        .z = 10,
    };
    *f_ptr = tmp;
    return 0;
}

With that change, the compiler no longer stores to the hardware register several times:

main:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    movs    r1, #10
    movs    r2, #15
    movw    r3, #:lower16:.LANCHOR0
    bfi r2, r1, #4, #28
    movt    r3, #:upper16:.LANCHOR0
    movs    r0, #0
    str r2, [r3]
    bx  lr
    .size   main, .-main
    .global f
    .bss
    .align  2

I think using a local variable is still not a good idea, considering the limits on binary size in an embedded system.

Is there any way to handle this problem without using a local variable?


I think this is a bug in GCC. Per the discussion below, you might consider using:

f_ptr->value =  (union foo) {
        .x = 1,
        .y = 7,
        .z = 10,
    } .value;

By the C standard, the code a compiler generates for a program may not access a volatile object when the original C code nominally does not access the object. The code *f_ptr = (union foo) {.x = 1, .y = 7, .z = 10, }; is a single assignment to *f_ptr. So we would expect this to generate a single store to *f_ptr; generating more than one store is a violation of the standard's requirements.

We could consider an explanation for this to be that GCC is treating the aggregate (the union and/or the structure within it) as several objects, each individually volatile, rather than one aggregated volatile object. [1] But, if this were so, then it ought to generate separate 16-bit strh instructions for the parts (per the original example code, which had 16-bit parts), not the 32-bit str instructions we see.

While using a local variable appears to work around the issue, I would not rely on that. The assignment of the compound literal above is semantically equivalent, so it is unclear why GCC generates broken assembly code for one sequence of code and not the other. Under different circumstances (such as additional or modified code in the function, or other variations that might affect optimization), GCC might generate broken code with the local variable too.

What I would do is avoid using an aggregate for the volatile object. The hardware register is, presumably, physically more like a 32-bit unsigned integer than like a structure of bit-fields (even though semantically it is defined with bit-fields). So I would define the register as volatile uint32_t and use that type when assigning values to it. Those values could be prepared with bit shifts or structures with bit-fields or whatever other method you prefer.
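As a minimal sketch of the bit-shift approach, the value can be built in an ordinary temporary and then written with one 32-bit assignment. The names foo_pack, foo_write and the FOO_* macros are hypothetical, not part of the question's code, and the field positions (x in bit 0, y in bits 1-3, z in bits 4-31) are an assumption read off the assembly in the question.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout: x = bit 0, y = bits 1-3, z = bits 4-31. */
#define FOO_X_SHIFT 0u
#define FOO_X_MASK  0x1u
#define FOO_Y_SHIFT 1u
#define FOO_Y_MASK  0x7u
#define FOO_Z_SHIFT 4u
#define FOO_Z_MASK  0x0FFFFFFFu

/* Build the complete register value in a non-volatile temporary. */
static inline uint32_t foo_pack(uint32_t x, uint32_t y, uint32_t z)
{
    /* Reject values that do not fit in their fields. */
    assert(x <= FOO_X_MASK && y <= FOO_Y_MASK && z <= FOO_Z_MASK);
    return (x << FOO_X_SHIFT) | (y << FOO_Y_SHIFT) | (z << FOO_Z_SHIFT);
}

/* The only access to the volatile object is this one 32-bit assignment. */
static inline void foo_write(volatile uint32_t *reg,
                             uint32_t x, uint32_t y, uint32_t z)
{
    *reg = foo_pack(x, y, z);
}
```

Calling foo_write(reg, 1, 7, 10) stores 175, the same constant that appears in the single-store assembly further down. Because the volatile object is a plain integer here, the compiler has no aggregate to split into separate accesses.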

It should not be necessary to avoid using local variables, as the optimizer should effectively eliminate them. However, if you wish to neither change the register definition nor use local variables, an alternative is the code I opened with:

f_ptr->value =  (union foo) {
        .x = 1,
        .y = 7,
        .z = 10,
    } .value;

That prepares the value to be stored but then assigns it using the uint32_t member of the union rather than using the whole union. Testing with ARM GCC 4.6.4 on Compiler Explorer (the closest match I could find on Compiler Explorer to what you are using) suggests it generates a single store with minimal code:

main:
        ldr     r3, .L2
        mov     r2, #175
        str     r2, [r3, #0]
        mov     r0, #0
        bx      lr
.L2:
        .word   .LANCHOR0
.LANCHOR0 = . + 0
f:
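
If the register is written from several places, the same idea could be wrapped in a small helper. This is only a sketch: foo_store is a hypothetical name, and it relies on the same anonymous-struct union as the question (standard since C11, a GNU extension before that).

```c
#include <assert.h>
#include <stdint.h>

union foo {
    uint32_t value;
    struct {
        uint32_t x : 1;
        uint32_t y : 3;
        uint32_t z : 28;
    };
};

/* Hypothetical helper: fill the bit-fields in a non-volatile compound
 * literal, then perform exactly one volatile access, through the
 * uint32_t member rather than the whole union. */
static inline void foo_store(volatile union foo *reg,
                             uint32_t x, uint32_t y, uint32_t z)
{
    /* Reject values that do not fit in their fields. */
    assert(x <= 1u && y <= 7u && z <= 0x0FFFFFFFu);
    reg->value = (union foo){ .x = x, .y = y, .z = z }.value;
}
```

With this, foo_store(f_ptr, 1, 7, 10) expresses the same assignment as above. The bit-field layout inside value remains implementation-defined, so the helper should be checked against the register description for each compiler and target.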

Footnote

[1] I would consider this to be a bug too, as the C standard does not make provision for applying the volatile qualifier on a union or structure declaration as applying to the members rather than to the whole aggregate. For arrays, it does say that qualifiers apply to the elements, rather than the whole array (C 2018 6.7.3 10). It has no such wording for unions or structures.

You can force the aggregate union to be written in one go with:

f_ptr->value = (union foo) {
    .x = 10,
    .y = 20,
}.value;

// produced asm
mov     r1, #10
orr     r1, r1, #1310720
str     r1, [r0]
bx      lr

There seems to be no need for bit-fields in your program: using uint16_t types should make it simpler and generate better code:

#include <stdint.h>

union foo {
    uint32_t value;
    struct {
        uint16_t x;
        uint16_t y;
    };
};
union foo f = { 0 };

int main() {
    volatile union foo *f_ptr = &f;
    *f_ptr = (union foo) {
        .x = 10,
        .y = 20,
    };
    return 0;
}

Code generated by arm gcc 4.6.4 linux, as produced by Godbolt Compiler Explorer:

main:
        ldr     r3, .L2
        mov     r0, #0
        mov     r2, #10
        str     r0, [r3, #0]
        strh    r2, [r3, #0]    @ movhi
        mov     r2, #20
        strh    r2, [r3, #2]    @ movhi
        bx      lr
.L2:
        .word   .LANCHOR0
.LANCHOR0 = . + 0
f:
