'Bus Error' on ARMv6 when working with doubles

I'm creating a C++ program for ARMv6 which crashes with a bus error. Using GDB I have traced the problem to the following code:

double d = *(double*)pData; pData += sizeof(int64_t);  // char *pData

The program goes through a received message and has to extract some double values using the above code. The received message has several fields, some doubles, some not.

On x86 architectures this works fine, but on ARM I get the 'bus error'. So, I suspect my problem is alignment of data: the double fields have to be aligned to word boundaries in memory on the ARM architecture.

I have tried the following as a fix, which did not work (still got the error):

int64_t i = *(int64_t*)pData;
double d = *((double*)&i);

The following worked (so far):

double d = 0;
memcpy(&d, pData, sizeof(double));

Is using 'memcpy' the best approach? Or is there a better way?

In my case I do not have control over the packing of the data in the buffer or the order of the fields in the message.

Related question: std::atomic<double> on Armv7 (RPi2) and alignment/bus errors

Is using 'memcpy' the best approach?

In general it's the only correct approach, unless you're targeting a single ABI in which no type requires greater than 1-byte alignment.

The C++ standard is rather verbose, so I'll quote the C standard expressing the same thing much more succinctly:

A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined.

There it is: that ever-present spectre of undefined behaviour. Even an x86 compiler is perfectly well allowed to break into your house and rub jam into your hair while you sleep instead of loading that data the way you expect, if its ABI says so.
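To make "correctly aligned" concrete, here is a minimal diagnostic sketch. It is my own illustration rather than anything from the question: `is_aligned_for` is a hypothetical helper name, and it assumes a flat address space in which converting a pointer to `std::uintptr_t` yields the numeric address (that conversion is implementation-defined, though it holds on the usual ARM and x86 targets).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true if p satisfies the alignment
// requirement of T on the current ABI.
// Assumption: the pointer-to-integer conversion gives the plain address.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

Calling `is_aligned_for<double>(pData)` just before the cast would show which fields in the message land on a misaligned address. It is only a debugging aid; the cast remains undefined behaviour whenever the check fails.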

One thing to note, though, is that modern compilers tend to be clever enough that correctness doesn't necessarily come at the cost of performance. Let's flesh out that example code:

#include <string.h>

double func(char *data) {
    double d;
    memcpy(&d, data, sizeof d);
    return d;
}

...and throw it at a compiler:

$ clang -target arm -march=armv6 -mfpu=vfpv3 -mfloat-abi=hard -O1 -S test.c
...
func:                                   @ @func
        .fnstart
@ BB#0:
        push    {r4, r5, r11, lr}
        sub     sp, sp, #8
        mov     r2, r0
        ldrb    r1, [r0, #3]
        ldrb    r3, [r0, #2]
        ldrb    r12, [r0]
        ldrb    lr, [r0, #1]
        ldrb    r4, [r2, #4]!
        orr     r5, r3, r1, lsl #8
        ldrb    r3, [r2, #2]
        ldrb    r2, [r2, #3]
        ldrb    r0, [r0, #5]
        orr     r1, r12, lr, lsl #8
        orr     r2, r3, r2, lsl #8
        orr     r0, r4, r0, lsl #8
        orr     r1, r1, r5, lsl #16
        orr     r0, r0, r2, lsl #16
        str     r1, [sp]
        str     r0, [sp, #4]
        vpop    {d0}
        pop     {r4, r5, r11, pc}

OK, so it's playing things safe with a bytewise memcpy; at least it's inlined. But hey, ARMv6 does at least support unaligned word and halfword accesses if the CPU is configured appropriately - let's tell the compiler we're cool with that:

$ clang -target arm -march=armv6 -mfpu=vfpv3 -mfloat-abi=hard -O1 -S -munaligned-access test.c
...
func:                                   @ @func
        .fnstart
@ BB#0:
        sub     sp, sp, #8
        ldr     r1, [r0]
        ldr     r0, [r0, #4]
        str     r0, [sp, #4]
        str     r1, [sp]
        vpop    {d0}
        bx      lr

There we go, that's about the best you can do with just integer word loads. Now, what if we compile it for something a bit newer?

$ clang -target arm -march=armv7 -mfpu=neon-vfpv4 -mfloat-abi=hard -O1 -S test.c
...
func:                                   @ @func
        .fnstart
@ BB#0:
        vld1.8  {d0}, [r0]
        bx      lr

I can guarantee that, even on a machine where it would "work", no undefined-behaviour-hackery would correctly load that unaligned double in fewer than one instruction. Note that NEON is the key player here - vld1 only requires the base address to be aligned to the element size, so for 8-bit elements it can never be unaligned. In the more general case (say, if it were a long long instead of a double) you might still need -munaligned-access to convince the compiler as before.

For comparison, let's just see how everyone's favourite mutant-grandchild-of-a-1970s-calculator-chip fares as well:

$ clang -O1 -S test.c
...
func:                                   # @func
# BB#0:
        movl    4(%esp), %eax
        fldl    (%eax)
        retl

Yup, the correct code still also looks like the best code.
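Applied to the situation in the question (a received buffer with mixed fields whose packing you don't control), the memcpy approach could be wrapped up roughly like this. This is a sketch under assumptions, not the asker's actual code: `FieldReader` and its members are hypothetical names, and it assumes the sender uses the same byte order and floating-point representation as the receiver, just as the original cast did.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: walks a raw message buffer and copies each field
// out with memcpy, so no misaligned pointer is ever dereferenced.
class FieldReader {
public:
    FieldReader(const char* data, std::size_t size)
        : cur_(data), left_(size) {}

    // Copies sizeof(T) bytes into a properly aligned local, then advances
    // past the field (the equivalent of "pData += sizeof(...)").
    template <typename T>
    T read() {
        static_assert(std::is_trivially_copyable<T>::value,
                      "memcpy is only valid for trivially copyable types");
        assert(left_ >= sizeof(T) && "read past end of message");
        T value;
        std::memcpy(&value, cur_, sizeof(T));
        cur_ += sizeof(T);
        left_ -= sizeof(T);
        return value;
    }

    // Number of bytes not yet consumed.
    std::size_t remaining() const { return left_; }

private:
    const char* cur_;
    std::size_t left_;
};
```

With a reader constructed as `FieldReader r(pData, length);`, each field becomes a call such as `double d = r.read<double>();`, in whatever order the message format dictates. Because every read is a fixed-size memcpy, the compiler is free to lower it to the same loads shown in the listings above.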
