
How to safely perform type-punning in an embedded system

Our team is porting code from an old architecture to a new product based on the ARM Cortex-M3 platform, using a customized version of GCC 4.5.1. We are reading data from a communications link and attempting to cast the raw byte array to a struct to cleanly parse the data. After casting the pointer to a struct and dereferencing it, we get a warning: "dereferencing type-punned pointer will break strict-aliasing rules".

After some research, I've realized that since the char array has no alignment requirements and the struct has to be word aligned, casting the pointers causes undefined behavior (a Bad Thing). I'm wondering if there is a better way to do what we're attempting.

I know we can explicitly word-align the char array using GCC's "__attribute__((aligned(4)))". I believe this will make our code "safer", but the warnings will still clutter up our builds, and I don't want to disable the warnings in case this situation arises again. What we want is a way to safely do what we are attempting, one that will still warn us if we try something unsafe elsewhere later. Since this is an embedded system, RAM and flash usage matter to some degree.

Portability (compiler and architecture) is not a huge concern; this is just for one product. However, if a portable solution exists, it would be preferred.

Here is a (very simplified) example of what we are currently doing:

#define MESSAGE_TYPE_A 0
#define MESSAGE_TYPE_B 1

typedef struct __attribute__((__packed__))
{
    unsigned char  messageType;
    unsigned short data1;
    unsigned int   data2;
} MessageA;

typedef struct __attribute__((__packed__))
{
    unsigned char  messageType;
    unsigned char  data3;
    unsigned char  data4;
} MessageB;


// This gets filled by the comm system, assume from a UART interrupt or similar
unsigned char data[100];


// Assume this gets called once we receive a full message
void ProcessMessage()
{
    MessageA* messageA;
    unsigned char messageType = data[0];

    if (messageType == MESSAGE_TYPE_A)
    {
        // Cast data to struct and attempt to read
        messageA = (MessageA*)data; // Not safe since data may not be word aligned
                                    // This may cause undefined behavior

        if (messageA->data1 == 4) // warning would be here, when we use the data at the pointer
        {
            // Perform some action...
        }
    }
    // ...
    // process different types of messages
}

As has already been pointed out, casting pointers about is a dodgy practice.

Solution: use a union

struct message {
  unsigned char messageType;
  union {
    struct {
      int data1;
      short data2;
    } A;
    struct {
      char data1[5];
      int data2;
    } B;
  } data;
};

void func (void) {
  struct message msg;
  getMessage (&msg);

  switch (msg.messageType) {
    case TYPEA:
      doStuff (msg.data.A.data1);
      break;
    case TYPEB:
      doOtherStuff (msg.data.B.data1);
      break;
  }
}

By this means the compiler knows you're accessing the same data via different means, and the warnings and Bad Things will go away.

Of course, you'll need to make sure the structure alignment and packing match your message format. Beware of endianness issues and the like if the machine on the other end of the link doesn't match.
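One way to apply the union idea to the receive buffer from the question is to make the raw byte array itself a union member, so the comm code fills the byte view and the parser reads the typed view. The following is only a minimal sketch: it assumes GCC (it relies on the packed attribute and on GCC's documented union type-punning) and a sender with the same byte order as the receiver. The names RxBuffer, rx and read_data1 are hypothetical; the field layout is copied from the question.

```c
#include <stdint.h>

#define MESSAGE_TYPE_A 0

/* Wire layout of message A, copied from the question (7 bytes when packed). */
typedef struct __attribute__((packed)) {
    uint8_t  messageType;
    uint16_t data1;
    uint32_t data2;
} MessageA;

/* Hypothetical receive buffer: the byte view and the typed view share storage. */
typedef union {
    uint8_t  raw[100];   /* filled by the comm system */
    MessageA a;          /* typed view of the same bytes */
} RxBuffer;

static RxBuffer rx;

/* Returns data1 if the buffer holds a type-A message, otherwise -1. */
static int read_data1(void)
{
    if (rx.raw[0] != MESSAGE_TYPE_A) {
        return -1;
    }
    return rx.a.data1;   /* access goes through the union, not a cast pointer */
}
```

Because the struct is packed, the union adds no padding and costs no RAM beyond the original 100-byte buffer.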

Type punning by casting to a pointer type other than char * (or a pointer to a signed/unsigned variant of char) is not strictly conforming, because it violates the C aliasing rules (and sometimes the alignment rules, if no care is taken).

However, gcc permits type punning through union types. The gcc manpage explicitly documents it:

The practice of reading from a different union member than the one most recently written to (called "type-punning") is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type.

To disable optimizations related to aliasing rules with gcc (and thus allow the program to break the C aliasing rules), compile the program with -fno-strict-aliasing. Note that with this option enabled, the program is no longer strictly conforming, but you said portability is not a concern. For information, the Linux kernel is compiled with this option.

GCC has a -fno-strict-aliasing flag that will disable strict-aliasing-based optimizations and make your code safe.

If you're really looking for a way to "fix" it, you have to rethink the way your code works. You can't just overlay the structure the way you're trying to, so you need to do something like this:

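MessageA messageA;
messageA.messageType = data[0];
// Watch out - endianness and `sizeof(short)` dependent!
messageA.data1 = (unsigned short)((data[1] << 8) | data[2]);
// Watch out - endianness and `sizeof(int)` dependent!
// The casts keep the shifts in unsigned arithmetic (no signed overflow).
messageA.data2 = ((unsigned int)data[3] << 24) | ((unsigned int)data[4] << 16)
               | ((unsigned int)data[5] <<  8) |  (unsigned int)data[6];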

This method will let you avoid packing your structure, which might also improve its performance characteristics elsewhere in your code. Alternatively:

MessageA messageA;
memcpy(&messageA, data, sizeof messageA);

This will work with your packed structures. You would do the reverse operations to translate the structures back into a flat buffer if necessary.
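The memcpy round trip, including the reverse operation, could be sketched like this. It assumes GCC's packed attribute and that the sender uses the same byte order as the receiver; the helper names unpack_message_a and pack_message_a are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Packed so that the in-memory layout matches the 7-byte wire layout. */
typedef struct __attribute__((packed)) {
    uint8_t  messageType;
    uint16_t data1;
    uint32_t data2;
} MessageA;

/* Copy the wire bytes into a properly typed object. */
static MessageA unpack_message_a(const uint8_t *buf)
{
    MessageA msg;
    memcpy(&msg, buf, sizeof msg);
    return msg;
}

/* Reverse operation: flatten the struct back into a byte buffer. */
static void pack_message_a(uint8_t *buf, const MessageA *msg)
{
    memcpy(buf, msg, sizeof *msg);
}
```

Since memcpy works on bytes, neither function casts the buffer to a struct pointer, so the strict-aliasing warning has nothing to fire on.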

Stop using packed structures and memcpy the individual fields into variables of the correct size and type. This is the safe, portable, clean way to do what you're trying to achieve. If you're lucky, gcc will optimize each tiny fixed-size memcpy into a few simple load and store instructions.
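A field-by-field version might look like the sketch below. The byte offsets are an assumption taken from the question's packed layout (1-byte type, 2-byte data1, 4-byte data2), the sender is assumed to share the receiver's byte order, and parse_message_a is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Ordinary (unpacked) struct: the compiler is free to align the fields. */
typedef struct {
    uint8_t  messageType;
    uint16_t data1;
    uint32_t data2;
} MessageA;

/* Byte offsets of each field in the wire format (assumed, see above). */
enum { OFF_TYPE = 0, OFF_DATA1 = 1, OFF_DATA2 = 3 };

/* Copy each field out of the byte buffer into its correctly typed member. */
static MessageA parse_message_a(const uint8_t *buf)
{
    MessageA msg;
    memcpy(&msg.messageType, buf + OFF_TYPE,  sizeof msg.messageType);
    memcpy(&msg.data1,       buf + OFF_DATA1, sizeof msg.data1);
    memcpy(&msg.data2,       buf + OFF_DATA2, sizeof msg.data2);
    return msg;
}
```

The rest of the code then works on naturally aligned fields, and only the parser needs to know the wire offsets.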

The Cortex M3 can handle unaligned accesses just fine. I have done this in similar packet processing systems with the M3. You don't need to do anything; you can just use the flag -fno-strict-aliasing to get rid of the warning.

For unaligned accesses, take a look at the Linux macros get_unaligned / put_unaligned.
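The get_unaligned / put_unaligned macros live inside the Linux kernel, so they are not directly available in a bare-metal project. The sketch below shows the same idea with memcpy for a 32-bit value; load_u32_unaligned and store_u32_unaligned are hypothetical stand-ins, not the kernel macros themselves, and they read and write in the host's byte order.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address with no alignment guarantee. */
static inline uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to an address with no alignment guarantee. */
static inline void store_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

With these, a field at an odd offset can be read as load_u32_unaligned(data + 3) without casting the buffer pointer.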
