
Reading two ints from an array as a long

I am working on a microcontroller project in which I have an array of unsigned ints that comes in from a communications interface. These are accessed through #define macros for convenience.

I need to be sent some unsigned long values. Instead of having to process two values from the comms register and shift them into a secondary long register, is it safe for me to use pointers and read two values out of the array at once?

I am interested in doing this because processing resources on the controller are quite limited. Is this safe? Will array values always be contiguous in memory?

Example code

...

unsigned int comms[MAX_ADDRESS];

...

#define FOO             comms[0]
#define BAR             comms[1]
#define VAL_1           comms[2]
#define VAL_1_EXT       (*(unsigned long*)(&comms[2])) // Use pointer trickery to read a long
#define VAL_2           comms[4]
#define VAL_2_EXT       (*(unsigned long*)(&comms[4]))

...

Not sure if it is relevant, but it is a chip from the MSP430 family from TI, compiler version TI 4.3.3.

It depends what you mean by "safe." It's absolutely unsafe in the sense that the C Standard says nothing about what will happen, because you are aliasing types with pointer casts. This is non-portable.

But non-portable doesn't mean non-functional. If the code is not for production use and you have good control over the development environment, you're likely to do fine with your proposal. The C Standard does guarantee that array elements are contiguous. If the compiler generates code that fetches the two (I'm guessing) 16-bit quantities from the comms registers to correctly form a 32-bit long in one instance, then it is virtually certain that:

  • It will do so in all usages.

  • Future compiler versions will do the same.

There are no guarantees, but in practice it's a reasonable bet.

To learn whether the code you're getting is correct, compile with -S (or your compiler's equivalent option for emitting assembly) and inspect the output. Write a good test to verify.

At any rate, you have taken a good approach by isolating the access code in macros (though you should drop the semi-colons at the ends).

The following macro is well-defined with respect to the C Standard.

#define VAL_1_EXT       (((unsigned long)comms[3] << 16) | (unsigned long)comms[2])

If you wrote

unsigned long x = VAL_1_EXT;

a good optimizing compiler should generate much the same code with the macro above as with your proposed one. I guess you're saying it's not a good optimizing compiler.

As pointed out in comments, this macro is not an l-value, so you can't assign to it. For that you'll need a separate macro.

#define SET_VAL_1_EXT(Val) do { \
  unsigned long x = (unsigned long)(Val); /* evaluate the argument once */ \
  comms[2] = (unsigned)x;         /* low 16 bits */ \
  comms[3] = (unsigned)(x >> 16); /* high 16 bits */ \
} while (0)
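
The read and write macros above can also be written as small functions, which may be easier to test on a host machine. This is a minimal sketch, assuming the low word is stored first (as in the macros) and using uint16_t and uint32_t from <stdint.h> to stand in for the MSP430's 16-bit unsigned int and 32-bit unsigned long. The names read_u32 and write_u32 are made up for illustration.

```c
#include <stdint.h>

/* Compose a 32-bit value from two consecutive 16-bit words, low word first.
   Only shifts and ORs are used, so no pointer cast or aliasing is involved. */
uint32_t read_u32(const uint16_t *words)
{
    return ((uint32_t)words[1] << 16) | (uint32_t)words[0];
}

/* Split a 32-bit value back into two 16-bit words, low word first. */
void write_u32(uint16_t *words, uint32_t value)
{
    words[0] = (uint16_t)(value & 0xFFFFu);
    words[1] = (uint16_t)(value >> 16);
}
```

Writing 0x12345678 this way leaves 0x5678 in the first word and 0x1234 in the second, and reading the pair back returns the original value.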

According to the standard, you have an aliasing bug, and anything may happen.

The compiler is allowed to assume there is no aliasing between the 16-bit int and 32-bit long types, and you might get surprising behavior (without warning) because you break that contract.

Just say no: use bit-shifting to compose your long from the two ints, and depend on the compiler to optimize that for you (it should not really use bit-shifting under the hood). You might want to look at the generated assembly to check whether it fails to do so.

6.5 Expressions § 7

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.

As int and long are not compatible, and none of the exceptions applies, aliasing them is forbidden.

The more modern (and the better at optimizing) your compiler is, the more likely it is that playing loose with this rule will bite you.

BTW: Most compilers implement many dialects, and GCC allows disabling strict aliasing with -fno-strict-aliasing. Be sure to disable not just the warning but the actual optimizations.
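
Another way to stay within the rule quoted above, not mentioned in this answer, is to copy the bytes with memcpy instead of casting the pointer. The sketch below assumes 16-bit words and a 32-bit result, modelled with uint16_t and uint32_t; load_u32_native is a hypothetical name. Whether the TI compiler turns the memcpy into a plain pair of loads is something you would have to check in the generated assembly.

```c
#include <stdint.h>
#include <string.h>

/* Copy the bytes of two consecutive 16-bit words into a 32-bit object.
   memcpy works on bytes, so the effective-type rule is not violated.
   The result follows the host byte order, just as the pointer cast would. */
uint32_t load_u32_native(const uint16_t *words)
{
    uint32_t value;
    memcpy(&value, words, sizeof value);
    return value;
}
```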

If you wish to do this, are confident that sizeof(int)*2==sizeof(long) on your platform, and are content with this non-portability (because this assumption is non-portable), you can (and should) use a union to move back and forth between the two types in a defined manner.

union {
    int in [2];
    long out;
};

You may either store elements of this union type in your array, writing ints to in and reading longs from out, or you can place ints from an int array into the union and then read them out two at a time as a long.

Note that if you want more portability, you can use the integer types from <stdint.h>:

union {
    int32_t in [2];
    int64_t out;
};

Then the only platform-dependent behaviours will be:

  • How signed integers are represented
  • Endianness
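
A fuller sketch of the union approach, under the assumption that the target has 16-bit words and a 32-bit long as guessed in the question (so uint16_t and uint32_t are used here rather than the wider types above). The tag word_pair and the function pair_to_u32 are hypothetical names. Reading a member other than the one last written reinterprets the bytes in C; C++ does not make the same promise.

```c
#include <stdint.h>

/* Hypothetical union: two 16-bit words overlaid on one 32-bit value. */
union word_pair {
    uint16_t in[2];
    uint32_t out;
};

/* The sizes must match for the overlay to make sense. */
_Static_assert(sizeof(uint16_t) * 2 == sizeof(uint32_t),
               "two words must fill one long");

/* Store two words and read them back as one 32-bit value.
   Which word ends up in the high half depends on the host byte order. */
uint32_t pair_to_u32(uint16_t first, uint16_t second)
{
    union word_pair p;
    p.in[0] = first;
    p.in[1] = second;
    return p.out;
}
```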

Yes, this is safe, with the following assumptions:

  • The sender of this data is sending data as you're expecting. For example, comms[2] and comms[3] together do actually make up an unsigned long value, as you expect.

  • The sender's byte order (known as endianness) and word order are what you're expecting.

Per the subsequent comment on the question, the answer is no. My original answer explains why.


It depends on whether you want completely safe and portable code or are OK with code for a specific architecture, as well as on the endianness and order of the ints.

If you are OK with architecture-specific code, then...

Arrays in C are always consecutive memory locations and always packed, and a lot of code depends on this.

On a big endian system, if you have ints in the order

high-int,low-int

each int is

high-byte,low-byte

and the bytes in memory are

high-int-high,high-int-low,low-int-high,low-int-low

which you can then dereference using a (long int*) cast. But not on a little endian system.

On a little endian system, if you have ints in the order

low-int,high-int

each int is

low-byte,high-byte

and the bytes in memory are

low-int-low,low-int-high,high-int-low,high-int-high

which you can then dereference using a (long int*) cast. But not on a big endian system.
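
The two layouts above can be checked on whatever machine runs the test. The sketch below assumes 16-bit words and a 32-bit long (modelled with uint16_t and uint32_t) and a host that is either purely big endian or purely little endian; both function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 when the host stores the low byte of a word first. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Value that reinterpreting two consecutive 16-bit words as one 32-bit
   object yields on this host: the first word is the low half on a
   little endian host and the high half on a big endian host. */
uint32_t combine_like_cast(uint16_t first, uint16_t second)
{
    if (host_is_little_endian())
        return ((uint32_t)second << 16) | first;
    return ((uint32_t)first << 16) | second;
}
```

Comparing combine_like_cast against a byte-wise copy of the same two words shows which word order the sender has to use for a given target.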

I believe casting the unsigned int pointer to an unsigned long pointer will work on the MSP430 because the MSP430 is little endian AND the MSP430 does not require 4-byte longs to be aligned on 4-byte boundaries. But don't count on this working on another platform.

And don't expect that you can also cast two consecutive bytes to an unsigned int. The MSP430 requires that 2-byte words be aligned on an even address. So if the first byte happens to be at an odd address, you will get undefined behavior when you cast it to a word.
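
When the two bytes may sit at an odd address, they can be combined with shifts instead of a cast. This is a minimal sketch, assuming the low byte comes first (as on the little endian MSP430); load_u16_le is a hypothetical name.

```c
#include <stdint.h>

/* Build a 16-bit word from two bytes, low byte first. The bytes are read
   one at a time, so the address p does not need to be even. */
uint16_t load_u16_le(const unsigned char *p)
{
    return (uint16_t)((unsigned)p[0] | ((unsigned)p[1] << 8));
}
```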
