
Padding bits in unsigned integers and bitwise operations in C89

I have a lot of code that performs bitwise operations on unsigned integers. I wrote my code with the assumption that those operations were on integers of fixed width without any padding bits. For example, an array of 32-bit unsigned integers in which all 32 bits are available for each integer.

I'm looking to make my code more portable and I'm focused on making sure I'm C89 compliant (in this case). One of the issues that I've come across is possible padded integers. Take this extreme example, taken from the GMP manual:

However on Cray vector systems it may be noted that short and int are always stored in 8 bytes (and with sizeof indicating that) but use only 32 or 46 bits. The nails feature can account for this, by passing for instance 8*sizeof(int)-INT_BIT.

I've also read about this type of padding in other places. I actually read a post on SO last night (forgive me, I don't have the link and I'm going to cite something similar from memory) where if you have, say, a double with 60 usable bits, the other 4 could be used for padding, and those padding bits could serve some internal purpose so they cannot be modified.


So let's say, for example, my code is compiled on a platform where an unsigned int type is sized at 4 bytes, each byte being 8 bits, but the most significant 2 bits are padding bits. Would UINT_MAX in that case be 0x3FFFFFFF (1073741823)?

#include <stdio.h>
#include <stdlib.h>

/* padding bits represented by underscores */
int main( int argc, char **argv )
{
    unsigned int a = 0x2AAAAAAA; /* __101010101010101010101010101010 */
    unsigned int b = 0x15555555; /* __010101010101010101010101010101 */
    unsigned int c = a ^ b; /* ?? __111111111111111111111111111111 */
    unsigned int d = c << 5; /* ??  __111111111111111111111111100000 */
    unsigned int e = d >> 5; /* ?? __000001111111111111111111111111 */

    printf( "a: %X\nb: %X\nc: %X\nd: %X\ne: %X\n", a, b, c, d, e );
    return 0;
}
  • Is it safe to XOR two integers with padding bits?
  • Wouldn't I XOR whatever the padding bits are?

I can't find this behavior covered in C89.

Furthermore, is the c variable guaranteed to be 0x3FFFFFFF, or if, for example, the two padding bits were both on in a or b, would c be 0xFFFFFFFF?

Same question with d and e. Am I manipulating the padding bits by shifting? I would expect to see the output below, assuming 32 bits with the 2 most significant bits used for padding, but I want to know if something like this is guaranteed:

a: 2AAAAAAA
b: 15555555
c: 3FFFFFFF
d: 3FFFFFE0
e: 01FFFFFF

Also, are padding bits always the most significant bits or could they be the least significant bits?


EDIT 12/19/2010 5PM EST: Christoph has answered my question. Thanks!
I had also asked (above) whether padding bits are always the most significant bits. This is addressed in the rationale for the C99 standard, and the answer is no. I am playing it safe and assuming the same for C89. Here is specifically what the C99 rationale says for §6.2.6.2 (Representation of Integer Types):

Padding bits are user-accessible in an unsigned integer type. For example, suppose a machine uses a pair of 16-bit shorts (each with its own sign bit) to make up a 32-bit int and the sign bit of the lower short is ignored when used in this 32-bit int. Then, as a 32-bit signed int, there is a padding bit (in the middle of the 32 bits) that is ignored in determining the value of the 32-bit signed int. But, if this 32-bit item is treated as a 32-bit unsigned int, then that padding bit is visible to the user's program. The C committee was told that there is a machine that works this way, and that is one reason that padding bits were added to C99.

Footnotes 44 and 45 mention that parity bits might be padding bits. The committee does not know of any machines with user-accessible parity bits within an integer. Therefore, the committee is not aware of any machines that treat parity bits as padding bits.


EDIT 12/28/2010 3PM EST: I found an interesting discussion on comp.lang.c from a few months ago.

One point made by Dietmar that I found interesting:

Let's note that padding bits are not necessary for the existence of trap representations; combinations of value bits which do not represent a value of the object type would also do.

Bitwise operations (like arithmetic operations) operate on values and ignore padding. The implementation may or may not modify padding bits (or use them internally, e.g. as parity bits), but portable C code will never be able to detect this. Any value (including UINT_MAX) will not include the padding.
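As a rough illustration of "operating on values", the hypothetical platform from the question (30 value bits in unsigned int) can be imitated on any implementation by masking results to 30 bits. This is only a sketch: MASK30, shl30 and all_ones_is_uint_max are names made up for this example, and the mask stands in for the modulo reduction that such a platform would be expected to apply to unsigned results.

```c
#include <limits.h>

/* Hypothetical: largest value of an unsigned int with 30 value bits. */
#define MASK30 0x3FFFFFFFUL

/* Left shift as the imagined 30-value-bit platform would compute it:
   the result is reduced modulo (maximum value + 1), so bits shifted
   past the top value bit are discarded and never reach any padding.
   Assumes x <= MASK30 and n < 32 (unsigned long has at least 32 bits). */
unsigned long shl30(unsigned long x, unsigned int n)
{
    return (x << n) & MASK30;
}

/* On every implementation, complementing zero yields UINT_MAX, because
   ~ is defined in terms of the value bits only. Returns 1 if that holds. */
int all_ones_is_uint_max(void)
{
    return ~0u == UINT_MAX;
}
```

With a = 0x2AAAAAAA and b = 0x15555555 held in unsigned long, a ^ b is 0x3FFFFFFF, shl30(a ^ b, 5) is 0x3FFFFFE0, and shifting that right by 5 gives 0x01FFFFFF, which matches the output the question expects.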

Where integer padding might lead to problems is if you use things like sizeof (int) * CHAR_BIT and then try to use shifts to access all these bits. If you want to be portable, either use only (unsigned) char, fixed-size integers (a C99 addition), or determine the number of value bits programmatically. This can be done at compile time with the preprocessor by comparing UINT_MAX against powers of 2, or at runtime by using bit operations.
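Both approaches might look roughly like the following C89 sketch. The names UINT_VALUE_BITS and value_bits_uint are invented for this example, and the preprocessor part checks only two common widths; anything else is reported as 0, meaning "unknown, use the run-time count".

```c
#include <limits.h>

/* Compile-time: compare UINT_MAX against (powers of 2) - 1.
   Only two common widths are tested here; extend as needed. */
#if UINT_MAX == 0xFFFFu
#define UINT_VALUE_BITS 16
#elif UINT_MAX == 0xFFFFFFFFu
#define UINT_VALUE_BITS 32
#else
#define UINT_VALUE_BITS 0 /* unknown: fall back to value_bits_uint() */
#endif

/* Run-time: shift UINT_MAX right until it reaches zero and count the
   steps. Shifts work on the value, so padding bits are never counted. */
unsigned int value_bits_uint(void)
{
    unsigned int max = UINT_MAX;
    unsigned int bits = 0;

    while (max != 0) {
        max >>= 1;
        ++bits;
    }
    return bits;
}
```

On an implementation without padding, value_bits_uint() equals sizeof (unsigned int) * CHAR_BIT; on one with padding it is smaller, and it is the number to use as the upper limit for shift counts.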

edit:

C90 does not mention integer padding at all, but as far as I can tell, 'invisible' preceding or trailing integer padding bits shouldn't violate the standard (I didn't go through all relevant sections to make sure this is really the case, though); there probably are problems with mixed padding and value bits as mentioned in the C99 rationale, because otherwise the standard would not have needed to be changed.

As to the meaning of user-accessible: padding bits are accessible insofar as you can always get at any bit of foo (including padding) by using bit operations on ((unsigned char *)&foo)[…]. Be careful when modifying the padding bits, though: the result won't change the value of the integer, but might create a trap representation nevertheless. In case of C90, this is implicitly unspecified (as in not mentioned at all); in case of C99, it's implementation-defined.
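A minimal sketch of that kind of access, reading only and never writing, could look like this. The helper name repr_bit and its bit numbering (bit 0 is the least significant bit of the first byte in memory) are choices made for this example, not anything the standard prescribes.

```c
#include <limits.h>
#include <stddef.h>

/* Return bit `index` of an object's representation, padding included.
   Bit 0 is the least significant bit of the first byte in memory.
   The caller must keep index below sizeof(object) * CHAR_BIT. */
int repr_bit(const void *obj, size_t index)
{
    const unsigned char *bytes = (const unsigned char *)obj;

    return (bytes[index / CHAR_BIT] >> (index % CHAR_BIT)) & 1;
}
```

Calling repr_bit(&foo, i) for every i below sizeof foo * CHAR_BIT walks over all bits of foo. Which of them are value bits and which are padding is not visible at this level; that depends on the implementation.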

This was not what the rationale quotation was about, though: the cited architecture represents 32-bit integers via two 16-bit integers. In case of unsigned types, the resulting integer has 32 value bits and a precision of 32; in case of signed integers, it has only 30 value bits (a precision of 30 and a width of 31): one of the sign bits of the 16-bit integers is used as the sign bit of the 32-bit integer, the other one is ignored, thus creating a padding bit surrounded by value bits. Now, if you access a 32-bit signed integer as an unsigned integer (which is explicitly allowed and does not violate the C99 aliasing rules), the padding bit becomes a (user-accessible) value bit.
