Unexpected sign extension of int32 or 32-bit pointer when converted to uint64

I compiled this code using Visual Studio 2010 (cl.exe /W4) as a C file:

#include <stdio.h>

int main( int argc, char *argv[] )
{
    unsigned __int64 a = 0x00000000FFFFFFFF;
    void *orig = (void *)0xFFFFFFFF;
    unsigned __int64 b = (unsigned __int64)orig;   /* pointer widened to 64 bits */
    if( a != b )
        printf( " problem\ta: %016I64X\tb: %016I64X\n", a, b );
    return 0;
}

There are no warnings and the result is:

problem a: 00000000FFFFFFFF b: FFFFFFFFFFFFFFFF

I suppose int orig = (int)0xFFFFFFFF would be less controversial, as I'm not assigning a pointer to an integer. However, the result would be the same.

Can someone explain to me where the C standard specifies that orig is sign-extended from 0xFFFFFFFF to 0xFFFFFFFFFFFFFFFF?

I had assumed that (unsigned __int64)orig would become 0x00000000FFFFFFFF. It appears that the conversion goes first to the signed __int64 type and only then to unsigned?

EDIT: This question has been answered: pointers are sign-extended, which is why I see this behavior in gcc and msvc. However, I don't understand why (unsigned __int64)(int)0xF0000000 sign-extends to 0xFFFFFFFFF0000000, while (unsigned __int64)0xF0000000 does not and instead gives what I want, which is 0x00000000F0000000.

EDIT: An answer to the above edit. The reason that (unsigned __int64)(int)0xF0000000 is sign-extended is, as noted by user R:

Conversion of a signed type (or any type) to an unsigned type always takes place via reduction modulo one plus the max value of the destination type.

And in (unsigned __int64)0xF0000000, the constant 0xF0000000 starts off as an unsigned integer type because it cannot fit in int. That already-unsigned value is then converted to unsigned __int64, which leaves it unchanged.

So the takeaway for me is this: when a function returns a 32-bit or 64-bit pointer as an unsigned __int64 and I want to compare against it, I must first convert the 32-bit pointer in my 32-bit application to an unsigned type before promoting it to unsigned __int64. The resulting code looks like this (but, you know, better):

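#include <stdint.h>   /* uintptr_t */

unsigned __int64 functionidontcontrol( char * );
unsigned __int64 x;
void *y = thisisa32bitaddress;
x = functionidontcontrol(str);
/* uintptr_t is unsigned, so y is zero-extended when widened for the comparison */
if( x != (uintptr_t)y )
    /* the addresses differ */ ;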



EDIT again: Here is what I found in the C99 standard, 6.3.1.3 Signed and unsigned integers:

  • 1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
  • 2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. 49)
  • 3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
  • 49) The rules describe arithmetic on the mathematical value, not the value of a given type of expression.

Converting a pointer to or from an integer is implementation-defined.

Here is how gcc does it: it sign-extends if the integer type is larger than the pointer type. This happens regardless of whether the integer is signed or unsigned, simply because that is how gcc chose to implement it.

Presumably msvc behaves similarly. Edit: the closest material I can find on MSDN suggests that converting 32-bit pointers to 64-bit also sign-extends.

From the C99 standard (§6.3.2.3/6):

Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

So you'll need to find the part of your compiler's documentation that covers this.

Integer constants (e.g., 0x00000000FFFFFFFF) are signed by default when the value fits in a signed type (a hexadecimal constant that does not fit in int becomes unsigned int instead), and a signed 32-bit value is sign-extended when assigned to a 64-bit variable. To make the 64-bit unsigned type explicit, try replacing the value assigned to a with:

0x00000000FFFFFFFFULL

Use this to avoid the sign extension:

unsigned __int64 a = 0x00000000FFFFFFFFLL;

Note the LL on the end. It gives the constant a 64-bit type from the start, so no 32-bit intermediate value is involved in the assignment.
