
Why is "cast from 'X*' to 'Y' loses precision" a hard error and what is a suitable fix for legacy code

1. Why?

Code like this used to work and it's kind of obvious what it is supposed to mean. Is the compiler even allowed (by the specification) to make it an error?

I know that it's losing precision and I would be happy with a warning. But it still has well-defined semantics (at least the downsizing cast to an unsigned type is defined) and the user just might want to do it.

2. Workaround

I have legacy code that I don't want to refactor too much because it's rather tricky and already debugged. It is doing two things:

  1. Sometimes stores integers in pointer variables. The code only casts the pointer to an integer if it stored an integer in it before. Therefore, while the cast is downsizing, the overflow never happens in reality. The code is tested and works.

    When an integer is stored, it always fits in plain old unsigned, so changing the type is not considered a good idea, and the pointer is passed around quite a bit, so changing its type would be somewhat invasive.

  2. Uses the address as a hash value. A rather common thing to do. The hash table is not large enough for extending the type to make any sense.

    The code uses plain unsigned for the hash value, but note that the more usual type size_t may still generate the error, because there is no guarantee that sizeof(size_t) >= sizeof(void *). On platforms with segmented memory and far pointers, size_t only has to cover the offset part.

So what are the least invasive suitable workarounds? The code is known to work when compiled with a compiler that does not produce this error, so I really want to do the operation, not change it.


Notes:

void *x;
int y;
union U { void *p; int i; } u;
  1. *(int*)&x and u.p = x, u.i are not equivalent to (int)x and are not the opposite of (void *)y. On big-endian architectures, the first two will return the bytes at lower addresses, while the latter will work on the low-order bytes, which may reside at higher addresses.
  2. *(int*)&x and u.p = x, u.i are both strict aliasing violations; (int)x is not.
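
To make the first note concrete, here is a small sketch that contrasts the value conversion with a read of the pointer's object representation. It uses memcpy rather than the aliasing casts above so that the comparison itself stays well defined. The names by_value, by_bytes and endianness_demo are made up for illustration, and the sketch assumes std::uintptr_t exists and that an integer converted to a pointer converts back to the same integer (implementation-defined, but true on mainstream platforms).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value conversion: keeps the low-order bits, whatever the byte order is.
inline unsigned by_value(void* x) {
    return static_cast<unsigned>(reinterpret_cast<std::uintptr_t>(x));
}

// Representation read: copies the bytes at the lowest addresses of x's
// storage, which is what *(int*)&x and the union trick look at.
inline unsigned by_bytes(void* x) {
    static_assert(sizeof(unsigned) <= sizeof(void*),
                  "sketch assumes pointers are at least as wide as unsigned");
    unsigned out;
    std::memcpy(&out, &x, sizeof out);
    return out;
}

inline void endianness_demo() {
    void* x = reinterpret_cast<void*>(std::uintptr_t{0x1234});
    // Assumed to hold on mainstream platforms (see the note above).
    assert(by_value(x) == 0x1234u);
    // by_bytes(x) is platform-dependent: it matches by_value(x) on typical
    // little-endian targets, but on a 64-bit big-endian target it reads the
    // high-order half of the pointer instead.
    (void)by_bytes(x);
}
```

Only by_value gives the same answer on both byte orders, which is why the cast is the operation I actually want.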

C++, 5.2.10:

4 - A pointer can be explicitly converted to any integral type large enough to hold it. [...]

C, 6.3.2.3:

6 - Any pointer type may be converted to an integer type. [...] If the result cannot be represented in the integer type, the behavior is undefined. [...]

So (int) p is illegal if int is 32-bit and void * is 64-bit; a C++ compiler is correct to give you an error, while a C compiler may either give an error on translation or emit a program with undefined behaviour.

You should write, adding a single conversion:

(int) (intptr_t) p

or, using C++ syntax,

static_cast<int>(reinterpret_cast<intptr_t>(p))

If you're converting to an unsigned integer type, convert via uintptr_t instead of intptr_t.
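
For the "integers stored in pointer variables" case from the question, the double conversion could be wrapped in a pair of helpers so the legacy call sites change only minimally. This is a sketch rather than the asker's code: uint_to_ptr and ptr_to_uint are hypothetical names, and it assumes that std::uintptr_t is available and that an integer stored in a pointer converts back to the same integer, which is implementation-defined but holds on mainstream platforms.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: store a small unsigned value in a pointer-typed slot.
inline void* uint_to_ptr(unsigned value) {
    static_assert(sizeof(std::uintptr_t) >= sizeof(unsigned),
                  "sketch assumes uintptr_t can hold any unsigned value");
    return reinterpret_cast<void*>(static_cast<std::uintptr_t>(value));
}

// Hypothetical helper: recover a value previously stored with uint_to_ptr.
inline unsigned ptr_to_uint(void* p) {
    const std::uintptr_t wide = reinterpret_cast<std::uintptr_t>(p);
    // Debug-build check of the invariant the legacy code relies on:
    // the slot really holds a small integer, not an address.
    assert(wide <= std::numeric_limits<unsigned>::max());
    return static_cast<unsigned>(wide);
}
```

A call such as ptr_to_uint(uint_to_ptr(42u)) is then expected to give back 42 under the assumptions above, and the assert documents that the narrowing never discards bits in practice.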

This is a tough one to solve "generically", because "loses precision" indicates that your pointers are larger than the type you are trying to store them in. That may well be "ok" in your mind, but the compiler is concerned that you will be restoring the int value back into a pointer, which has now lost the upper 32 bits (assuming we're talking about 32-bit int and 64-bit pointers; there are other possible combinations).

There is uintptr_t, which is size-compatible with whatever the pointer is on the system, so typically you can overcome the actual error with:

int x = static_cast<int>(reinterpret_cast<uintptr_t>(some_ptr));

This first converts the pointer to a large integer, and then casts the large integer to a smaller type.
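
A sketch of what the two steps keep and what they drop, under the assumption that std::uintptr_t exists. The function names narrow_pointer and survives_narrowing are invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: narrow a pointer to unsigned in two explicit steps.
inline unsigned narrow_pointer(const void* p) {
    // Step 1: pointer -> integer type wide enough to hold it.
    const std::uintptr_t wide = reinterpret_cast<std::uintptr_t>(p);
    // Step 2: unsigned narrowing is modular, so the low-order bits are kept.
    const unsigned narrow = static_cast<unsigned>(wide);
    assert(narrow == (wide & std::numeric_limits<unsigned>::max()));
    return narrow;
}

// Hypothetical helper: true if widening the narrowed value gives back the
// same integer, i.e. no upper bits were lost in step 2.
inline bool survives_narrowing(const void* p) {
    const std::uintptr_t wide = reinterpret_cast<std::uintptr_t>(p);
    return static_cast<std::uintptr_t>(static_cast<unsigned>(wide)) == wide;
}
```

With 32-bit unsigned and 64-bit pointers, survives_narrowing returns false for any address above 4 GiB, which is exactly the loss the compiler is complaining about.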

Answer for C

Converting pointers to integers is implementation-defined. Your problem is that the code you are talking about seems never to have been correct, and it probably only worked on old architectures where both int and pointers are 32 bits wide.

The only types that are supposed to convert without loss are [u]intptr_t, if they exist on the platform (usually they do). Which part of such a uintptr_t is appropriate to use for your hash function is difficult to tell; you shouldn't make any assumptions about that. I would go for something like

uintptr_t n = (uintptr_t)x;

and then

((n >> 32) ^ n) & UINT32_MAX

This can be optimized out on 32-bit architectures, and would give you traces of all other bits on 64-bit architectures.
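
Put together as a function, the folding could look like the sketch below. The names hash_pointer and bucket_for are hypothetical, and the sketch assumes std::uintptr_t exists and is at most 64 bits wide. One deliberate difference from the expression above: the value is widened to a 64-bit type before shifting, because shifting a 32-bit integer by 32 is undefined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: fold an address into 32 bits.
inline std::uint32_t hash_pointer(const void* p) {
    static_assert(sizeof(std::uintptr_t) <= sizeof(std::uint64_t),
                  "sketch assumes pointers fit in 64 bits");
    // Widen first so that the shift by 32 is always well defined.
    const std::uint64_t n = reinterpret_cast<std::uintptr_t>(p);
    // Mix the upper half into the lower half, then keep the low 32 bits.
    return static_cast<std::uint32_t>((n >> 32) ^ n);
}

// Hypothetical helper: pick a bucket in a table of bucket_count entries.
inline unsigned bucket_for(const void* p, unsigned bucket_count) {
    assert(bucket_count != 0);
    return static_cast<unsigned>(hash_pointer(p) % bucket_count);
}
```

On a target with 32-bit pointers the upper half is always zero, so the fold reduces to the address itself; on 64-bit targets every bit of the address influences the result.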

For C++ basically the same should apply; just the cast would be reinterpret_cast<std::uintptr_t>(x).
