What happens when casting `int*` to pointer to bitset
I am doing a simple test, calculating the number of 1's in a number's binary representation:
int x;
while (cin >> x) {
    bitset<32> xBitmap = {0};
    xBitmap = static_cast<bitset<32>>(x);
    std::cout << xBitmap.count() << std::endl;
}
The above code produces the right result, but when I use a pointer to a bitset, something unexpected happens:
bitset<32>* xBitmap = nullptr;
xBitmap = static_cast<bitset<32>*>((void*)&x);
std::cout << xBitmap->count() << std::endl;
This code produces random results; every call to count() gives a different result. I am guessing this is a memory leak? But why would it cause a memory leak?
You have a variable of type int and you perform a static_cast to convert the int to a std::bitset<32>. From the specification of static_cast (link):
static_cast<new_type>(expression)
- If there is an implicit conversion sequence from expression to new_type, or if overload resolution for a direct initialization of an object or reference of type new_type from expression would find at least one viable function, then static_cast<new_type>(expression) returns the imaginary variable Temp initialized as if by new_type Temp(expression);, which may involve implicit conversions, a call to the constructor of new_type or a call to a user-defined conversion operator. ...
Consider the following example:
#include <iostream>
class A {
public:
    A(int x) { std::cout << "A " << x << std::endl; }
};
int main(void) {
    int y = 13;
    A a = static_cast<A>(y);
}
Running this program will print A 13. This means that in this case A a = static_cast<A>(y) is equivalent to A a = A(y). This is because y is of type int and there is a constructor for A that takes an int.
If we changed the example so that the constructor for A takes a std::string, the program would no longer compile:
#include <iostream>
#include <string>
class A {
public:
    A(std::string x) { std::cout << "A " << x << std::endl; }
};
int main(void) {
    int y = 13;
    A a = static_cast<A>(y);
}
The compiler would complain about being unable to convert an int to A.
Consider a third example:
#include <iostream>
class A {
public:
    A(int x) { std::cout << "A " << x << std::endl; }
};
class B {
public:
    B(A a) { std::cout << "B" << std::endl; }
};
int main(void) {
    int y = 13;
    B b = static_cast<B>(y);
}
This example compiles and prints:
A 13
B
Here the direct initialization B Temp(y) selects the constructor of B that takes an A, and the argument y is converted to A by what the specification calls an "implicit conversion sequence". While there is no constructor for B that takes an int, there is a constructor for B that takes an A, and there is a constructor for A that takes an int. So static_cast<B>(y) resolves to B(A(y)). If we added the explicit keyword to the constructor for A, the example would no longer compile:
explicit A(int x) { std::cout << "A " << x << std::endl; }
This is because the explicit keyword on a constructor forbids the constructor from being used in an implicit conversion sequence.
These examples allow us to understand what happens when we call static_cast<std::bitset<32>>(x). The std::bitset<N> class has a constructor that takes an unsigned integer (reference): unsigned long before C++11 and unsigned long long since C++11. The constructor is not marked with the explicit keyword, so it can participate in an implicit conversion sequence. An int can be implicitly converted to that unsigned type. So static_cast<std::bitset<32>>(x) resolves to std::bitset<32>((unsigned long long)x), which creates a new instance of std::bitset<32> with the value of x passed to the constructor.
This is why your first example works.
You have a variable of type int. You create a pointer to this variable (&x) and then you cast the pointer to a void pointer. Then you static_cast the void pointer to a std::bitset<32> pointer. From the specification of static_cast (link):
- A prvalue of type pointer to void (possibly cv-qualified) can be converted to pointer to any object type.
So unlike your first example, your second example does not create a new instance of std::bitset<32>. Rather, xBitmap points to the memory address of x, but interprets this memory as a std::bitset<32>. However, there is a problem with that: the memory size of a std::bitset<32> may not be equal to the memory size of an int. This is implementation-specific, so different implementations of the C++ standard library may have different sizes for std::bitset<32>.
On my system, using the C++ standard library that comes with GCC, the following code prints 8:
std::cout << sizeof(std::bitset<32>) << std::endl;
This means that a std::bitset<32> takes 8 bytes of memory. While 32 bits could of course be represented by only 4 bytes, it seems that on my system std::bitset always allocates multiples of 8 bytes (i.e. the size of an unsigned long). So for example sizeof(std::bitset<1>) is also 8, and so is sizeof(std::bitset<64>), but then sizeof(std::bitset<65>) is 16 and so is sizeof(std::bitset<128>), but then sizeof(std::bitset<129>) is 24 and so on.
Whereas (on my system) an int takes only four bytes. So when we take the memory of an int but interpret it as a std::bitset<32>, we read 8 bytes (the size of a std::bitset<32>) from a memory allocation that is only 4 bytes in size. So we read an additional four bytes after the memory of the int. There could be anything in this memory, so the read results in undefined behavior. This is why you get random values when you call count(). It counts the bits set in the int, but also the bits set in the four bytes after it. Even where the sizes happen to match, no std::bitset<32> object exists at that address, so reading through the pointer is still undefined behavior.
Modern compilers such as GCC and Clang have a feature called AddressSanitizer (ASan), which can help you debug such memory issues. For GCC, it can be enabled with the -fsanitize=address flag:
$ g++ -fsanitize=address test.cpp
$ ./a.out
123
=================================================================
==16616==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd1c7c5f80 at pc 0x55b34fd334b3 bp 0x7ffd1c7c5f00 sp 0x7ffd1c7c5ef0
READ of size 8 at 0x7ffd1c7c5f80 thread T0
So in this case, AddressSanitizer detects that your program attempts to read past the end of an allocation.
So with regard to the part of your question about memory leaks: this is not a memory leak, but a buffer overflow (an out-of-bounds read). A memory leak would be when you allocate memory and then forget to free it.