What happens when casting `int*` to pointer to bitset

I am doing a simple test, calculating the number of 1's in a number's binary representation:

  int x;
  while (cin >> x) {
      bitset<32> xBitmap = {0};
      xBitmap = static_cast<bitset<32>>(x);
      std::cout << xBitmap.count() << std::endl;
  }

The above code produces the right result, but when I use a pointer to a bitset, something unexpected happens:

      bitset<32>* xBitmap = nullptr;
      xBitmap = static_cast<bitset<32>*>((void*)&x);
      std::cout << xBitmap->count() << std::endl;

This code produces random results: every call to `count()` gives a different value. I am guessing this is a memory leak? But why would it cause a memory leak?

What is happening in your first example?

You have a variable of type `int` and you perform a `static_cast` to convert the `int` to a `std::bitset<32>`. From the specification of `static_cast`:

static_cast<new_type>(expression)

  1. If there is an implicit conversion sequence from expression to new_type, or if overload resolution for a direct initialization of an object or reference of type new_type from expression would find at least one viable function, then static_cast<new_type>(expression) returns the imaginary variable Temp initialized as if by new_type Temp(expression);, which may involve implicit conversions, a call to the constructor of new_type or a call to a user-defined conversion operator.

...

Consider the following example:

#include <iostream>

class A {
public:
  A(int x) { std::cout << "A " << x << std::endl; }
};

int main(void) {
  int y = 13;
  A a = static_cast<A>(y);
}

Running this program prints `A 13`. This means that in this case `A a = static_cast<A>(y)` is equivalent to `A a = A(y)`, because `y` is of type `int` and there is a constructor for `A` that takes an `int`.

If we change the example so that the constructor for `A` takes a `std::string`, the program no longer compiles:

#include <iostream>
#include <string>

class A {
public:
  A(std::string x) { std::cout << "A " << x << std::endl; }
};

int main(void) {
  int y = 13;
  A a = static_cast<A>(y);
}

The compiler complains that it is unable to convert an `int` to `A`.

Consider a third example:

#include <iostream>

class A {
public:
  A(int x) { std::cout << "A " << x << std::endl; }
};

class B {
public:
  B(A a) { std::cout << "B" << std::endl; }
};

int main(void) {
  int y = 13;
  B b = static_cast<B>(y);
}

This example compiles and prints:

A 13
B

This is the second case named in the specification: direct initialization of `B` from `y` finds a viable constructor. While there is no constructor for `B` that takes an `int`, there is a constructor for `B` that takes an `A`, and `y` can be implicitly converted to `A` through the constructor of `A` that takes an `int`. So `static_cast<B>(y)` resolves to `B(A(y))`. If we add the `explicit` keyword to the constructor for `A`, the example no longer compiles:

  explicit A(int x) { std::cout << "A " << x << std::endl; }

This is because the `explicit` keyword on a constructor forbids the constructor from being used in an implicit conversion sequence.
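The difference between what can be converted implicitly and what can be direct-initialized (which is what `static_cast` needs) can be checked at compile time with the standard type traits. The following is a minimal sketch; the struct names `ExplicitA` and `B2` are made up for illustration and mirror the `explicit` variant described above.

```cpp
#include <type_traits>

struct A { A(int) {} };   // converting constructor, as in the examples above
struct B { B(A) {} };

struct ExplicitA { explicit ExplicitA(int) {} };   // hypothetical explicit variant
struct B2 { B2(ExplicitA) {} };

// static_cast<B>(y) compiles: B can be direct-initialized from an int,
// because the int argument converts implicitly to A.
static_assert(std::is_constructible<B, int>::value, "B(y) is valid");

// `B b = y;` does not compile: that would need two user-defined
// conversions (int -> A -> B) in one implicit conversion sequence.
static_assert(!std::is_convertible<int, B>::value, "no implicit int -> B");

// With an explicit constructor, int no longer converts implicitly to
// ExplicitA, so B2 cannot be built from an int at all.
static_assert(!std::is_constructible<B2, int>::value, "B2(y) is invalid");
```

If any of these assumptions were wrong, the translation unit would fail to compile at the corresponding `static_assert`.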

These examples allow us to understand what happens when we call `static_cast<std::bitset<32>>(x)`. The `std::bitset<N>` class has a constructor that takes an `unsigned long long` (an `unsigned long` before C++11). The constructor is not marked with the `explicit` keyword, so it can participate in an implicit conversion sequence. An `int` can be implicitly converted to an `unsigned long long`. So `static_cast<std::bitset<32>>(x)` resolves to `std::bitset<32>((unsigned long long)x)`: it creates a new instance of `std::bitset<32>` with the value of `x` passed to the constructor.

This is why your first example works.
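As a small sketch of the same idea, the conversion can be written out explicitly instead of going through `static_cast`. The helper name `count_set_bits` is hypothetical, and the code assumes C++11 or later, where the constructor parameter is `unsigned long long`.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical helper: counts the 1 bits in the low 32 bits of x.
std::size_t count_set_bits(int x) {
    // Spell out the int -> unsigned long long conversion that the
    // static_cast in the question performs implicitly. The bitset
    // constructor keeps only the low 32 bits of the converted value.
    std::bitset<32> bits(static_cast<unsigned long long>(x));
    return bits.count();
}
```

A negative input such as `-1` converts to an all-ones `unsigned long long`, so the low 32 bits are all set and the function returns 32.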

What is happening in your second example?

You have a variable of type `int`. You create a pointer to this variable (`&x`) and cast that pointer to a `void` pointer. Then you `static_cast` the `void` pointer to a `std::bitset<32>` pointer. From the specification of `static_cast`:

  1. A prvalue of type pointer to void (possibly cv-qualified) can be converted to pointer to any object type.

So unlike your first example, your second example does not create a new instance of `std::bitset<32>`. Rather, `xBitmap` points to the memory address of `x`, but interprets this memory as a `std::bitset<32>`. There is a problem with that: the size of a `std::bitset<32>` may not be equal to the size of an `int`. This is implementation-specific, so different implementations of the C++ standard library may have different sizes for `std::bitset<32>`.

On my system, using the C++ standard library that comes with GCC, the following code prints `8`:

std::cout << sizeof(std::bitset<32>) << std::endl;

This means that a `std::bitset<32>` takes 8 bytes of memory. While 32 bits could of course be represented in only 4 bytes, it seems that on my system `std::bitset` always occupies a multiple of 8 bytes (i.e. the size of an `unsigned long`). For example, `sizeof(std::bitset<1>)` is also 8, and so is `sizeof(std::bitset<64>)`; `sizeof(std::bitset<65>)` and `sizeof(std::bitset<128>)` are 16; `sizeof(std::bitset<129>)` is 24, and so on.
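To see how this looks on another system, the sizes can be queried directly. The sketch below uses a hypothetical helper, `bitset_bytes`, and only asserts what has to hold everywhere (the object must be large enough to hold its bits); the actual numbers are implementation-specific and may differ from the ones quoted above.

```cpp
#include <bitset>
#include <climits>
#include <cstddef>
#include <iostream>

// Hypothetical helper: storage size of std::bitset<N> on this implementation.
template <std::size_t N>
constexpr std::size_t bitset_bytes() {
    return sizeof(std::bitset<N>);
}

// The object has to be able to hold at least N bits; how much padding the
// library adds beyond that is up to the implementation.
static_assert(bitset_bytes<32>() * CHAR_BIT >= 32, "must hold 32 bits");
static_assert(bitset_bytes<65>() * CHAR_BIT >= 65, "must hold 65 bits");

// Prints the sizes discussed above next to sizeof(int) for comparison.
void print_bitset_sizes() {
    std::cout << "int:         " << sizeof(int) << '\n'
              << "bitset<1>:   " << bitset_bytes<1>() << '\n'
              << "bitset<32>:  " << bitset_bytes<32>() << '\n'
              << "bitset<64>:  " << bitset_bytes<64>() << '\n'
              << "bitset<65>:  " << bitset_bytes<65>() << '\n'
              << "bitset<129>: " << bitset_bytes<129>() << '\n';
}
```

Calling `print_bitset_sizes()` from `main` shows whether `sizeof(std::bitset<32>)` and `sizeof(int)` differ on a given toolchain.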

An `int`, on the other hand, takes only four bytes on my system. So when we take the memory of an `int` but interpret it as a `std::bitset<32>`, we read 8 bytes (the size of a `std::bitset<32>`) from an object that is only 4 bytes in size. That means we read an additional four bytes after the memory of the `int`. There could be anything in this memory, so the read results in undefined behavior. This is why you get random values when you call `count()`: it counts the set bits in the `int`, but also the set bits in the four bytes after it. Note that even on an implementation where the sizes happen to match, accessing an `int` through a `std::bitset<32>*` is still undefined behavior, because no `std::bitset<32>` object exists at that address.
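If the goal really is to look at the bytes stored at an `int*`, one well-defined alternative is to copy exactly `sizeof(int)` bytes into an unsigned integer and build a real `std::bitset<32>` from that value. This is only a sketch: the helper name `count_bits_at` is hypothetical, and it assumes a 32-bit `int`, which the `static_assert` checks.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: counts the 1 bits of the int that p points to.
std::size_t count_bits_at(const int* p) {
    static_assert(sizeof(std::uint32_t) == sizeof(int),
                  "this sketch assumes a 32-bit int");
    std::uint32_t raw;
    // Copy exactly sizeof(int) bytes; nothing beyond the int is read.
    std::memcpy(&raw, p, sizeof raw);
    // Construct an actual bitset object from the copied value.
    return std::bitset<32>(raw).count();
}
```

For example, with `int x = 123;`, the call `count_bits_at(&x)` returns 6, since 123 is `1111011` in binary.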

Modern compilers such as GCC and Clang have a feature called AddressSanitizer (ASan), which can help you debug such memory issues. For GCC, it can be enabled with the `-fsanitize=address` flag:

$ g++ -fsanitize=address test.cpp
$ ./a.out
123
=================================================================
==16616==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd1c7c5f80 at pc 0x55b34fd334b3 bp 0x7ffd1c7c5f00 sp 0x7ffd1c7c5ef0
READ of size 8 at 0x7ffd1c7c5f80 thread T0

So in this case, AddressSanitizer detects that your program attempts to read past the end of an object on the stack.

As for the part of your question about memory leaks: this is not a memory leak, but a buffer overflow (an out-of-bounds read). A memory leak is what happens when you allocate memory and then never free it.
