
Casting and writing in pointer array reports misaligned address with clang sanitizer

I'm using a char* array to store different data types, as in the following example:

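#include <cstdint>

int main()
{
    char* arr = new char[8];
    // &arr[1] is one byte past the start of the allocation, so it is not 4-byte aligned.
    *reinterpret_cast<uint32_t*>(&arr[1]) = 1u;
    delete[] arr;
    return 0;
}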

Compiling with clang and running under UndefinedBehaviorSanitizer reports the following error:

runtime error: store to misaligned address 0x602000000011 for type 'uint32_t' (aka 'unsigned int'), which requires 4 byte alignment

I suppose I could do it another way, but why is this undefined behavior? What concepts are involved here?

You cannot cast an arbitrary char* to uint32_t* and write through it, even if it points to an array large enough to hold a uint32_t.

There are a couple of reasons why.

The practical answer:

uint32_t generally requires 4-byte alignment: its address should be a multiple of 4.

char has no such restriction. It can live at any address.

That means an arbitrary char* is unlikely to be properly aligned for a uint32_t. In the question, &arr[1] is one byte past the start of the allocation, which is why the sanitizer reports an address ending in ...11.
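As a rough illustration of the alignment requirement, the check below tests whether an address is a multiple of alignof(T). The helper name is_aligned_for is hypothetical (it is not part of the standard library), and the sketch assumes a mainstream platform where converting a pointer to std::uintptr_t yields its numeric address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a standard function): returns true if the address
// held in `p` is a multiple of alignof(T).
// Assumes pointer-to-integer conversion yields the numeric address, which is
// implementation-defined but holds on mainstream platforms.
template <typename T>
bool is_aligned_for(const void* p)
{
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

With a buffer that starts on a 4-byte boundary, is_aligned_for<std::uint32_t>(&arr[0]) would be expected to hold, while the same check on &arr[1] would not.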

The language-lawyer answer:

Aside from the alignment issue, your code exhibits undefined behavior because you're violating the strict aliasing rules. No uint32_t object exists at the address you're writing to, but you're treating the memory as if one were there.

In general, while a char* may point to any object and be used to read its byte representation, a T* for an arbitrary type T cannot be pointed at an array of bytes and used to write an object's byte representation into it.
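The permitted direction can be sketched as follows: inspecting an existing uint32_t through an unsigned char pointer. The helper name bytes_of is hypothetical and exists only for this illustration; the byte order it returns depends on the platform's endianness.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper for illustration: copies out the bytes that make up `value`.
std::array<unsigned char, sizeof(std::uint32_t)> bytes_of(const std::uint32_t& value)
{
    // Allowed: any object may be examined through an unsigned char pointer.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);

    std::array<unsigned char, sizeof(std::uint32_t)> out{};
    for (std::size_t i = 0; i < out.size(); ++i)
    {
        out[i] = p[i];
    }
    return out;
}
```

The reverse cast (from a byte array to uint32_t*) is the one the question uses, and it is the one the aliasing rules do not permit.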


No matter the reason for the error, the way to fix it is the same:

If you don't care about treating the bytes as a uint32_t and are just serializing them (to send over a network, or write to disk, for example), then you can std::copy the bytes into the buffer:

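// Requires <algorithm> and <cstdint>. BUFFER_SIZE is assumed to be a constant
// large enough to hold everything written below.
char buffer[BUFFER_SIZE] = {};
char* buffer_pointer = buffer;  // next write position

// Copy the bytes of foo into the buffer, then advance the write position.
uint32_t foo = 123;
char* pfoo = reinterpret_cast<char*>(&foo);
std::copy(pfoo, pfoo + sizeof(foo), buffer_pointer);
buffer_pointer += sizeof(foo);

// Same for bar.
uint32_t bar = 234;
char* pbar = reinterpret_cast<char*>(&bar);
std::copy(pbar, pbar + sizeof(bar), buffer_pointer);
buffer_pointer += sizeof(bar);
// repeat as needed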
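The same byte-copy idea also covers the original question's case of writing at an odd offset such as arr[1]. Below is a minimal sketch using std::memcpy in place of std::copy; the helper names store_u32 and load_u32 and their parameters are assumptions made for this example, not part of the original answer. Values are stored in the host's byte order, so this alone does not make the data portable between machines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: writes the bytes of `value` at `offset` inside `buffer`.
// No alignment is required because only bytes are copied.
void store_u32(char* buffer, std::size_t size, std::size_t offset, std::uint32_t value)
{
    assert(offset <= size && size - offset >= sizeof(value));
    std::memcpy(buffer + offset, &value, sizeof(value));
}

// Hypothetical helper: reads a uint32_t back from `offset` inside `buffer`.
std::uint32_t load_u32(const char* buffer, std::size_t size, std::size_t offset)
{
    std::uint32_t value = 0;
    assert(offset <= size && size - offset >= sizeof(value));
    std::memcpy(&value, buffer + offset, sizeof(value));
    return value;
}
```

For example, store_u32(arr, 8, 1, 1u) followed by load_u32(arr, 8, 1) should round-trip the value without ever forming a misaligned uint32_t*.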

If you do want to treat those bytes as a uint32_t (if you're implementing a std::vector-like data structure, for example), then you will need to ensure the buffer is properly aligned, and use placement-new:

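// Requires <cstdint>, <new>, and <type_traits>. BUFFER_SIZE is the number of slots.
std::aligned_storage_t<sizeof(uint32_t), alignof(uint32_t)> buffer[BUFFER_SIZE];
uint32_t foo = 123;
uint32_t* new_uint = new (&buffer[0]) uint32_t(foo);          // construct in slot 0
uint32_t bar = 234;
uint32_t* another_new_uint = new (&buffer[1]) uint32_t(bar);  // construct in slot 1
// repeat as needed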
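Since std::aligned_storage is deprecated as of C++23, an alternative is an alignas-qualified byte array. The following is only a sketch under stated assumptions: the type U32Slots, its capacity of 4, and the construct member are all hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical fixed-capacity storage for uint32_t objects (illustration only).
struct U32Slots
{
    static constexpr std::size_t capacity = 4;  // assumed size for the example

    // Raw bytes, aligned so that every slot is a valid address for a uint32_t.
    alignas(std::uint32_t) unsigned char storage[capacity * sizeof(std::uint32_t)];

    // Starts the lifetime of a uint32_t in slot `i` and returns a pointer to it.
    std::uint32_t* construct(std::size_t i, std::uint32_t value)
    {
        assert(i < capacity);
        return new (storage + i * sizeof(std::uint32_t)) std::uint32_t(value);
    }
};
```

The pointer returned by placement-new refers to a real uint32_t object, so reading and writing through it is well defined. uint32_t is trivially destructible, so no explicit destructor call is needed here; a container of non-trivial types would have to destroy its elements explicitly.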

