
Is there any way of accessing arbitrary data of known size as a char array in a constexpr/consteval context?

I'm trying to implement something that will take in arbitrary pieces of data (which are known at compile time) and calculate their CRC in a consteval function, so I can use it to, e.g., index such data with integer keys without any runtime overhead. I have it working when the input is a char string literal, but I'm struggling to make it work when the input is a wchar_t string literal.

I'm getting a fairly cryptic error...

error: accessing value of '"T\000e\000s\000t\000\000"' through a 'const char' glvalue in a constant expression

... which seems to be caused by using reinterpret_cast within a constexpr context (which is apparently not allowed).

My question is: is there any way of interpreting arbitrary data as a plain old array of bytes anyway? I don't care how ugly or non-portable it is (as long as it all happens at compile time). For now, just solving the case with an array of wchar_t as input would be enough. Obviously, I could "just" reimplement the CRC calculations for each type I want to handle separately, but I would rather not do that if at all possible (and indeed it would be quite tricky for anything more complex than an array of POD).

For reference, the failing code is as follows:

// Details of CRCInternal omitted for brevity
template <size_t len> consteval uint32_t CRC32(const char (&str)[len])
{
    return CRCInternal::crc32<len - 1>(str) ^ 0xFFFFFFFFu;
}

template <size_t len> consteval uint32_t CRC32FromWide(const wchar_t (&filename)[len])
{
    return CRC32(reinterpret_cast<const char(&)[len * sizeof(wchar_t)]>(filename));
}

int main()
{
    CRC32FromWide(L"Test"); // <==== Error
}

The C++ object model is usually a fiction, an agreement between the programmer writing the code and the compiler generating the binary executable. To the executable, objects don't exist; it's just bits stored in memory. As such, you can exploit the fact that C++ has dozens of back-doors that can be used to effectively pretend that the object model isn't real. Many of these are stated to exhibit undefined behavior, but no compiler is going to check for these violations of the object model and stop you. You broke your end of the contract, but the compiler wasn't paying attention, so you get away with it.

This is not the case in constant expression evaluation. A compiled executable runs on the CPU; constant expression evaluation runs within the compiler. The object model doesn't have to map to "bits" or "memory" or anything like it; it can be a real object model with full lifetime tracking and analysis.

The C++ standard therefore requires that, during constant evaluation, if you do anything that exhibits core-language UB, the compiler must detect it: the expression is not a constant expression, so the program is ill-formed wherever a constant expression is required. Also, constexpr code is just flat-out forbidden from using the biggest back-door of all: reinterpret_cast.

At compile-time, objects aren't bytes in storage. So you don't get to treat them as if they were.

This is especially important because the execution environment of the compiler and the execution environment of the eventual binary don't have to be the same. If you're doing development for some embedded system, the endianness of the CPU you're targeting may not match the endianness of the CPU that your compiler executes on. So if you were able to access any compile-time data as just bytes, you'd get a different answer at compile-time than you would at runtime.

That's bad.

C++20's std::bit_cast exists and can help, but even that can't do everything. A type is only suitable for constexpr bit_cast-ing if it is TriviallyCopyable and does not contain pointers, references, unions, or volatile members, and this applies to both the source and the destination type. Pointers are excluded because compile-time pointers aren't just addresses; they're some complex data type that has to remember what object it points to (otherwise, it would be impossible to detect when you static_cast them to some unrelated type and attempt to access the object through the wrong type).

But if you restrict your types to those which are constexpr bit_cast-able, then you can bit_cast them to a byte array of the same size.

Note that constexpr bit_cast is not the easiest thing to implement, precisely because it has to make the source object's data work as if it were executing on the target CPU and environment, not the one the compiler is executing within. So if the target is a big-endian machine and the compiler's host is little-endian, constexpr bit_cast must do endian conversion, and it must do such conversion with specific knowledge of what each component type of the source and destination objects is.
