简体   繁体   中英

What is *(uint32_t *) &buffer[index]?

In some supposed-C++ code I found, I have buffer defined as const void *buffer; (it's arbitrary binary data that, I think, gets interpreted as a stream of 32-bit unsigned integers) and in many places, I have

*(uint32_t *) &buffer[index]

where index is some kind of integer (I think it was long or unsigned long and got swept up in my replacing those with int32_t and uint32_t when I was making the code work on a 64-bit system).

I recognize that this is taking the address of buffer ( &buffer ), casting it as a pointer to a uint32_t , and dereferencing that, at least based on this question ... but then I'm confused by how the [index] part interacts with that or where I missed inserting the [index] part in between the steps I listed.

What, conceptually, is this doing? Is there some way I could define another variable to be a better type, with the casting there once, and then use that, rather than having this complicated expression throughout the code? Is this actually C++ or is this C99?

edit: The first couple of lines of the code are:

const void *buffer = data.bytes;
if (ntohl(*(int32_t *) buffer) != 'ttcf') {
    return;
}
uint32_t ttf_count = ntohl(*(uint32_t *) &buffer[0x08]);

where data.bytes has type const void * . Before I was getting buffer from data.bytes , it was char * .

edit 2: Apparently, having const void *buffer work is not normal C (though it absolutely works in my situation), so if it makes more sense, assume it's const char *buffer .

Putting parenthesis in place to make the order of operations more explicit:

*((uint32_t *) &(buffer[index]))

So you're treating buffer as an array, however because buffer is a void * you can't dereference it directly.

Assuming you want to treat this buffer as an array of uint32_t , what you want to do is this:

((uint32_t *)buffer)[index]

Which can also be written as:

*((uint32_t *)buffer + index)

EDIT:

If index is the byte offset in the buffer, that changes things. In that case, I'd recommend defining the buffer as const char * instead of const void * . That way, you can be sure the dereferencing of the array is working properly.

So to break down the expression:

*(uint32_t *) &buffer[index]

You're going index bytes into buffer : buffer[index]

Then taking the address of that byte: &buffer[index]

Then casting that address to a uint32_t : (uint32_t *) &buffer[index]

Then dereferencing the uint32_t value: *(uint32_t *) &buffer[index]

Lots of issues here! First of all, a void * cannot be dereferenced. buffer[index] is illegal in ISO C, although some compilers apparently have an extension that will treat it as (void)((char *)buffer)[index] .

You suggest in comments that the code originally used char * - I recommend you leave it that way. Assuming buffer returns to being const char * :

if (ntohl(*(int32_t *) buffer) != 'ttcf') { return; }

The intent here is to pretend that the first four bytes of buffer contain an integer; read that integer, and compare it to 'ttcf' . The latter is a multibyte character constant, the behaviour of which is implementation-defined . It could represent four characters 't', 't', 'c', 'f' , or 'f', 'c', 't', 't' , or in fact anything else at all of type int .

A greater problem is that pretending a buffer contains an int when it did not actually get written via an expression of type int violates the strict aliasing rule . This is unfortunately a common technique in older code, but even since the first C standard it has caused undefined behaviour. If you use a compiler that performs type-based aliasing optimization it could wreck your code.

A way to write this code avoiding both of those problems is:

if ( memcmp(buffer, "ttcf", 4) ) { return; }

The later line uint32_t ttf_count = ntohl(*(uint32_t *) &buffer[0x08]); has similar issues. In this case there is no doubt that the best fix is:

uint32_t ttf_count; 
memcpy(&ttf_count, buffer + 0x08, sizeof ttf_count);
ttf_count = ntohl(ttf_count);

As discussed in comments, you could make an inline function to keep this tidy. In my own code I do something like:

static inline uint32_t be_to_uint32(void const *ptr)
{
    unsigned char const *p = ptr;
    return p[0] * 0x1000000ul + p[1] * 0x10000ul + p[2] * 0x100 + p[3];
}

and a similar version le_to_uint32 that reads bytes in the opposite order; then I use whichever of those corresponds to the input format instead of using ntohl .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM