
Returning 16-bits given a pointer in C

A chunk is represented by a 64-bit long integer, which is broken into 4 16-bit sections.

I need to return a 16-bit section using the function below.

unsigned short get_16bitsection(unsigned long *start, int index) {
// Fill this in 
}

It is tempting to use casts to achieve this, but it is a common misconception that "everything is just bytes" and thus that you can do that safely. A rule called strict aliasing actually prohibits doing so. Your code may appear to work, particularly on older and less sophisticated compilers, but in the age of heavy optimisations you are really playing with fire by violating the language rules like that.

Instead, you should copy the bytes you need into a uint16_t, then return it:

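#include <stdint.h>   /* uint16_t, uint64_t */
#include <string.h>   /* memcpy */

uint16_t get_16bitsection(uint64_t *start, int index) {
  uint16_t result;
  /* Step through the chunk byte-wise and copy out the two bytes of the requested section. */
  memcpy(&result, (char*)start + index*sizeof(uint16_t), sizeof(uint16_t));
  return result;
}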

Here I cast to char* so that we can navigate byte-wise through your chunk (this aliasing is a specifically permitted exception to the usual strict-aliasing rule), then apply an offset of index*sizeof(uint16_t) to reach the desired index (assuming little endian, which you have specified). Finally, we copy the bytes into result, and return it.
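To make the offset arithmetic concrete, here is a minimal sketch of the same memcpy technique with a const pointer and a range check. The name get_16bitsection_checked and the choice to return 0 for an out-of-range index are assumptions made for illustration, not part of the original answer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of the function above: const pointer plus a range check. */
uint16_t get_16bitsection_checked(const uint64_t *start, int index) {
    uint16_t result = 0;

    /* Assumption: an out-of-range index yields 0 rather than reading past the chunk. */
    if (index < 0 || index > 3) {
        return 0;
    }

    /* Byte offset of the section: 0, 2, 4 or 6 bytes from the start of the chunk. */
    size_t offset = (size_t)index * sizeof result;
    memcpy(&result, (const unsigned char *)start + offset, sizeof result);
    return result;
}
```

On a little-endian host, a chunk holding 0x123456789abcdef0 would give 0xdef0 for index 0 and 0x1234 for index 3.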

If you're concerned about performance, don't be. You were already copying a uint16_t from local scope into the calling scope; just now it has a name. And if this function is any slower than the aliasing-violating version, then that's evidence that you've confused the optimiser into going too far.

Just use a union.

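#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* uint64_t rather than long int, since long may be only 32 bits wide. */
    uint64_t x = 0x123456789abcdef0;

    union {
        uint64_t x;
        uint16_t arr[4];
    } c;

    c.x = x;   /* write the 64-bit member, then read the 16-bit view */
    printf("%04x %04x %04x %04x\n", c.arr[0], c.arr[1], c.arr[2], c.arr[3]);
    return 0;
}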

Result (on a little-endian machine):

def0 9abc 5678 1234
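The union idea can also be wrapped in the function the question asks for. This is only a sketch: the name get_16bitsection_union is made up here, the caller is assumed to pass an index from 0 to 3, and the section order follows the host's byte order. Reading a union member other than the one last written is permitted in C (C99 and later), but not in C++.

```c
#include <stdint.h>

/* Hypothetical wrapper: reads one 16-bit section through a union. */
uint16_t get_16bitsection_union(const uint64_t *start, int index) {
    union {
        uint64_t whole;
        uint16_t parts[4];
    } c;

    c.whole = *start;       /* write the 64-bit member ... */
    return c.parts[index];  /* ... then read the 16-bit view; index must be 0..3 */
}
```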

Returning 16-bits given a pointer
A chunk is represented by a 64-bit long integer, which is broken into 4 16-bit sections

To access the data in an endian-independent, portable way and retrieve section 0 (the least significant 16 bits) through section 3 (the most significant 16 bits), use >>.


As unsigned long may be only 32 bits wide, I recommend unsigned long long or uint_least64_t.
Consider making the pointer const so that the function can also be used on const data.

unsigned short get_16bitsection(const unsigned long long *start, int index) {
  #define MASK_16BIT 0xFFFFu
  return MASK_16BIT & (*start >> (16*index));
} 
  • The mask is useful on rare machines where unsigned short is not 16 bits wide. In any case, I prefer a mask over a cast: it is a gentler way to reduce the range.
  • Alternatively, use a cast, (unsigned short) or (uint16_t), though this is slightly less portable, as uint16_t may not exist and unsigned short may be wider than 16 bits.
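For comparison, the cast alternative from the second bullet might look like the sketch below. It assumes the platform provides uint16_t and uint64_t, and the name get_16bitsection_cast is hypothetical.

```c
#include <stdint.h>

/* Sketch of the cast alternative; assumes uint16_t and uint64_t are available. */
uint16_t get_16bitsection_cast(const uint64_t *start, int index) {
    /* Converting to an unsigned 16-bit type keeps only the low 16 bits
       (the value is reduced modulo 65536), so no explicit mask is needed. */
    return (uint16_t)(*start >> (16 * index));
}
```

Because the section is selected by shifting the value rather than by addressing its bytes, index 0 is the least significant section on any host, whatever its byte order.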

Maybe I'm missing the point here but it could be as easy as this:

unsigned short get_16bitsection_be(unsigned long *start, int index) {
  unsigned short *p = (unsigned short*) start;

  return p[3 - index];
}

unsigned short get_16bitsection_le(unsigned long *start, int index) {
  unsigned short *p = (unsigned short*) start;

  return p[index];
}

The difference between big and little endian is relevant here: on a big-endian machine the least significant section is stored last, hence p[3 - index].
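If the byte order is not known in advance, one option is to detect it at run time and pick the slot accordingly. The sketch below is an illustration under assumptions: the names host_is_little_endian and get_16bitsection_host are hypothetical, the host is assumed to be either purely little or purely big endian, and the bytes are copied with memcpy instead of being read through a cast pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void) {
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);  /* look at the lowest-addressed byte */
    return first_byte == 1;
}

/* Hypothetical wrapper: index 0 is always the least significant section. */
uint16_t get_16bitsection_host(const uint64_t *start, int index) {
    uint16_t result;
    /* Little endian stores section 0 first; big endian stores it last. */
    int slot = host_is_little_endian() ? index : 3 - index;
    memcpy(&result, (const unsigned char *)start + (size_t)slot * sizeof result,
           sizeof result);
    return result;
}
```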

Note you should consider using stdint.h to give these types more meaningful names and make it clear what you're actually doing:

uint16_t get_16bitsection_le(uint64_t *start, int index) {
  uint16_t *p = (uint16_t*) start;

  return p[index];
}

uint16_t get_16bitsection_be(uint64_t *start, int index) {
  uint16_t *p = (uint16_t*) start;

  return p[3 - index];
}

You were on the right track with your second approach, but that code is heavily cluttered by a lot of things that don't matter, plus the * 8 offset which makes no sense.
