Safe way to cast a uint16_t to a wchar_t

I'm trying to clean up some code, and I want to know whether the following is a safe way to cast a uint16_t to a wchar_t.

#if !defined(MARKUP_SIZEOFWCHAR)
#if __SIZEOF_WCHAR_T__ == 4 || __WCHAR_MAX__ > 0x10000
#define MARKUP_SIZEOFWCHAR 4
#else
#define MARKUP_SIZEOFWCHAR 2
#endif
#endif

void FileReader::parseBuffer(char * buffer, int length)
{
  //start by looking for a vrsn marker
  //header: seek around for a vrsn followed by a 32-bit size descriptor
  //read 32 bits at a time
  int cursor = 0;
  char vrsn[5] = "vrsn";
  cursor = this->searchForMarker(cursor, length, vrsn, buffer);
  int32_t size = this->getObjectSizeForMarker(cursor, length, buffer);
  cursor = cursor + 7; //advance cursor past marker and size
  wchar_t *version = this->getObjectForSizeAndCursor(size, cursor, buffer);
  wcout << version;
  delete[] version; //this pointer is dest from getObjectForSizeAndCursor
}

- --

wchar_t* FileReader::getObjectForSizeAndCursor(int32_t size, int cursor, char *buffer) {

  int wlen = size/2;
  uint32_t *dest = new uint32_t[wlen+1];
  unsigned char *ptr = (unsigned char *)(buffer + cursor);
  for(int i=0; i<wlen; i++) {
    #if MARKUP_SIZEOFWCHAR == 4 // sizeof(wchar_t) == 4
      char padding[2] = {'\0','\0'}; 
      dest[i] =  (padding[0] << 24) + (padding[1] << 16) + (ptr[0] << 8) + ptr[1];
    #else // sizeof(wchar_t) == 2
      dest[i] = (ptr[0] << 8) + ptr[1];
    #endif
      ptr += 2;
      cout << ptr;
  }
  return (wchar_t *)dest;
}

Do I have any scoping issues with the way I am using the padding? Will I leak padding when I delete[] dest in the calling function?

The distinction

#if MARKUP_SIZEOFWCHAR == 4 // sizeof(wchar_t) == 4
  char padding[2] = {'\0','\0'}; 
  dest[i] =  (padding[0] << 24) + (padding[1] << 16) + (ptr[0] << 8) + ptr[1];
#else // sizeof(wchar_t) == 2
  dest[i] = (ptr[0] << 8) + ptr[1];
#endif

is completely unnecessary. Both elements of padding are 0, so shifting them left keeps them 0, and adding them has no effect.

The compiler may or may not optimise away the allocation of the two-byte array padding in each loop iteration, but since it is an automatic array, it cannot leak in any way.

Since the types used in the loop are unsigned, simply using

dest[i] = (ptr[0] << 8) + ptr[1];

is perfectly safe. (The endianness must of course be correct.)
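To make the endianness remark concrete, here is a minimal sketch of the two possible byte orders. The helper names readBigEndian16 and readLittleEndian16 are made up for illustration and are not part of the original code. The expression from the question corresponds to the big-endian variant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not in the original code): combine two bytes that are
// stored high byte first (big-endian) into one 16-bit code unit.
inline std::uint16_t readBigEndian16(const unsigned char *p) {
    assert(p != nullptr);
    // Converting to unsigned before shifting keeps the shift well defined
    // even on a platform where int is only 16 bits wide.
    return static_cast<std::uint16_t>((static_cast<unsigned>(p[0]) << 8) | p[1]);
}

// The same two bytes interpreted low byte first (little-endian).
inline std::uint16_t readLittleEndian16(const unsigned char *p) {
    assert(p != nullptr);
    return static_cast<std::uint16_t>((static_cast<unsigned>(p[1]) << 8) | p[0]);
}
```

For the bytes 0x00 0x41, the big-endian reading gives 0x0041 (the letter A), while the little-endian reading gives 0x4100, so picking the wrong order silently produces different characters.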

For

return (wchar_t *)dest;

you should let the type of dest depend on the size of wchar_t: it should be uint16_t* if sizeof(wchar_t) == 2 (and CHAR_BIT == 8).
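One way to let the element type follow the size of wchar_t without a preprocessor macro is a type alias, sketched below. The names WideUnit and decodeUnits are hypothetical, and the sketch assumes wchar_t is either 2 or 4 bytes wide. It only makes the sizes match; it does not by itself make a pointer cast to wchar_t* well defined, so storing the values in wchar_t objects directly remains the simpler route.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical alias: an unsigned integer type with the same size as wchar_t.
// Assumes wchar_t is either 2 or 4 bytes wide.
using WideUnit = std::conditional<sizeof(wchar_t) == 2,
                                  std::uint16_t, std::uint32_t>::type;

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
static_assert(sizeof(WideUnit) == sizeof(wchar_t),
              "WideUnit must be exactly as wide as wchar_t");

// Decode `count` big-endian 16-bit values from `p` into WideUnit elements.
std::vector<WideUnit> decodeUnits(const unsigned char *p, std::size_t count) {
    assert(p != nullptr);
    std::vector<WideUnit> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = static_cast<WideUnit>((static_cast<unsigned>(p[0]) << 8) | p[1]);
        p += 2;  // advance to the next two-byte unit
    }
    return out;
}
```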

What you're trying to do isn't going to work. It's broken in several ways, but let's focus on the cast.

Your question doesn't match your code. Your code is using a uint32_t, while your question asks about a uint16_t. But that doesn't matter, because neither will work.

If you need to use wchar_t, then you should actually use wchar_t. If your goal is to take two consecutive bytes of a char* and copy them into the first two bytes of a wchar_t, then just do that.

Here is a much better version of your code, one that actually works (to the degree that it makes sense to copy two bytes from a char* and pretend that they are a wchar_t):

std::wstring FileReader::getObjectForSizeAndCursor(int32_t size, int cursor, char *buffer) {

  int wlen = size/2;
  std::wstring out(wlen, L'\0'); //wlen characters, zero-initialised
  unsigned char *ptr = (unsigned char *)(buffer + cursor);
  for(int i=0; i<wlen; i++) {
    out[i] = (ptr[0] << 8) + ptr[1]; //high byte first (big-endian)
    ptr += 2;
  }
  return out;
}

Plus, there's no chance of a memory leak, since we're using a proper RAII class such as std::wstring.
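As a rough, self-contained illustration of the RAII point, the same decoding can be written as a free function and called without any new or delete[]. The name decodeBigEndian16 and its parameter list are assumptions made for this sketch; they are not part of the FileReader class from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical free-function variant of getObjectForSizeAndCursor.
// Reads sizeInBytes / 2 big-endian 16-bit units starting at buffer + cursor.
std::wstring decodeBigEndian16(const char *buffer, std::size_t cursor,
                               std::size_t sizeInBytes) {
    assert(buffer != nullptr);
    const std::size_t wlen = sizeInBytes / 2;
    std::wstring out(wlen, L'\0');  // the string owns its storage
    const unsigned char *ptr =
        reinterpret_cast<const unsigned char *>(buffer) + cursor;
    for (std::size_t i = 0; i < wlen; ++i) {
        // High byte first; the value is stored directly into a wchar_t,
        // so no pointer cast between integer types and wchar_t is needed.
        out[i] = static_cast<wchar_t>((static_cast<unsigned>(ptr[0]) << 8) | ptr[1]);
        ptr += 2;
    }
    return out;
}
```

A caller would write something like std::wstring version = decodeBigEndian16(buffer, cursor, size); and let the string release its memory when it goes out of scope, so the delete[] in the calling function goes away.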
