conversion from unsigned char* to const wchar_t*

Question

I am using the following code to convert a string from unsigned char* to const wchar_t*. The problem is that only the first part of the string is converted correctly, while the rest comes out garbled.

CODE

unsigned char* temp = fileUtils->getFileData("levels.json", "r", &size);
const char* temp1 = reinterpret_cast<const char*>(temp);
size_t len = mbstowcs(nullptr, &temp1[0], 0);
if (len == -1) {

} else {
    wchar_t* levelData = new wchar_t();
    mbstowcs(&levelData[0], &temp1[0], size*10);
}

OUTPUT

temp1 = "[{"scaleFactor": 1}][{"scaleFactor": 2}][{"scaleFactor": 3}][{"scaleFactor": 4}][{"scaleFactor": 5}][{"scaleFactor": 6}][{"scaleFactor": 7}][{"scaleFactor": 8}][{"scaleFactor": 9}][{"scaleFactor": 10}]"

levelData = "[{"scaleFactor": 1}][{"scaleFactor": 2}][{"scaleFactor": 3}][{"scaleFactor": 4}][{"scaleFactor": 5}][{"scaleFactor": 6}][{"scaleFactor": 7}][{"s慣敬慆瑣牯㨢㠠嵽筛猢慣敬慆瑣牯㨢㤠嵽筛猢慣敬慆瑣牯㨢ㄠ細ﵝ﷽꯽ꮫꮫꮫﺫﻮ"

Answer 1

wchar_t* levelData = new wchar_t();
mbstowcs(&levelData[0], &temp1[0], size*10);

That allocates enough memory for exactly ONE character. That's not enough to store your string, so mbstowcs writes past the end of the allocation. This is undefined behavior, and garbled output is one possible result.

Also, where'd that 10 come from?
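
To make the difference concrete: new wchar_t() creates a single object, while new wchar_t[count]() creates an array of count elements. A minimal sketch; make_wide_buffer is a hypothetical helper name, and std::unique_ptr is an addition (not in the question) so that the array is released automatically:

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helper (not from the question): allocates an array of
// `count` wide characters, all initialized to L'\0'.
std::unique_ptr<wchar_t[]> make_wide_buffer(std::size_t count)
{
    // new wchar_t()        -> one single wchar_t
    // new wchar_t[count]() -> `count` wchar_t elements, value-initialized
    return std::unique_ptr<wchar_t[]>(new wchar_t[count]());
}
```

The caller still has to pick count large enough for the converted string plus its terminator.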

Answer 2

You don't need to hard code the buffer size if you're going to allocate it dynamically (with new).

wchar_t* levelData = new wchar_t[len+1];
mbstowcs(&levelData[0], &temp1[0], len+1); // len+1 so the terminating L'\0' is written as well
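
A fuller sketch of this measure-then-allocate approach, assuming the source is null-terminated. to_wide is a hypothetical helper name, not something from the question, and it uses std::wstring instead of new so nothing has to be freed by hand:

```cpp
#include <cstdlib>   // std::mbstowcs
#include <string>

// Hypothetical helper (not from the question): converts a null-terminated
// multibyte string to a std::wstring using the current C locale.
// Returns false if the input contains an invalid multibyte sequence.
bool to_wide(const char* src, std::wstring& out)
{
    // Pass 1: with a null destination, mbstowcs only counts the wide
    // characters (terminator excluded). This is a common (POSIX) extension
    // and is the same call the question already uses.
    const std::size_t len = std::mbstowcs(nullptr, src, 0);
    if (len == static_cast<std::size_t>(-1))
        return false;

    // Pass 2: convert into a buffer that has room for the terminator.
    std::wstring buf(len + 1, L'\0');
    std::mbstowcs(&buf[0], src, len + 1);
    buf.resize(len);   // drop the slot that held the terminator
    out = buf;
    return true;
}
```

Call it as to_wide(temp1, result) and check the return value before using result.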

Answer 3

Thanks to @BenVoigt, I found the mistake. I changed the code to this:

wchar_t levelData[200];
mbstowcs(&levelData[0], &temp1[0], size);
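
Note that the third argument of mbstowcs limits how many wide characters are written to the destination, so with a fixed-size array it is safer to pass the array's capacity than the size of the source. A minimal sketch under that assumption; to_wide_bounded is a hypothetical helper name, and it silently truncates input that does not fit:

```cpp
#include <cstddef>
#include <cstdlib>   // std::mbstowcs

// Hypothetical helper (not from the question): converts src into a
// caller-provided buffer of `capacity` wide characters and always
// null-terminates it. Returns the number of wide characters written
// (terminator excluded), or (size_t)-1 on invalid input or zero capacity.
std::size_t to_wide_bounded(wchar_t* dst, std::size_t capacity, const char* src)
{
    if (capacity == 0)
        return static_cast<std::size_t>(-1);

    // Limit the conversion to capacity - 1 so one slot stays free
    // for the terminator.
    const std::size_t n = std::mbstowcs(dst, src, capacity - 1);
    if (n == static_cast<std::size_t>(-1))
        return n;

    dst[n] = L'\0';   // n <= capacity - 1, so this index is in bounds
    return n;
}
```

With wchar_t levelData[200]; the call would be to_wide_bounded(levelData, 200, temp1), again assuming temp1 is null-terminated.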

Answer 4

unsigned char* temp = fileUtils->getFileData("levels.json", "r", &size);
const char* temp1 = reinterpret_cast<const char*>(temp);

// At most one wide character per input byte, plus one slot for the terminator.
wchar_t* levelData = new wchar_t[size + 1];
wchar_t* position = levelData;
int last_char_size = 0;

mbtowc(NULL, 0, 0); // reset the conversion state
while (size > 0)
{
    last_char_size = mbtowc(position, temp1, size);
    if (last_char_size <= 0) break; // 0: null character, -1: invalid sequence
    temp1 += last_char_size;
    size -= last_char_size;
    position++;
}
*position = L'\0'; // the file data is not null-terminated, so terminate explicitly

if (last_char_size == -1)
{
    std::cout << "Invalid encoding" << std::endl;
}

delete[] temp; // * probably

Whether the marked line (*) is needed depends on whether fileUtils->getFileData allocates a memory block for temp and leaves it to the caller to release, rather than having the fileUtils object manage it itself. That is the most probable case, but you should check the documentation.

size elements are always enough to hold the converted characters in the levelData array (a terminator, if you add one, needs one more). Within [] you specify the number of elements of the array, not the number of bytes (i.e. chars). In this case the elements are wide characters, and there cannot be more of them than the number of chars that were read.

Another thing you should be aware of: fileUtils->getFileData probably reads binary data, so the text in temp is not followed by a 0. Unless you add a terminator yourself, string functions called on the data later, such as wcstok, will read past the end of the buffer.
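
One way to get a guaranteed terminator on the source side is to copy the raw bytes into a std::string before converting. A minimal sketch; make_terminated is a hypothetical helper name, and the data pointer and size are assumed to be the values returned by getFileData:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the question): copies `size` raw bytes into
// a std::string. The result's c_str() is always followed by a '\0', even
// though the original buffer is not.
std::string make_terminated(const unsigned char* data, std::size_t size)
{
    return std::string(reinterpret_cast<const char*>(data), size);
}
```

The returned string's c_str() can then be passed to functions that expect a null-terminated source. If the file itself contains a 0 byte, those functions will stop there.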

One more thing, in case you are not familiar with the construction

    function_on_arrays( target,  source,  size )

Remember that a program in C/C++ does not know the sizes of target and source, and you probably do not want the function to touch anything beyond them. That is mainly what size is for: it is your manual way of saying how many elements the function may process, so that it does not go beyond the arrays' data.
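
As an illustration of that convention, here is a minimal sketch of a bounded copy. copy_bounded is a hypothetical helper name; the point is only that the caller supplies both sizes and the function never goes past the smaller one:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the question): copies at most
// `target_capacity` bytes from source to target and returns how many
// bytes were actually copied.
std::size_t copy_bounded(char* target, std::size_t target_capacity,
                         const char* source, std::size_t source_length)
{
    // Never read past the source and never write past the target.
    const std::size_t n = std::min(target_capacity, source_length);
    std::memcpy(target, source, n);
    return n;
}
```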

Edit: The earlier solution was wrong, because it mistakenly treated the last parameter of mbstowcs as the number of characters in the source.

4 answers

solution1
2 2013-05-17 12:27:40

solution2
1 ACCEPTED 2013-05-17 12:41:37

solution3
0 2013-05-17 12:38:43

solution4
0 2013-05-17 20:31:33
