Convert a UTF-8 string to a UTF-16 string in C++

Question

I am working with VC 6.0, and my project is compiled as Unicode. I am using zlib 1.1.3 to inflate a file that contains my UTF-8 string. I receive it as ASCII, but I have a guarantee that it is all in English, so I can treat it as a UTF-8 string (can I?).

I used the function suggested on CodeProject, as follows:

WCHAR* SMUUTF8toUTF16(LPCSTR utf8, int* pLen)
{
    WCHAR *ptr = NULL;
    *pLen = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
    if (*pLen>1)
    {
        ptr = (WCHAR*)malloc(*pLen);

        if (ptr)
        {
            MultiByteToWideChar(CP_UTF8, 0, utf8, -1, ptr, *pLen);
        }
    }

    return ptr;
}

My code became unstable, with these errors:

1. Critical error detected c0000374
2. First-chance exception in w3wp.exe (NTDLL.DLL): 0xC0000005: Access Violation.

I suspect a memory leak or a bad pointer being referenced, because I get many of the errors above whenever I use this function. My tests also indicate that when I don't use it, the heap stays well formed and uncorrupted.

Can you please suggest a better implementation?

Answer 1

MultiByteToWideChar returns the number of 16-bit characters (WCHARs) in the output, not the number of bytes, but malloc requires a size in bytes. You must multiply the character count by the size of a character in bytes; otherwise you allocate only half the bytes you need, and the second MultiByteToWideChar call writes past the end of the buffer. That overrun is what corrupts the heap (c0000374 is the heap-corruption status code).

ptr = (WCHAR *)malloc(sizeof(WCHAR) * *pLen);
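
To make the count-versus-bytes point concrete without depending on the Win32 API, here is a minimal portable sketch of the same two-pass pattern: count the 16-bit units first, then allocate that count times the unit size. The names Utf16Unit, DecodeUtf8 and Utf8ToUtf16Sketch are hypothetical and not from the question or from any library, and the decoder is hand-written with only basic validation (it does not reject overlong encodings or encoded surrogates). It is an illustration of the sizing rule, not a replacement for MultiByteToWideChar.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

typedef unsigned short Utf16Unit;  // hypothetical stand-in for a 16-bit WCHAR

// Decodes UTF-8 from src. If dst is non-null, writes UTF-16 code units to it.
// Returns the number of code units needed, INCLUDING the terminating zero
// (the same convention as *pLen in the question), or -1 on malformed input.
static int DecodeUtf8(const char* src, Utf16Unit* dst)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(src);
    int count = 0;
    while (*p)
    {
        unsigned long cp;
        int extra;  // number of continuation bytes that follow the lead byte
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return -1;  // invalid lead byte
        ++p;
        for (int i = 0; i < extra; ++i, ++p)
        {
            if ((*p & 0xC0) != 0x80) return -1;  // also catches a premature NUL
            cp = (cp << 6) | (*p & 0x3F);
        }
        if (cp > 0x10FFFF) return -1;  // beyond the Unicode range
        if (cp >= 0x10000)
        {
            // Code points above the BMP take two units (a surrogate pair).
            cp -= 0x10000;
            if (dst)
            {
                dst[count]     = static_cast<Utf16Unit>(0xD800 + (cp >> 10));
                dst[count + 1] = static_cast<Utf16Unit>(0xDC00 + (cp & 0x3FF));
            }
            count += 2;
        }
        else
        {
            if (dst) dst[count] = static_cast<Utf16Unit>(cp);
            count += 1;
        }
    }
    if (dst) dst[count] = 0;
    return count + 1;
}

// Returns a malloc'ed, zero-terminated UTF-16 buffer, or NULL on failure.
// *pLen receives the length in code units, not bytes. Caller must free().
Utf16Unit* Utf8ToUtf16Sketch(const char* utf8, int* pLen)
{
    assert(utf8 != NULL && pLen != NULL);
    int units = DecodeUtf8(utf8, NULL);  // pass 1: count only
    if (units < 0) { *pLen = 0; return NULL; }
    *pLen = units;
    // malloc takes BYTES, so scale the unit count by the unit size.
    Utf16Unit* out = static_cast<Utf16Unit*>(
        std::malloc(sizeof(Utf16Unit) * static_cast<std::size_t>(units)));
    if (out) DecodeUtf8(utf8, out);      // pass 2: fill the buffer
    return out;
}
```

Unlike the function in the question, this sketch returns an empty, zero-terminated buffer for an empty input instead of NULL; whether that suits the calling code is a design choice. In the original Win32 function, the only change required is the malloc line shown above.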

solution1
6 ACCEPTED 2013-02-20 07:06:30
