C++ UTF-8/ASCII to UTF-16 in MFC

How can I convert a (text) file from UTF-8/ASCII to UTF-16 before displaying it in an MFC program? MFC uses 16 bits per character, whereas most (text) files on Windows use UTF-8 or ASCII.

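The simple answer is the Win32 function MultiByteToWideChar, with WideCharToMultiByte for the reverse conversion. There are also the ATL helper classes CA2W and CW2A, which are a little simpler to use; pass CP_UTF8 as the code page argument, because they default to the ANSI code page.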
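For reference, here is a minimal sketch of the usual two-call pattern with MultiByteToWideChar: the first call asks for the required length, the second fills a std::wstring of that size. The helper name Utf8ToWide is my own (it is not part of Win32 or MFC), and the sketch assumes the input is smaller than 2 GB. Since ASCII is a subset of UTF-8, the same helper covers plain ASCII text.

```cpp
#ifdef _WIN32  // Win32-only sketch; the guard just keeps it harmless elsewhere
#include <windows.h>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Win32/MFC function): UTF-8 bytes -> UTF-16 wstring.
std::wstring Utf8ToWide(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
    const int srcLen = static_cast<int>(utf8.size());  // assumption: fits in int

    // First call: no output buffer, so the return value is the required
    // length in wchar_t units (no terminator, since srcLen is explicit).
    const int dstLen = ::MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                             utf8.data(), srcLen, nullptr, 0);
    if (dstLen <= 0) throw std::runtime_error("invalid UTF-8 input");

    // Second call: convert straight into the string's own storage.
    std::wstring wide(static_cast<std::size_t>(dstLen), L'\0');
    const int written = ::MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                              utf8.data(), srcLen,
                                              &wide[0], dstLen);
    if (written != dstLen) throw std::runtime_error("UTF-8 conversion failed");
    return wide;
}
#endif
```

In an MFC Unicode build the result can be handed to a CString, for example CString text(Utf8ToWide(bytes).c_str());. Letting std::wstring own the buffer avoids most of the manual memory handling.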

However, I would strongly recommend against using these functions directly. You have the pain of managing character buffers manually, with the risk of introducing memory corruption or security holes.

It's much better to use a library based on std::string and/or iterators, for example utf8cpp. It has the advantage of being small, header-only and cross-platform.
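To give an idea of what such a std::string-based conversion does internally, here is a small hand-written sketch in portable C++. It is not utf8cpp's API; the function name utf8_to_utf16 and the choice to throw std::invalid_argument on malformed input are my own assumptions for illustration. In real code, prefer a tested library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Illustrative decoder (not utf8cpp): UTF-8 bytes -> UTF-16 code units.
// Throws std::invalid_argument on malformed input.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point allowed for each sequence length (rejects overlongs).
    static const char32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::u16string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t len = 0;
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        // Fold in the continuation bytes, 6 payload bits each.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));  // one UTF-16 unit
        } else {
            cp -= 0x10000;                             // surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

On Windows, wchar_t is 16 bits wide, so the resulting code units can be copied into a std::wstring or CString unchanged.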

Actually, you can do it very simply, using the CStdioFile and CString classes provided by MFC. The MFC library is a very powerful and comprehensive one (albeit with some major oddities, and even bugs); if you're already using it, then use it to its fullest extent:

...
const wchar_t* inpPath = L"<path>\\InpFile.txt"; // These values are given just...
const wchar_t* outPath = L"<path>\\outFile.txt"; // ... for illustrative purposes!
CStdioFile inpFile(inpPath, CFile::modeRead | CFile::typeText);
CStdioFile outFile(outPath, CFile::modeWrite | CFile::modeCreate | CFile::typeText
    | CFile::typeUnicode); // Note the Unicode flag - will create UTF-16LE file!
CString textBuff;
while (inpFile.ReadString(textBuff)) {
    outFile.WriteString(textBuff);
    outFile.WriteString(L"\n");
}
inpFile.Close();
outFile.Close();
...

Of course, you will need to change the code (a bit) if you want the input and output files to have the same path, but that wouldn't mean changing the basic premise!

With this approach, you don't need any explicit library calls to convert character strings - just let MFC do it for you when it reads into and writes from its (Unicode) CString object!

Note: Compiled and tested with MSVC (VS-2019), 64-bit, in Unicode mode.

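EDIT: Maybe I misunderstood your question. If you don't want to actually convert the file, but just display the contents, then take away all references in my code to outFile and just do stuff with each textBuff string you read. The conversion to UTF-16LE happens as each line is read into the CString. One caveat: how the narrow input bytes are decoded depends on the open flags and the current CRT locale, so check the result if your UTF-8 input contains non-ASCII characters.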
