How to write a Unicode string to a file with a UTF-8 BOM in C++?
I can use ofstream to write a file that starts with a UTF-8 BOM. I can also write a Unicode string to a file using wofstream imbued with a utf8_locale (built on codecvt_utf8). However, I cannot figure out how to write a Unicode string to a file that is encoded as UTF-8 with a BOM.
A BOM is just an optional sequence of bytes at the very beginning of a file that indicates its encoding. It has nothing to do directly with std::fstream, since fstream is just a file stream for reading and writing arbitrary bytes/characters.
You just need to write the BOM manually before you go on to write your UTF-8 encoded string.
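const unsigned char utf8BOM[] = {0xEF, 0xBB, 0xBF};
// write() expects a const char*, so the byte array has to be cast
fileStream.write(reinterpret_cast<const char*>(utf8BOM), sizeof(utf8BOM));
// write the rest of the UTF-8 encoded string...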
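For completeness, the snippet above could be wrapped into a small self-contained helper. This is only a sketch: the function name write_utf8_with_bom is made up for illustration, and the text passed in is assumed to be UTF-8 encoded already.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper (the name is illustrative, not from the answer):
// writes the 3-byte UTF-8 BOM followed by `utf8_text`, which is assumed
// to be UTF-8 encoded already. Returns true if every write succeeded.
bool write_utf8_with_bom(const std::string& path, const std::string& utf8_text)
{
    assert(!path.empty());  // an empty path can never be opened

    static const char bom[] = {'\xEF', '\xBB', '\xBF'};

    // Binary mode so the bytes reach the file unchanged
    // (no newline translation on Windows).
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;

    out.write(bom, sizeof(bom));
    out.write(utf8_text.data(), static_cast<std::streamsize>(utf8_text.size()));
    out.close();
    return !out.fail();
}
```

Calling write_utf8_with_bom("out.txt", text) should then produce a file whose first three bytes are EF BB BF, followed by the bytes of text exactly as given.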
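The example below works fine in VS 2015 and in recent gcc compilers (up to C++17; since C++20 a u8 literal has type const char8_t[], so the std::string initialization needs an option such as /Zc:char8_t- or -fno-char8_t):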
#include <iostream>
#include <string>
#include <fstream>
#include <codecvt>
int main()
{
    // The u8 prefix makes the literal UTF-8 encoded
    std::string utf8 = u8"日本医療政策機構\nPhở\n";
    std::ofstream f("c:\\test\\ut8.txt");
    // Write the UTF-8 BOM first, then the text itself
    unsigned char bom[] = { 0xEF,0xBB,0xBF };
    f.write((char*)bom, sizeof(bom));
    f << utf8;
    return 0;
}
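The question mentions wofstream with codecvt_utf8. One possible way to get a BOM on that route is to write the character U+FEFF first, because the facet encodes it as EF BB BF. The sketch below assumes this approach; the helper name write_wide_with_bom is hypothetical. Note that <codecvt> is deprecated since C++17, so compilers may emit warnings, and that with a 16-bit wchar_t (as on Windows) codecvt_utf8 does not handle characters outside the Basic Multilingual Plane.

```cpp
#include <cassert>
#include <codecvt>
#include <fstream>
#include <locale>
#include <string>

// Hypothetical helper (illustrative name): writes a wide string as UTF-8,
// preceded by a BOM, through a wofstream imbued with codecvt_utf8.
bool write_wide_with_bom(const std::string& path, const std::wstring& text)
{
    assert(!path.empty());  // an empty path can never be opened

    std::wofstream out;
    // Imbue before opening, so the facet is in place ahead of any I/O.
    // The locale takes ownership of the facet allocated here.
    out.imbue(std::locale(out.getloc(), new std::codecvt_utf8<wchar_t>));
    out.open(path, std::ios::binary);
    if (!out) return false;

    // U+FEFF is converted by the facet to the bytes EF BB BF.
    out.put(static_cast<wchar_t>(0xFEFF));
    out << text;
    out.close();
    return !out.fail();
}
```

This keeps the string as a std::wstring all the way to the stream, at the cost of relying on a deprecated header.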
In older versions of Visual Studio you have to declare a UTF-16 string (with the L prefix) and then convert it from UTF-16 to UTF-8:
#include <iostream>
#include <string>
#include <fstream>
#include <Windows.h>
// Convert a UTF-16 wide string to a UTF-8 string
std::string get_utf8(const std::wstring &wstr)
{
    if (wstr.empty()) return std::string();
    // First call: ask for the required buffer size in bytes
    int sz = WideCharToMultiByte(CP_UTF8, 0, &wstr[0], (int)wstr.size(), 0, 0, 0, 0);
    std::string res(sz, 0);
    // Second call: perform the conversion into the buffer
    WideCharToMultiByte(CP_UTF8, 0, &wstr[0], (int)wstr.size(), &res[0], sz, 0, 0);
    return res;
}
// Convert a UTF-8 string to a UTF-16 wide string
std::wstring get_utf16(const std::string &str)
{
    if (str.empty()) return std::wstring();
    // First call: ask for the required buffer size in wide characters
    int sz = MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), 0, 0);
    std::wstring res(sz, 0);
    // Second call: perform the conversion into the buffer
    MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), &res[0], sz);
    return res;
}
int main()
{
    std::string utf8 = get_utf8(L"日本医療政策機構\nPhở\n");
    std::ofstream f("c:\\test\\ut8.txt");
    // Write the UTF-8 BOM first, then the converted text
    unsigned char bom[] = { 0xEF,0xBB,0xBF };
    f.write((char*)bom, sizeof(bom));
    f << utf8;
    return 0;
}
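To check the result of either example, the file can be read back in binary mode and its first three bytes compared against the BOM. The following is a minimal sketch; has_utf8_bom is a hypothetical helper name, not something from the answers above.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper (illustrative name): returns true if the file at
// `path` starts with the UTF-8 BOM bytes EF BB BF. A missing file or a
// file shorter than three bytes yields false.
bool has_utf8_bom(const std::string& path)
{
    assert(!path.empty());  // an empty path can never be opened

    // Binary mode so the leading bytes are read exactly as stored.
    std::ifstream in(path, std::ios::binary);
    char head[3] = {};
    in.read(head, sizeof(head));
    if (in.gcount() != 3) return false;

    return head[0] == '\xEF' && head[1] == '\xBB' && head[2] == '\xBF';
}
```

For instance, has_utf8_bom("c:\\test\\ut8.txt") would be expected to return true after running either program above, assuming the file was written successfully.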