How to output and input UTF-8 or UTF-16 Unicode text in Windows using C++?
This is my program:
#include <cstdio>   // fflush
#include <iostream>
#include <string>
#include <locale>
#include <clocale>
#include <codecvt>
#include <io.h>     // _setmode, _fileno (Windows only)
#include <fcntl.h>  // _O_U16TEXT (Windows only)

int main()
{
    fflush(stdout);
    _setmode(_fileno(stdout), _O_U16TEXT);
    std::ios_base::sync_with_stdio(false);
    std::setlocale(LC_ALL, "el_GR.utf8");
    std::locale loc{ "el_GR.utf8" };
    std::locale::global(loc); // apparently this does not set the global locale
    //std::wcout.imbue(loc);
    //std::wcin.imbue(loc);
    std::wstring yes;
    std::wcout << L"It's all good γεια ναί" << L'\n';
    std::wcin >> yes;
    std::wcout << yes << L'\n';
    return 0;
}
Let's say I want to support Greek encodings (for both input and output). This program works perfectly on Linux for various output and input languages if I set the appropriate encoding and, of course, remove the fflush(stdout) and _setmode() calls.
On Windows, this program outputs Greek (and English) correctly when I use std::locale::global(loc), but it does not accept Greek input that I type from the keyboard. The std::wcout << yes statement outputs gibberish or question marks if I type Greek. Apparently ::global isn't really global on Windows?
So I tried the .imbue() method on wcout and wcin (which also works on Linux); these are the calls you see commented out here. When I enable either of these two statements, the program compiles properly and presents me with a prompt, but when I type anything and press Enter it simply exits with no errors or other output.
I have tried a few Windows-specific commands, but then I got confused too. It is not clear to me what I should try on Windows, and when.
So the question is: how can I both input and output Greek text properly on Windows, as in the program above? I use MSVS 2017 with the latest updates. Thanks in advance.
As @Eryk Sun mentioned in the comments, I also had to put stdin into UTF-16 mode:
_setmode(_fileno(stdin), _O_U16TEXT);
Windows UTF-8 console input is still (as of 2019) somewhat broken.
EDIT:
The above modification wasn't enough. I now do the following whenever I want to support the UTF-8 code page and Unicode input/output on Windows (read the code comments for more info).
#include <cstdio>    // fflush
#include <cwchar>    // wcscpy_s (MSVC)
#include <iostream>
#if defined _WIN32
#  include <windows.h>  // console API: SetConsoleCP, SetCurrentConsoleFontEx, ...
#  include <io.h>       // _setmode, _fileno
#  include <fcntl.h>    // _O_U16TEXT
#endif

int main()
{
    fflush( stdout );
#if defined _MSC_VER
#  pragma region WIN_UNICODE_SUPPORT_MAIN
#endif
#if defined _WIN32
    // change code page to UTF-8 UNICODE
    if ( !IsValidCodePage( CP_UTF8 ) )
    {
        return static_cast<int>( GetLastError() );
    }
    if ( !SetConsoleCP( CP_UTF8 ) )
    {
        return static_cast<int>( GetLastError() );
    }
    if ( !SetConsoleOutputCP( CP_UTF8 ) )
    {
        return static_cast<int>( GetLastError() );
    }
    // change console font - post Windows Vista only
    HANDLE hStdOut = GetStdHandle( STD_OUTPUT_HANDLE );
    CONSOLE_FONT_INFOEX cfie;
    const auto sz = sizeof( CONSOLE_FONT_INFOEX );
    ZeroMemory( &cfie, sz );
    cfie.cbSize = sz;
    cfie.dwFontSize.Y = 14;
    wcscpy_s( cfie.FaceName,
        L"Lucida Console" );
    SetCurrentConsoleFontEx( hStdOut,
        false,
        &cfie );
    // change file stream translation mode
    _setmode( _fileno( stdout ), _O_U16TEXT );
    _setmode( _fileno( stderr ), _O_U16TEXT );
    _setmode( _fileno( stdin ), _O_U16TEXT );
#endif
#if defined _MSC_VER
#  pragma endregion
#endif
    std::ios_base::sync_with_stdio( false );
    // program:...
    return 0;
}
Guidelines:
- Use string and 8-bit chars.
- Use wide chars (wchar_t, wstring etc.) to interact with the Windows console.
- Use chars/string at the application boundary (e.g. writing to files, interacting with other OSs etc.).
- Convert string | char to wstring | wchar_t for interacting with the Windows APIs.
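The last guideline, converting a UTF-8 string to a wstring before handing it to a wide API, can be sketched as below. On Windows this conversion is normally done with MultiByteToWideChar and CP_UTF8. The version here is a hand-written decoder, so it compiles on any platform and keeps the logic visible. The function name utf8_to_wide is my own (hypothetical), and the error handling (throwing std::invalid_argument) is just one possible choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a UTF-8 encoded std::string into a std::wstring.
// Where wchar_t is 16 bits (Windows) the result is UTF-16 with surrogate
// pairs; where it is 32 bits (most other platforms) it is UTF-32.
std::wstring utf8_to_wide(const std::string& utf8)
{
    std::wstring out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const auto lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;      // decoded code point
        std::size_t extra = 0;     // continuation bytes expected
        std::uint32_t min_cp = 0;  // smallest value allowed for this length
        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1Fu; extra = 1; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0Fu; extra = 2; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07u; extra = 3; min_cp = 0x10000; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        // The lead byte plus all continuation bytes must be present.
        if (utf8.size() - i <= extra)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const auto b = static_cast<unsigned char>(utf8[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3Fu);
        }
        // Reject overlong encodings, surrogates and values past U+10FFFF.
        const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
        if (cp < min_cp || cp > 0x10FFFF || surrogate)
            throw std::invalid_argument("overlong or out-of-range UTF-8 sequence");
        i += extra + 1;

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // Encode as a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    return out;
}
```

With this in place, the program can keep its text in UTF-8 strings and widen only at the point where it writes to wcout or calls a W-suffixed Windows function, for example std::wcout << utf8_to_wide(greeting).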
I have written a small C++ library that allows UTF-8 input as well as output on the Windows console. You can use cin >>, getline(), scanf(), etc. with Unicode UTF-8.
https://github.com/Jalopy-Tech/WUTF8Console