
Output UTF-8 to console

I have this code to print a UTF-8 character to the Windows console:

SetConsoleOutputCP(65001);
freopen(NULL, "w,ccs=UTF-8", stdout);

wchar_t u16 = 0x00A9;
unsigned char utf8_b[] = {0xc2, 0xa9, 0x0}; //same as using WideCharToMultiByte for u16
printf("%s", utf8_b); //(1)
wprintf(L"%c", u16); //(2)

(1) produces the correct output, '©', while (2) outputs the replacement character U+FFFD. I tried redirecting the stdout of (2) to a file to check for an encoding conversion problem, but it produces the same byte sequence as utf8_b[].

Can anyone explain why that is? Is this a Windows problem?

btw, my console font is already set to Consolas.

Edit: I commented out (1) before using (2), so I don't think this is related to stream orientation. I've read somewhere that implementation bugs in Windows code page 65001 can affect C standard I/O. Can anyone confirm this?

Mixing wide- and byte-oriented output on the same FILE stream invokes undefined behavior. Try printf("%lc", u16); instead, or eliminate all byte-oriented output.
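One way to keep a stream strictly byte-oriented is to convert each code point to UTF-8 yourself and print the resulting bytes with printf("%s", ...), as (1) in the question does. The following is only a minimal sketch of that idea in portable C; utf8_encode is a hypothetical helper name, not part of the original post or of any Windows or C library API, and it has not been tested against the code page 65001 console behavior described in the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (an assumption for illustration, not from the post):
 * encodes one Unicode code point as UTF-8 into out, NUL-terminates it,
 * and returns the number of encoded bytes (excluding the NUL).
 * Returns 0 and writes an empty string for surrogates and for values
 * above U+10FFFF. out must have room for at least 5 bytes. */
size_t utf8_encode(uint32_t cp, unsigned char out[5])
{
    size_t n;

    if (cp < 0x80) {                        /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        n = 1;
    } else if (cp < 0x800) {                /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000) {              /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF) { /* surrogates are not valid scalar values */
            out[0] = 0;
            return 0;
        }
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 3;
    } else if (cp <= 0x10FFFF) {            /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 4;
    } else {                                /* outside the Unicode range */
        out[0] = 0;
        return 0;
    }

    out[n] = 0;                             /* terminate so the buffer works with %s */
    return n;
}
```

With this, unsigned char buf[5]; utf8_encode(0x00A9, buf); printf("%s", buf); writes the same bytes as utf8_b[] in the question, and no wide-character function ever touches stdout.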
