Printing a Unicode character in MASM32 assembly
I am trying to print a Unicode character in MASM32 assembly, but I can't make it work. Here is a reproducible example:
.DATA
output db "%x Hello",10,0
unicode DWORD "∟", 0
.DATA?
.CODE
start:
push offset unicode
push offset output
call crt_printf
invoke ExitProcess, 0
end start
Current output:
40300a Hello
Expected output:
∟ Hello
You're using %x to print a number as hex digits, and the number you're passing is an address. So 0x40300a is the address of the unicode label in your .data section.
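As a quick illustration of that point, here is a minimal Python sketch. Python's % operator follows the same %x convention as C's printf; the name address_of_unicode is made up for this example, and the value is just the number from the output above (the real address depends on how the program is linked).

```python
# Illustration only: 0x40300a stands in for the address of the `unicode` label.
address_of_unicode = 0x40300A

# %x formats the integer value itself as lowercase hex digits;
# it never looks at the memory that the value points to.
line = "%x Hello" % address_of_unicode
print(line)
```

So the output is exactly what the format string asked for: the pointer value, not the character stored there.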
%s should probably work, if it and the output terminal support the same encoding that your editor and assembler used. It should just copy bytes from the address you pass until it reaches a 0 byte, so it should Just Work for UTF-8. But not for UTF-16, if there's a 0 byte somewhere in there. %ls could work if supported, treating the arg as a wchar_t* string.
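The difference between the two encodings can be sketched in Python. This only models the "copy bytes until a 0 byte" behaviour described above, not any locale conversion a real printf might do; the helper c_string_bytes is hypothetical, and U+221F is used because that is how the character appears in the question text here.

```python
def c_string_bytes(buffer: bytes) -> bytes:
    """Return what a C-style %s would copy: everything before the first 0 byte."""
    end = buffer.find(b"\x00")
    return buffer if end == -1 else buffer[:end]

text = "\u221f Hello"  # the character from the question, then ASCII text

utf8_buffer = text.encode("utf-8") + b"\x00"            # one-byte terminator
utf16_buffer = text.encode("utf-16-le") + b"\x00\x00"   # two-byte terminator

utf8_seen = c_string_bytes(utf8_buffer)
utf16_seen = c_string_bytes(utf16_buffer)

print(utf8_seen.hex(" "))   # every byte of the UTF-8 text survives
print(utf16_seen.hex(" "))  # cut short at the high byte of the space (20 00)
```

In UTF-16LE every ASCII character carries a 0 high byte, so a byte-oriented %s stops after the first such character.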
If you wanted to pass a word or dword as a wide character for %lc, you'd push dword ptr [unicode]. Maybe. In ISO C99 and C++, %lc takes a wint_t arg (an integer type) and prints it as if it were a wchar_t[2] string whose 2nd element is a terminating 0, which is how I read cppreference. But Microsoft has persistently declined to support standard C and C++ features, especially around printf, so who knows what crt_printf supports.
The desired character seems to be └ (U+2514, Box Drawings Light Up and Right), which is encoded as the bytes E2 94 94 in UTF-8, or 14 25 in UTF-16LE.
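Those byte values are easy to double-check. The following Python sketch (the helper describe is made up for this example) prints the code point, name and encodings of U+2514, and does the same for U+221F, which is what the character pasted in the question appears to be in this text; the two look alike but are different code points.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Collect the code point, Unicode name and encoded bytes of one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "utf8": ch.encode("utf-8").hex(" "),
        "utf16le": ch.encode("utf-16-le").hex(" "),
    }

box = describe("\u2514")    # the box-drawing character assumed in this answer
angle = describe("\u221f")  # the character as it appears in the question

for info in (box, angle):
    print(info)
```

Note that the UTF-16LE form is a byte sequence: the 16-bit value 2514h is stored low byte first, as 14h then 25h.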
I don't know how your text editor encoded the source line unicode DWORD "∟", 0, but the function crt_printf (which I could not reproduce here) does not seem to cope with it.
MS Windows works with UTF-16LE, so you'll need the WinAPI function WriteConsoleW, with lpBuffer defined as
unicode db 14h,25h
output dw " ","H","e","l","l","o",10,0
nNumberOfCharsToWrite EQU ($-unicode)/2 ; Number of 16bit characters.
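To see what that definition lays out in memory, here is a small Python sketch that rebuilds the same bytes and mirrors the ($-unicode)/2 arithmetic. It assumes, as the listing suggests, that the two data lines are contiguous and that each dw item is one little-endian 16-bit word; the Python variable names are only for illustration.

```python
import struct

# `unicode db 14h,25h`: two raw bytes, i.e. U+2514 stored low byte first.
unicode_bytes = bytes([0x14, 0x25])

# `output dw " ","H","e","l","l","o",10,0`: eight 16-bit little-endian words.
output_words = [ord(c) for c in " Hello"] + [10, 0]
output_bytes = struct.pack(f"<{len(output_words)}H", *output_words)

buffer = unicode_bytes + output_bytes

# Mirrors `($-unicode)/2`: total length in bytes divided by two.
n_chars = len(buffer) // 2

decoded = buffer.decode("utf-16-le")
print(n_chars, repr(decoded))
```

The count comes out as 9 because it includes the trailing 0 word. WriteConsoleW takes an explicit character count, so the terminator is not needed for the call; subtracting one from the count would leave it out.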
Questions related to Microsoft Macro Assembler also have a dedicated MASM Forum.
Printing Unicode strings might be easier in other assemblers, for instance with the macro StdOutput in €ASM:
rchg PROGRAM Format=PE,Entry=start
INCLUDE winapi.htm
[.data]
Buffer DB 14h,25h
DU " Hello",10,0
[.text]
start: StdOutput Buffer,Console=yes,Unicode=yes
WinAPI ExitProcess, 0
ENDPROGRAM
The source above assembles and works fine:
R:\>euroasm.exe rchg.asm
I0010 EuroAssembler version 20191104 started.
I0020 Current directory is "R:\".
I0180 Assembling source file "rchg.asm".
I0470 Assembling program "Rchg".
I0510 Assembling program pass 1.
I0510 Assembling program pass 2.
I0530 Assembling program pass 3 - final.
I0660 32bit FLAT PE file "Rchg.exe" created, size=16732.
I0650 Program "Rchg" assembled in 3 passes with errorlevel 0.
I0750 Source "rchg" (1229 lines) assembled in 2 passes with errorlevel 0.
I0860 Listing file "rchg.asm.lst" created, size=1753.
I0980 Memory allocation 960 KB. 28249 statements assembled in 1 s.
I0990 EuroAssembler terminated with errorlevel 0.
R:\>rchg.exe
└ Hello