Printing a Unicode character in MASM32 assembly

I am trying to print a Unicode character in MASM32 assembly, but I can't make it work. Here is a reproducible example:

.DATA
output      db    "%x Hello",10,0
unicode     DWORD "∟", 0

.DATA?

.CODE
start:
        push offset unicode
        push offset output
        call crt_printf
        
        invoke  ExitProcess, 0

end start

Current output: 40300a Hello

Expected output: ∟ Hello

You're using %x to print a number as hex digits, and the number you're passing is an address. So 0x40300a is the address of the unicode label in your .data section.

%s should probably work, if it and the output terminal support the same encoding that your editor and assembler used. It should just copy bytes from the address you pass until reaching a 0, so it should Just Work for UTF-8, but not for UTF-16 if there's a 0 byte somewhere in there. %ls could work if supported, treating the arg as a wchar_t* string.
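The byte-copying behaviour of %s described above can be illustrated with a small C sketch. The names box_utf8, box_utf16le and format_with_s are made up for this illustration, and it assumes the character in question is U+2514; it uses snprintf from the standard C library rather than MASM32's crt_printf, whose behaviour is not verified here.

```c
#include <stdio.h>
#include <stddef.h>

/* U+2514 followed by a terminator, in two encodings (assumed character). */
static const char box_utf8[]    = { '\xE2', '\x94', '\x94', 0 };
static const char box_utf16le[] = { 0x14, 0x25, 0x00, 0x00 };

/* Formats "<text> Hello" into dst. %s copies bytes from text until it
   meets the first 0 byte, without knowing anything about the encoding. */
int format_with_s(char *dst, size_t n, const char *text)
{
    return snprintf(dst, n, "%s Hello", text);
}
```

With box_utf8 the three bytes E2 94 94 are copied intact, so a UTF-8 terminal can render them. With box_utf16le only the bytes 14 25 are copied here because the terminator follows directly; any UTF-16 text containing ASCII (such as " Hello") would be cut off after its first character, since every ASCII character is followed by a 0 byte.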

If you wanted to pass a word or dword as a wide character for %lc, you'd push dword ptr [unicode]. Maybe. In ISO C99 and C++, %lc takes a wint_t arg and prints it as if it were a wchar_t[2] string (I think with the 2nd element being a terminating 0, if that's what cppreference means). But Microsoft has persistently declined to support standard C and C++ features, especially around printf, so who knows what crt_printf supports.
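As a rough illustration of the standard %lc behaviour mentioned above, here is a minimal C sketch. The helper name format_with_lc is hypothetical, and this shows ISO C behaviour only; whether Microsoft's CRT (and therefore crt_printf) behaves the same way is an open question in this thread.

```c
#include <stdio.h>
#include <stddef.h>
#include <wchar.h>

/* Formats "<wc> Hello" into dst. With %lc the wide character is passed by
   value and converted to multibyte text according to the current locale. */
int format_with_lc(char *dst, size_t n, wint_t wc)
{
    return snprintf(dst, n, "%lc Hello", wc);
}
```

For a non-ASCII character such as 0x2514, the conversion depends on the locale, so a program would normally call setlocale(LC_ALL, "") first; in the default "C" locale the conversion of such a character is expected to fail and snprintf to return a negative value.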

The desired character seems to be └ (U+2514, Box Drawings Light Up and Right),
which is encoded as the bytes E2 94 94 in UTF-8 and as the bytes 14 25 in UTF-16LE.
I don't know how your text editor encoded the source line unicode DWORD "∟", 0, but the function crt_printf (which I could not reproduce) does not seem to cope with it.
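The byte values quoted above for U+2514 can be checked with a short C sketch of the two encodings. The function names utf8_encode and utf16le_encode are invented for this example, and the code does no validation (it does not reject surrogate code points or values above U+10FFFF).

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes one code point as UTF-8; returns the number of bytes written. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Encodes one code point as UTF-16LE (low byte first); returns 2 or 4. */
size_t utf16le_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x10000) {
        out[0] = (unsigned char)(cp & 0xFF);
        out[1] = (unsigned char)(cp >> 8);
        return 2;
    }
    cp -= 0x10000;                         /* split into a surrogate pair */
    uint32_t hi = 0xD800 + (cp >> 10);
    uint32_t lo = 0xDC00 + (cp & 0x3FF);
    out[0] = (unsigned char)(hi & 0xFF);
    out[1] = (unsigned char)(hi >> 8);
    out[2] = (unsigned char)(lo & 0xFF);
    out[3] = (unsigned char)(lo >> 8);
    return 4;
}
```

Calling utf8_encode(0x2514, buf) fills buf with E2 94 94, and utf16le_encode(0x2514, buf) fills it with 14 25, matching the values in the answer.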

MS Windows works with UTF-16LE; you'll need the WinAPI function WriteConsoleW and define the lpBuffer as

    unicode db 14h,25h                        ; U+2514 as UTF-16LE bytes
    output  dw " ","H","e","l","l","o",10,0
    nNumberOfCharsToWrite EQU ($-unicode)/2   ; Number of 16-bit characters.
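For comparison, the same buffer layout and WriteConsoleW call might look like this in C. This is only a sketch: the names message, MESSAGE_UNITS and write_message are invented here, and the console call is compiled only on Windows, so on other systems the function simply reports failure.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* "└ Hello\n" as 16-bit code units; stored little-endian on x86. */
static const uint16_t message[] = {
    0x2514, ' ', 'H', 'e', 'l', 'l', 'o', '\n'
};

/* Number of 16-bit characters, the value WriteConsoleW expects. */
enum { MESSAGE_UNITS = sizeof message / sizeof message[0] };

/* Returns the number of characters written, or -1 on failure. */
int write_message(void)
{
#ifdef _WIN32
    DWORD written = 0;
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    if (!WriteConsoleW(out, message, MESSAGE_UNITS, &written, NULL))
        return -1;
    return (int)written;
#else
    return -1;   /* WriteConsoleW exists only on Windows */
#endif
}
```

In this sketch the buffer carries no terminating 0, because WriteConsoleW takes an explicit character count and would otherwise write the terminator to the console as well.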

Questions related to Microsoft Macro Assembler have a dedicated MASM Forum.

Printing Unicode strings might be easier in other assemblers, for instance with the macro StdOutput in €ASM:

rchg PROGRAM Format=PE,Entry=start
       INCLUDE winapi.htm
[.data]       
Buffer DB 14h,25h
       DU " Hello",10,0
[.text]       
start: StdOutput Buffer,Console=yes,Unicode=yes
       WinAPI ExitProcess, 0
     ENDPROGRAM

The source above assembles and works fine:

R:\>euroasm.exe rchg.asm
I0010 EuroAssembler version 20191104 started.
I0020 Current directory is "R:\".
I0180 Assembling source file "rchg.asm".
I0470 Assembling program "Rchg". 
I0510 Assembling program pass 1. 
I0510 Assembling program pass 2. 
I0530 Assembling program pass 3 - final.
I0660 32bit FLAT PE file "Rchg.exe" created, size=16732. 
I0650 Program "Rchg" assembled in 3 passes with errorlevel 0. 
I0750 Source "rchg" (1229 lines) assembled in 2 passes with errorlevel 0.
I0860 Listing file "rchg.asm.lst" created, size=1753.
I0980 Memory allocation 960 KB. 28249 statements assembled in 1 s.
I0990 EuroAssembler terminated with errorlevel 0.

R:\>rchg.exe
└ Hello
