
wchar_t is unsigned or signed

In this link, unsigned wchar_t is typedef'ed as WCHAR. But I can't find this kind of typedef in my SDK's winnt.h or in MinGW's winnt.h.

Is wchar_t signed or unsigned?

I am using the Windows API from C.

The signedness of wchar_t is unspecified. The standard only says (3.9.1/5):

Type wchar_t shall have the same size, signedness, and alignment requirements (3.11) as one of the other integral types, called its underlying type.

(By contrast, the types char16_t and char32_t are expressly unsigned.)

Be aware that the size of the type varies by platform.

Windows uses UTF-16, and its wchar_t is 2 bytes. Linux uses a 4-byte wchar_t.

The standard may not specify whether wchar_t is signed or unsigned, but Microsoft does. Even if your non-Microsoft compiler disagrees, the Windows API will be using this definition from /Zc:wchar_t (wchar_t Is Native Type):

Microsoft implements wchar_t as a two-byte unsigned value. It maps to the Microsoft-specific native type __wchar_t.

The type WCHAR, as opposed to wchar_t, is defined on MSDN as follows:

    #if !defined(_NATIVE_WCHAR_T_DEFINED)
    typedef unsigned short WCHAR;
    #else
    typedef wchar_t WCHAR;
    #endif

https://docs.microsoft.com/en-us/windows/win32/extensible-storage-engine/wchar

So could you conclude that it's defined as unsigned on Windows?

I just tested on several platforms, with optimisation turned off.

1) MinGW (32-bit) + gcc 3.4.4:
---- snip ----
#include<stdio.h>
#include<wchar.h>
const wchar_t BOM = 0xFEFF;
int main(void)
{
    int c = BOM;
    printf("0x%08X\n", c+0x1000);
    return 0;
}
---- snip ----

It prints 0x00010EFF. wchar_t is unsigned. The corresponding assembly code says movzwl _BOM, %eax. Not movSwl (sign-extend), but movZwl (zero-extend).

2) FreeBSD 11.2 (64-bit) + clang 6.0.0:
---- snip ----
#include<stdio.h>
#include<wchar.h>
const wchar_t INVERTED_BOM = 0xFFFE0000;
int main(void)
{
     long long c = INVERTED_BOM;
     printf("0x%016llX\n", c+0x10000000LL);
     return 0;
}
---- snip ----

It prints 0x000000000FFE0000. wchar_t is signed. The corresponding assembly code says movq $-131072, -16(%rbp). The 32-bit 0xFFFE0000 is sign-extended to the 64-bit signed value -131072.

3) Same code as 2), on RedHat (version unknown) + gcc 4.4.7: It again prints 0x000000000FFE0000. wchar_t is signed.

I tested neither printf's implementation nor WinAPI's WCHAR definition, only the behavior of the compiler's built-in wchar_t type (no header file specifies its signedness) and of the C-to-assembly compiler itself.

Note that the compilers in 1) and 3) come from the same vendor, namely the GNU Project. The answer definitely depends on the platform. (Would somebody test on Visual C++?)
