
Visual C++, wchar_t* command-line arguments cannot be compared?

I am having a little trouble implementing an assignment that depends on which string argv[1] equals.

int _tmain(int argc, _TCHAR* argv[]) //wchar_t
{   
    if (argc != 2)
        exit(1);

    if (argv[1] == L"-foo")
        printf("Success!\n");

    wprintf(argv[1]);
    printf("\n");

    system("pause");
    return 0;
}

If I run the executable with the argument "-foo", I receive the following output:

-foo

It should be:

Success!
-foo

The string is exactly what I want it to be, but the if-condition remains false. Are wchar_t strings simply not comparable using the == operator? If so, how do I compare them properly?

Preliminary note: In this answer, given the context of the question, Unicode and Unicode character refer to the UCS-2 (up to XP) and UTF-16 (starting with XP) encodings, and are used interchangeably with wide character, wchar_t, WCHAR and other terms from the Win32 API. The Unicode standard offers multiple encodings, such as UTF-8, UTF-16 and UTF-32, that encode the same set of characters; different versions of the standard have a different scope. Surrogate code points are used to escape from the Basic Multilingual Plane (BMP), roughly the first 64K code points, and thus encode more characters than fit into a single 16-bit code unit per code point. The surrogate extensions were developed for the Unicode 2.0 standard, which was passed in the year NT 4.0 was released, but some years after the first "Unicode-capable" version of Windows, NT 3.1, was released. That original standard did not account for more characters than the BMP holds, which is why Unicode character and wide character are even now used synonymously in the Win32 API context, although this is inaccurate.

To answer the underlying question you raised:

Are wchar_t strings simply not comparable using the "==" operator?

No, they aren't, and neither are "ANSI" strings, i.e. those using the char type as the basis. Remember, a C string (both wchar_t and char based) is handled through a pointer to its first character. This means that with == you were comparing two pointer values, which were definitely not equal. One, after all, points to a string literal (i.e. inside your program image), while the other points to memory allocated somewhere on the heap for the command-line arguments. So they are definitely two different entities, even though their contents match.

If you wanted to use ==, you would have to use a language such as C++ with the standard library class std::string (std::wstring for wide characters, or std::basic_string<_TCHAR>) or, on Windows, the ATL class CString (or rather CStringT). These classes are sometimes referred to as smart string classes and use the C++ facility of overloading operator==(). However, keep in mind that the semantics differ depending on the implementation, so not every smart string class will compare the string contents. Some might merely compare the equality of this (i.e. whether it is the same instance), while others may compare the string contents case-insensitively or case-sensitively at their discretion.

To compare C strings you have the following functions available for your use case:

  • For "ANSI" character ( char ) strings: strcmp , _stricmp (and the "counted" variants: strncmp , _strnicmp ... there are more)
  • For Unicode character ( wchar_t ) strings: wcscmp , _wcsicmp (and the "counted" variants: wcsncmp , _wcsnicmp ... there are more)
  • For the variable character "type" ( TCHAR ) strings: _tcscmp , _tcsicmp (and the "counted" variants: _tcsncmp , _tcsnicmp ... there are more)

You can remember these prefixes:

  • str -> string
  • wcs -> wide character string
  • tcs -> T character string

Side note: with #include <tchar.h> and windows.h, the macros TEXT and _T are equivalent (provided UNICODE and _UNICODE are defined consistently) and are used to declare a string literal that will be either "ANSI" or Unicode depending on the defines at build time. The same apparently holds for _TCHAR and TCHAR, although the latter appears to be favored in the Win32 API context.

So a Unicode build will expand _T("something") to L"something" , while the "ANSI" build will expand it to "something" .

As to TCHAR, consider reading through the arguments put forth in: Is TCHAR still relevant? (pointed out by rubenvb). There are valid points for and against TCHAR / _TCHAR use, and you should make a decision and stick with it, i.e. be consistent.

Never mind, got it.

if (wcscmp(argv[1], L"-foo") == 0)
