
Microsoft's implementation of lstrcmpi and Unicode characters

I'm trying to understand whether what I'm seeing is a bug or accepted behaviour of Microsoft's lstrcmpi function.

I can illustrate it with this code:

WCHAR buff1[] = L"abc ";
WCHAR buff2[] = L"abc ";
buff1[3] = 0xFFFF;
buff2[3] = 0x0;
int res = lstrcmpi(buff1, buff2);
// res is 0, i.e. the two strings compare as equal!

EDIT: Addition for the comment below:

[image]

lstrcmpi calls CompareString with the current locale (from the thread or the user) and returns "a linguistically appropriate result".

From Michael Kaplan's blog:

... Now if the functions were named lstrcoll and lstrcolli then perhaps the function would not be so commonly misused ...

and:

Remember that when checking for equality, especially on an item like a registry value where OS semantics are involved, the best answer is CompareStringOrdinal, with a fallback to RtlCompareUnicodeString or even better RtlEqualUnicodeString or, if you absolutely must, wcsicmp (with awareness that there is one character it can be wrong about) for anything that has to run pre-Vista.

and finally:

Because if you are calling lstrcmpi for appropriate reasons (i.e. you wanted to get linguistically meaningful results, say in the sorting of a list in a user interface) but you wanted to have behavior that did not vary with different locales, then CompareString with LOCALE_INVARIANT is a good answer.

But if you wanted almost anything else, including all of the non-linguistic purposes hinted at earlier, then CompareStringOrdinal or RtlCompareUnicodeString is a much better choice.
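To make "ordinal" concrete, here is a rough portable sketch of what a code-unit-by-code-unit comparison does with the two buffers from the question. It is not the Windows implementation: the helper names (fold_ascii, ordinal_compare_ignore_ascii_case, question_buffers_compare) are made up for illustration, and the case folding is limited to ASCII as a simplifying assumption. On Windows itself the equivalent check would be something like CompareStringOrdinal(buff1, -1, buff2, -1, TRUE) == CSTR_EQUAL.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: fold only ASCII A-Z to lowercase.
   Assumption: a simplification, the OS uses its own case table. */
static uint16_t fold_ascii(uint16_t c) {
    return (c >= 'A' && c <= 'Z') ? (uint16_t)(c + ('a' - 'A')) : c;
}

/* Hypothetical helper: compare two NUL-terminated UTF-16 strings one
   code unit at a time. Returns -1, 0 or 1. No code unit is skipped. */
int ordinal_compare_ignore_ascii_case(const uint16_t *a, const uint16_t *b) {
    assert(a && b);
    for (;; ++a, ++b) {
        uint16_t ca = fold_ascii(*a);
        uint16_t cb = fold_ascii(*b);
        if (ca != cb) return ca < cb ? -1 : 1;
        if (ca == 0) return 0; /* both strings ended together */
    }
}

/* The buffers from the question: "abc" + U+FFFF versus "abc". */
int question_buffers_compare(void) {
    const uint16_t buff1[] = { 'a', 'b', 'c', 0xFFFF, 0 };
    const uint16_t buff2[] = { 'a', 'b', 'c', 0x0000, 0 };
    return ordinal_compare_ignore_ascii_case(buff1, buff2);
}
```

With this kind of comparison the U+FFFF code unit is compared against the terminator like any other value, so the two buffers do not come out equal.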

How it handles noncharacters has actually changed over time.

The Unicode character U+FFFF is a noncharacter in the Unicode spec, so it is probably being ignored during the string comparison. This results in both strings comparing as equal.
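As a toy model of that explanation (not how CompareString is actually implemented), the sketch below skips any code unit it considers ignorable before comparing. The names is_ignored, compare_skipping_ignored and question_buffers_model are hypothetical, and treating only U+FFFF as ignorable is an assumption made purely to mirror the question.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: only U+FFFF is treated as ignorable. */
static int is_ignored(uint16_t c) {
    return c == 0xFFFF;
}

/* Hypothetical model: compare NUL-terminated UTF-16 strings, stepping
   over ignorable code units on both sides. Returns -1, 0 or 1. */
int compare_skipping_ignored(const uint16_t *a, const uint16_t *b) {
    assert(a && b);
    for (;;) {
        while (is_ignored(*a)) ++a;
        while (is_ignored(*b)) ++b;
        if (*a != *b) return *a < *b ? -1 : 1;
        if (*a == 0) return 0; /* both strings ended together */
        ++a;
        ++b;
    }
}

/* The buffers from the question: "abc" + U+FFFF versus "abc". */
int question_buffers_model(void) {
    const uint16_t buff1[] = { 'a', 'b', 'c', 0xFFFF, 0 };
    const uint16_t buff2[] = { 'a', 'b', 'c', 0x0000, 0 };
    return compare_skipping_ignored(buff1, buff2);
}
```

Under that assumption the trailing U+FFFF contributes nothing to the comparison, so both buffers reduce to "abc" and the result is 0, which matches what the question observed.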


 