StringComparer.InvariantCultureIgnoreCase Equals vs GetHashCode disagree for equal objects

StringComparer.InvariantCultureIgnoreCase.Equals returns true for "" vs "\0", but GetHashCode returns different values for the two strings. Is this a bug?

var sc = StringComparer.InvariantCultureIgnoreCase;
string s1 = "";    // empty string, length 0
string s2 = "\0";  // a single NUL character, length 1
Console.WriteLine(sc.Equals(s1, s2));
Console.WriteLine(sc.GetHashCode(s1));
Console.WriteLine(sc.GetHashCode(s2));

prints

True
0
-1644535362

I thought that GetHashCode should return the same value for 'equal' strings, so is this a bug?

Those two strings are not bitwise equal: they have different lengths ("" has length 0, "\0" has length 1). So it is reasonable for the hash code algorithm to produce different values for them.

The string comparison algorithm must be ignoring \0 when doing its comparisons. I looked into the source: it is doing some sort of add/subtract comparison to find out whether the strings differ.

Equal hash codes only imply that two values may be equal, not that they necessarily are. The converse (equal values produce equal hash codes) is what the GetHashCode contract expects, but nothing enforces it: you can override the == operator or .Equals for any type and have it return true when the hash codes don't agree.
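To make that concrete, here is a minimal sketch of a type whose Equals and GetHashCode disagree, and what that does to a hash-based collection. LastDigitKey and ContractDemo are hypothetical names invented for this illustration; they are not part of .NET and have nothing to do with how StringComparer is implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: equality looks only at the last digit,
// but the hash code uses the whole number, so the two can disagree.
sealed class LastDigitKey
{
    public int Value { get; }

    public LastDigitKey(int value) { Value = value; }

    public override bool Equals(object obj)
    {
        var other = obj as LastDigitKey;
        return other != null && other.Value % 10 == Value % 10;
    }

    // Deliberately inconsistent with Equals.
    public override int GetHashCode() { return Value; }
}

static class ContractDemo
{
    static void Main()
    {
        var a = new LastDigitKey(1);
        var b = new LastDigitKey(11);

        Console.WriteLine(a.Equals(b));                       // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // False

        // The set looks entries up by hash code before calling Equals,
        // so an "equal" key with a different hash code is not found.
        var set = new HashSet<LastDigitKey> { a };
        Console.WriteLine(set.Contains(b));                   // False
    }
}
```

The compiler accepts this, which is why a mismatch like the one in the question can exist at all; the cost shows up only when such values are used as keys in a Dictionary or HashSet.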

Here's GetHashCode. It looks like it's using the raw bytes for the calculation.

You may have stumbled on an obscure bug in the .NET libraries, but I'm guessing that this is an edge case.

Then again, you are using a string comparer, not the string.Equals method or string.GetHashCode. Notice that Console.WriteLine("\0".Equals("")); produces False.

If you want a string comparer whose Equals and GetHashCode agree here, I believe StringComparer.OrdinalIgnoreCase would do the trick, since it looks at each character separately instead of collating on the values. It looks like both just funnel to the same character comparison method.
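A quick way to check that suggestion is the sketch below. The OrdinalComparerDemo class and its IsConsistent helper are hypothetical names made up for this example; only the StringComparer and HashSet calls are real .NET APIs.

```csharp
using System;
using System.Collections.Generic;

static class OrdinalComparerDemo
{
    // True when the comparer honors the Equals/GetHashCode contract for (a, b):
    // whenever Equals(a, b) is true, the two hash codes must match.
    public static bool IsConsistent(StringComparer comparer, string a, string b)
    {
        return !comparer.Equals(a, b)
            || comparer.GetHashCode(a) == comparer.GetHashCode(b);
    }

    static void Main()
    {
        var sc = StringComparer.OrdinalIgnoreCase;

        // Ordinal comparison does not skip the NUL character.
        Console.WriteLine(sc.Equals("", "\0"));             // False
        Console.WriteLine(sc.Equals("abc", "ABC"));         // True

        Console.WriteLine(IsConsistent(sc, "", "\0"));      // True
        Console.WriteLine(IsConsistent(sc, "abc", "ABC"));  // True

        // "" and "\0" stay distinct; "abc" and "ABC" collapse into one entry.
        var set = new HashSet<string>(sc) { "", "\0", "abc", "ABC" };
        Console.WriteLine(set.Count);                       // 3
    }
}
```

Passing StringComparer.InvariantCultureIgnoreCase to the same IsConsistent helper with "" and "\0" is a simple way to see whether a given runtime shows the mismatch from the question.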
