
Object.Equals: everything is equal by default

While reading Jeffrey Richter's CLR via C#, 4th edition (Microsoft Press), I noticed that the author states at one point that, while Object.Equals currently checks for identity equality, Microsoft should have implemented the method like this:

public class Object {
    public virtual Boolean Equals(Object obj) {
        // The given object to compare to can't be null
        if (obj == null) return false;

        // If objects are different types, they can't be equal.
        if (this.GetType() != obj.GetType()) return false;

        // If objects are same type, return true if all of their fields match
        // Because System.Object defines no fields, the fields match
        return true;
    }
}

This strikes me as very odd: every non-null object of the same type would be equal by default? So unless Equals is overridden, all instances of a type are equal (e.g. all your locking objects are equal) and return the same hash code. And assuming that == on Object still checks reference equality, this would mean that (a == b) != a.Equals(b), which would also be strange.
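To make the concern concrete, here is a minimal sketch that simulates the proposed behavior. System.Object itself cannot be changed, so the hypothetical base class RichterObject (my name, not from the book) stands in for it, and LockToken is an illustrative type that "forgets" to override Equals:

```csharp
using System;

// Hypothetical stand-in for System.Object using the proposed Equals logic.
public class RichterObject
{
    public override bool Equals(object obj)
    {
        if (obj == null) return false;
        if (GetType() != obj.GetType()) return false;
        // No fields are declared at this level, so "all fields match".
        return true;
    }

    // Equal objects must report equal hash codes, so every instance
    // of a given type ends up with the same hash code.
    public override int GetHashCode()
    {
        return GetType().GetHashCode();
    }
}

// Illustrative lock token that does not override Equals.
public sealed class LockToken : RichterObject { }

public static class RichterDemo
{
    public static void Main()
    {
        var a = new LockToken();
        var b = new LockToken();
        Console.WriteLine(a.Equals(b));                        // True
        Console.WriteLine(ReferenceEquals(a, b));              // False
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True
    }
}
```

Two distinct lock tokens compare equal and hash alike, which is exactly the (a == b) != a.Equals(b) mismatch described above.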

I think the idea of things being equal if it is the exact same thing (identity) is a better idea than just making everything equal unless overridden. But this is a well known book's 4th edition published by Microsoft, so there must be some merit to this idea. I read the rest of the text but could not help but wonder: Why would the author suggest this? What am I missing here? What is the great advantage of Richter's implementation over the current Object.Equals implementation?

The current default Equals() does a reference compare (sometimes loosely called a shallow compare): it returns true only when both references point to the same object, and it doesn't check any further if the references differ.
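A small sketch of that default, using an illustrative Point class (not from the question) that does not override Equals:

```csharp
using System;

// Illustrative class that inherits Object.Equals unchanged.
public sealed class Point
{
    public int X;
    public int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class DefaultEqualsDemo
{
    public static void Main()
    {
        var p = new Point(1, 2);
        var q = new Point(1, 2); // same field values, different instance
        var r = p;               // same instance as p

        Console.WriteLine(p.Equals(q)); // False
        Console.WriteLine(p.Equals(r)); // True
    }
}
```

The field values are never inspected: only the identity of the two references matters.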

I think this is perfectly acceptable for a base implementation. I certainly wouldn't think that it is wrong or incomplete.

Richter's example [1], which you quote, is also perfectly legitimate for the base System.Object. The issue with his implementation is that it arguably should be declared abstract [2]: with his method you will end up with an unreliable Equals() on derived objects if you do not override it (because Equals() is supposed to do a deep compare). Having to override this method on all derived objects would be a lot of work, therefore the Microsoft way is better as a default. So in essence you are correct: Richter's example is odd. It is better to default to not equal rather than the other way round (defaulting to true would lead to some rather interesting behavior if people forgot to override it).

(Just for easy reference, here is the default implementation as published in the book)

[image not preserved]

[1] Richter is a smart man who knows his stuff and I wouldn't generally argue with anything he says. You have to understand that the MS engineers would have had to think long and hard about a lot of things, knowing that they didn't have the flexibility of being able to get it wrong and then just fix stuff later. No matter how right they are, people will always second-guess them at a later date and offer alternative opinions. That doesn't mean the original is wrong or the alternative is wrong; it simply means there was an alternative.

[2] Which of course means that there would be no base implementation, which is good because it would have been unreliable.

Jeffrey Richter is talking about value equality over identity equality.

Specifically you ask:

So unless overridden: all instances of a type are equal?

The answer is yes, but... As in: yes, but it is (almost) always supposed to be overridden.

Thus, for most classes it should be overridden to do an attribute-by-attribute comparison to determine equality. For some other classes that are truly identity-based (like locks) it should be overridden to use the same technique as it uses today.

The key though is that it must be overridden in almost every case, and this alone is sufficiently difficult, clumsy and mistake-prone that it is probably why Microsoft did not use this approach.
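As a rough sketch of the two kinds of override described above, written against today's System.Object: Money and SyncRoot are illustrative names of my own, and the hash formula is just one common choice, not a prescribed one.

```csharp
using System;
using System.Runtime.CompilerServices;

// Value-like class: equality is decided attribute by attribute.
public sealed class Money
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    public override bool Equals(object obj)
    {
        var other = obj as Money;
        return other != null
            && Amount == other.Amount
            && Currency == other.Currency;
    }

    // Must agree with Equals: equal objects produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            int currencyHash = Currency == null ? 0 : Currency.GetHashCode();
            return (Amount.GetHashCode() * 397) ^ currencyHash;
        }
    }
}

// Identity-based class: explicitly keeps reference semantics.
public sealed class SyncRoot
{
    public override bool Equals(object obj)
    {
        return ReferenceEquals(this, obj);
    }

    public override int GetHashCode()
    {
        return RuntimeHelpers.GetHashCode(this);
    }
}

public static class OverrideDemo
{
    public static void Main()
    {
        Console.WriteLine(new Money(5m, "EUR").Equals(new Money(5m, "EUR"))); // True
        Console.WriteLine(new SyncRoot().Equals(new SyncRoot()));             // False
    }
}
```

Even in this small example, each class needs a matching Equals/GetHashCode pair, which hints at how much boilerplate a "must override" default would demand.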


What is the advantage of Value-Equality over Identity-Equality? It's that if two different objects have the same values/contents, then they can be considered "equal" for purposes of comparison in cases like the Keys of a Dictionary object.
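The dictionary case can be sketched as follows. IdentityKey and ValueKey are illustrative types: the first inherits the default Equals/GetHashCode, the second overrides both to compare by content.

```csharp
using System;
using System.Collections.Generic;

// Inherits reference-based Equals/GetHashCode from Object.
public sealed class IdentityKey
{
    public readonly string Name;
    public IdentityKey(string name) { Name = name; }
}

// Compares by content, so separate instances with the same Name match.
public sealed class ValueKey
{
    public readonly string Name;
    public ValueKey(string name) { Name = name; }

    public override bool Equals(object obj)
    {
        var other = obj as ValueKey;
        return other != null && Name == other.Name;
    }

    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

public static class DictionaryKeyDemo
{
    public static bool FindsIdentityKey()
    {
        var map = new Dictionary<IdentityKey, int> { { new IdentityKey("a"), 1 } };
        return map.ContainsKey(new IdentityKey("a"));
    }

    public static bool FindsValueKey()
    {
        var map = new Dictionary<ValueKey, int> { { new ValueKey("a"), 1 } };
        return map.ContainsKey(new ValueKey("a"));
    }

    public static void Main()
    {
        Console.WriteLine(FindsIdentityKey()); // False
        Console.WriteLine(FindsValueKey());    // True
    }
}
```

With identity equality, a lookup succeeds only if you still hold the exact instance that was used as the key.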

Or consider the matter of strings in .NET, which are actually objects but get treated a lot like values at higher levels (especially in VB.NET). This presents a problem when you want to compare two strings for equality, because 99% of the time you really do not care if they are different object instances; you only care if they contain the same text. So .NET has to make sure that that is how string comparison actually works, even though strings are really objects.
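A short sketch of the string case. The second string is built from a char array so that it is a separate instance rather than the interned literal:

```csharp
using System;

public static class StringEqualityDemo
{
    public static void Main()
    {
        string s1 = "hello";
        // Constructed at run time, so it is a distinct object from the literal.
        string s2 = new string(new[] { 'h', 'e', 'l', 'l', 'o' });

        Console.WriteLine(ReferenceEquals(s1, s2)); // False
        Console.WriteLine(s1.Equals(s2));           // True
        Console.WriteLine(s1 == s2);                // True
    }
}
```

String overrides Equals and overloads ==, so both compare the text even though the two references point to different objects.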

If one is asked to make a list of all identifiably-distinct objects of arbitrary types, and is not given any indication of what the objects are or what they will be used for, the only universally-applicable means of testing whether two references should be considered as pointing to identifiably-distinct objects is Object.Equals(Object). Two references X and Y should be considered identifiably-distinct if changing one or more references that presently point to X so that they instead point to Y would likely alter program behavior.

For example, if two instances of string both contain the entire text of War and Peace, punctuated and formatted identically, one could likely replace some or all references to the first with references to the second, or vice versa, with little or no effect on program execution beyond the fact that a comparison between two references which point to the same instance may be found to hold identical text much more quickly than could two references which point to different strings that contain identical characters.

In most cases, objects which exist to hold immutable data should be considered to be identical if the data they hold is identical. Objects which exist to hold mutable data, or which exist to serve as identity tokens, should generally be considered distinct from each other. Given that one can define a custom EqualityComparer which will regard as equivalent objects which are not totally equivalent (e.g. a case-insensitive string comparer), and given that code which needs some definition of equivalence which is broader than strict equivalence should generally know what types it is working with and what definition of equivalence is suitable, it is generally better to have Object.Equals report objects as being different unless they are designed to be substitutable (as would be, e.g., strings).
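For instance, a broader notion of equivalence can be supplied by the caller without touching Equals at all. This sketch uses the built-in StringComparer.OrdinalIgnoreCase as the case-insensitive comparer:

```csharp
using System;
using System.Collections.Generic;

public static class ComparerDemo
{
    public static void Main()
    {
        // Strict equivalence: the strings differ in case, so they are not equal.
        Console.WriteLine("Hello".Equals("HELLO")); // False

        // Broader equivalence, chosen by code that knows it is dealing with strings.
        var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Hello" };
        Console.WriteLine(names.Contains("HELLO")); // True
        Console.WriteLine(names.Add("hello"));      // False (already present under this comparer)
    }
}
```

The collection applies the looser rule, while Object.Equals on the strings themselves keeps its strict meaning.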

To use a real-world analogy, suppose one is given two pieces of paper, each with a Vehicle Identification Number written on it, and is asked if the car identified by the first piece of paper is the same as the car identified by the second. If the two slips of paper have the same VIN, then clearly the car identified by the first is the same as the one identified by the second. If they have different VINs, however, excluding any weird possibility of a car having more than one VIN, then they identify different cars. Even if the cars have the same make and model, options packages, paint scheme, etc. they would still be different cars. A person who bought one would not be entitled to arbitrarily start using the other instead. It may sometimes be useful to know whether two cars presently have the same options packages, etc. but if that's what one wants to know, that's what one should ask.

Guess: the current behavior of Object.Equals is not what most people consider to be "equal".

The main (only?) reason for this method to exist is to allow searching for items in collections, by acting as the "==" implementation. So in most practical cases this implementation behaves unexpectedly (except when you want to find out whether a particular instance is already in the collection), and you are forced to provide your own custom comparison functions...

It is likely a method of Object for technical reasons, i.e. for Array/Dictionary it may be faster to assume all objects have Equals / GetHashCode instead of checking something on the object to enable "Find" functionality.

Arguably it should not be on Object at all and instead just require classes that can be stored in collections to implement some form of IComparable interface.
