
Implementing Equals and GetHashCode - an easier way

I have a tree of objects (DTOs), where one object references other objects and so on:

class Person
{
    public int Id { get; }
    public Address Address { get; }
    // Several other properties
}

class Address
{
    public int Id { get; }
    public Location Location { get; }
    // Several other properties
}

These objects can be quite complex and have many other properties.

In my app, a Person with the same Id can exist in two stores: the app's local storage and the backend. I need to merge the online Person with the local Person in a specific way, so I first need to know whether the online Person is the same as the one stored locally (in other words, whether the local Person has not been updated by the app).

In order to use LINQ's Except, I know I need to implement IEquatable<T>, and the usual way I've seen it done is like this:

class Person : IEquatable<Person>
{
    public int Id { get; }
    public Address Address { get; }

    public override bool Equals(object obj)
    {
        return Equals(obj as Person);
    }

    public bool Equals(Person other)
    {
        return other != null &&
               Id == other.Id &&
               object.Equals(Address, other.Address); // null-safe, unlike Address.Equals(...)
    }

    public override int GetHashCode()
    {
        var hashCode = -306707981;
        hashCode = hashCode * -1521134295 + Id.GetHashCode();
        hashCode = hashCode * -1521134295 + (Address != null ? Address.GetHashCode() : 0);
        return hashCode;
    }
}

To me this looks complicated and hard to maintain: it's easy to forget to update Equals and GetHashCode when properties change. Depending on the objects, it can also be somewhat computationally expensive.

Wouldn't the following be a simpler and much more efficient way of implementing Equals and GetHashCode?

class Person : IEquatable<Person>
{
    public int Id { get; }
    public Address Address { get; private set; }
    public DateTime UpdatedAt { get; private set; }

    public void SetAddress(Address address)
    {
        Address = address;
        UpdatedAt = DateTime.Now;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Person);
    }

    public bool Equals(Person other)
    {
        return other != null &&
               Id == other.Id &&
               UpdatedAt.Ticks == other.UpdatedAt.Ticks;
    }

    public override int GetHashCode()
    {
        var hashCode = -306707981;
        hashCode = hashCode * -1521134295 + Id.GetHashCode();
        hashCode = hashCode * -1521134295 + UpdatedAt.Ticks.GetHashCode();
        return hashCode;
    }
}

My idea is that whenever the object changes, it gets a new timestamp. This timestamp is saved along with the object. I am also thinking of using this field as a concurrency token in storage.

Since the resolution of DateTime could be an issue, I'm thinking a Guid would be a good alternative to a timestamp. There wouldn't be too many objects, so the uniqueness of the Guid shouldn't be an issue.

Do you see a problem with this approach?

As I said above, I think it would be much easier to implement and faster to run than having Equals and GetHashCode go over all the properties.

Update: The more I think about it, the more I feel that implementing Equals and GetHashCode on the class itself is not a good approach. I think it would be better to implement a specialized IEqualityComparer<Person> that compares Persons in a specific way, and pass it to LINQ's methods.

The reason is that, as the comments and the answer point out, a Person could be used in different ways.

This would give you false negative equality if two objects have the same properties but were created at different times, and it would give you false positive equality if two objects were created with different properties but right after each other (the clock is not that accurate).
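
To make both failure modes concrete, here is a minimal sketch. StampedPerson is a hypothetical, reduced stand-in for the timestamp-based Person from the question, with a Street string assumed in place of the Address object.

```csharp
using System;

// Hypothetical reduced version of the timestamp-based Person from the question.
public sealed class StampedPerson : IEquatable<StampedPerson>
{
    public int Id { get; }
    public string Street { get; }       // assumed stand-in for the Address object
    public DateTime UpdatedAt { get; }

    public StampedPerson(int id, string street, DateTime updatedAt)
    {
        Id = id;
        Street = street;
        UpdatedAt = updatedAt;
    }

    // Equality looks only at Id and the timestamp, as in the question.
    public bool Equals(StampedPerson other) =>
        other != null && Id == other.Id && UpdatedAt.Ticks == other.UpdatedAt.Ticks;

    public override bool Equals(object obj) => Equals(obj as StampedPerson);

    public override int GetHashCode() => (Id, UpdatedAt.Ticks).GetHashCode();
}

public static class TimestampDemo
{
    private static readonly DateTime T = new DateTime(2018, 6, 17, 12, 0, 0, DateTimeKind.Utc);

    // Same data, different timestamps: reported as different (false negative).
    public static bool SameDataDifferentStamp() =>
        new StampedPerson(1, "Main St", T).Equals(new StampedPerson(1, "Main St", T.AddSeconds(1)));

    // Different data, identical timestamps: reported as equal (false positive).
    public static bool DifferentDataSameStamp() =>
        new StampedPerson(1, "Main St", T).Equals(new StampedPerson(1, "Side St", T));

    public static void Main()
    {
        Console.WriteLine(SameDataDifferentStamp());  // False
        Console.WriteLine(DifferentDataSameStamp());  // True
    }
}
```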

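For LINQ's Except, GetHashCode matters just as much as Equals: Except builds a hash-based set internally, so both must be implemented consistently, and the hash code should be computed from all of the properties that Equals compares.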

Ideally, they should also be immutable (remove the private setter) so that one object has the same hash code for its whole life.
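
The following sketch shows why a stable hash code matters. MutableKey is a hypothetical class, not one from the question; it changes its hash code after being added to a HashSet, and the set can then no longer find it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type whose hash code depends on a settable property.
public sealed class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj) => obj is MutableKey other && other.Id == Id;

    public override int GetHashCode() => Id;
}

public static class MutableKeyDemo
{
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        key.Id = 2; // the hash code changes while the object sits in the set

        // The set stored the entry under the old hash code, so the lookup misses it.
        return set.Contains(key);
    }

    public static void Main() => Console.WriteLine(FoundAfterMutation()); // False
}
```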

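Your GetHashCode should also be wrapped in unchecked, so that the overflowing multiplications cannot throw if the project is compiled with overflow checking enabled.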
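
As a rough illustration, the sketch below runs the same arithmetic as the question's GetHashCode in an unchecked and a checked context. HashDemo and its method names are made up for this example.

```csharp
using System;

public static class HashDemo
{
    // Same arithmetic as in the question, explicitly allowed to wrap around.
    public static int CombineUnchecked(int id, int addressHash)
    {
        unchecked
        {
            var hash = -306707981;
            hash = hash * -1521134295 + id;
            hash = hash * -1521134295 + addressHash;
            return hash;
        }
    }

    // Identical arithmetic with overflow checking forced on.
    public static int CombineChecked(int id, int addressHash)
    {
        checked
        {
            var hash = -306707981;
            hash = hash * -1521134295 + id;
            hash = hash * -1521134295 + addressHash;
            return hash;
        }
    }

    public static bool CheckedVersionThrows()
    {
        try
        {
            CombineChecked(1, 2);
            return false;
        }
        catch (OverflowException)
        {
            return true; // the first multiplication already exceeds the int range
        }
    }

    public static void Main()
    {
        Console.WriteLine(CombineUnchecked(1, 2) == CombineUnchecked(1, 2)); // True
        Console.WriteLine(CheckedVersionThrows());                           // True
    }
}
```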

Alternatively, you could use Except with a custom comparer.
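
A custom comparer might look something like the sketch below. The Street property and the PersonSyncComparer name are assumptions made for illustration; a real comparer would compare whichever properties matter for the merge.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Address
{
    public int Id { get; }
    public string Street { get; } // hypothetical property, for illustration only
    public Address(int id, string street) { Id = id; Street = street; }
}

public sealed class Person
{
    public int Id { get; }
    public Address Address { get; }
    public Person(int id, Address address) { Id = id; Address = address; }
}

// Equality rules live outside Person, so other code can compare Persons differently.
public sealed class PersonSyncComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id
            && x.Address?.Id == y.Address?.Id
            && x.Address?.Street == y.Address?.Street;
    }

    // Must hash exactly the values that Equals compares.
    public int GetHashCode(Person p) =>
        p is null ? 0 : (p.Id, p.Address?.Id, p.Address?.Street).GetHashCode();
}

public static class ExceptDemo
{
    // Returns the ids of online persons that differ from every local person.
    public static int[] ChangedIds()
    {
        var local = new[]
        {
            new Person(1, new Address(10, "Main St")),
            new Person(2, new Address(20, "Oak Ave")),
        };
        var online = new[]
        {
            new Person(1, new Address(10, "Main St")),
            new Person(2, new Address(20, "Elm Rd")),
        };
        return online.Except(local, new PersonSyncComparer()).Select(p => p.Id).ToArray();
    }

    public static void Main() => Console.WriteLine(string.Join(",", ChangedIds())); // 2
}
```

Because the comparer is passed in explicitly, Person itself keeps reference equality and nothing else in the application is affected.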

A really lazy way to implement GetHashCode / Equals is to use value tuples (which don't allocate for this):

class Person : IEquatable<Person>
{
    public int Id { get; }
    public Address Address { get; }
    public Person(int id, Address address) => (Id, Address) = (id, address);

    public override bool Equals(object obj) => Equals(obj as Person);

    public bool Equals(Person other) => other != null
             && (Id, Address).Equals((other.Id,other.Address));

    public override int GetHashCode() => (Id, Address).GetHashCode();
}
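
One thing to keep in mind: the tuple compares Address through the default equality comparer, so Address needs value equality of its own, otherwise two separately loaded Address instances only compare by reference. A minimal sketch, assuming a hypothetical Street property on Address:

```csharp
using System;

public sealed class Address : IEquatable<Address>
{
    public int Id { get; }
    public string Street { get; } // hypothetical property, for illustration only
    public Address(int id, string street) => (Id, Street) = (id, street);

    public override bool Equals(object obj) => Equals(obj as Address);

    public bool Equals(Address other) => other != null
             && (Id, Street).Equals((other.Id, other.Street));

    public override int GetHashCode() => (Id, Street).GetHashCode();
}

public sealed class Person : IEquatable<Person>
{
    public int Id { get; }
    public Address Address { get; }
    public Person(int id, Address address) => (Id, Address) = (id, address);

    public override bool Equals(object obj) => Equals(obj as Person);

    // The tuple calls Address.Equals(Address) via EqualityComparer<Address>.Default.
    public bool Equals(Person other) => other != null
             && (Id, Address).Equals((other.Id, other.Address));

    public override int GetHashCode() => (Id, Address).GetHashCode();
}

public static class TupleEqualityDemo
{
    // Two Persons built from separate but identical Address instances.
    public static bool EqualAcrossInstances() =>
        new Person(1, new Address(7, "Main St")).Equals(new Person(1, new Address(7, "Main St")));

    public static bool DifferentStreetDetected() =>
        !new Person(1, new Address(7, "Main St")).Equals(new Person(1, new Address(7, "Side St")));

    public static void Main()
    {
        Console.WriteLine(EqualAcrossInstances());    // True
        Console.WriteLine(DifferentStreetDetected()); // True
    }
}
```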

The following LinqPad sketch is something you could start from. It has all the tools you need to tailor it to your own requirements. Of course, this is just a concept, and not every aspect is fully worked out.

As you can see, there is an Include attribute that can be applied to the backing fields you want to include in the hash.

void Main()
{
    var o1 = new C { Interesting = "Whatever", NotSoInterresting = "Blah.." };
    var o2 = new C { Interesting = "Whatever", NotSoInterresting = "Blah-blah.." }; 

    (o1 == o2).Dump("o1 == o2"); // False
    (o2 == o1).Dump("o2 == o1"); // False

    var o3 = o1.Clone();
    (o3 == o1).Dump("o3 == o1"); // True
    (object.ReferenceEquals(o1, o3)).Dump("R(o1) == R(o3)"); // False

    o3.NotSoInterresting = "Changed!";
    (o1 == o3).Dump("o1 == C(o3)"); // True

    o3.Interesting = "Changed!";
    (o1 == o3).Dump("o1 == C(o3)"); // False
}

[AttributeUsage(AttributeTargets.Field)]
public class IncludeAttribute : Attribute { }

public static class ObjectExtensions
{
    public static int GetHash(this object obj) => obj?.GetHashCode() ?? 1;

    public static int CalculateHashFromFields(this object obj)
    {
        var fields = obj.GetType()
            .GetFields(BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.DeclaredOnly /*or not*/)
            .Where(f => f.CustomAttributes.Any(x => x.AttributeType.Equals(typeof(IncludeAttribute))));

        var result = 1;

        unchecked
        {
            foreach(var f in fields) result *= f.GetValue(obj).GetHash();
        }

        return result;
    }
}

public partial class C
{
    [Include]
    private int id;
    public int Id { get => id; private set { id = value; UpdateHash(); } }

    [Include]
    private string interesting;
    public string Interesting { get => interesting; set { interesting = value; UpdateHash(); } }

    public string NotSoInterresting { get; set; }
}

public partial class C: IEquatable<C>
{
    public C Clone() => new C { Id = this.Id, Interesting = this.Interesting, NotSoInterresting = this.NotSoInterresting };

    private static int _id = 1; // Some persistence is required instead

    public C()
    {
        Id = _id++;
    }

    private int hash;

    private void UpdateHash() => hash = this.CalculateHashFromFields();

    public override bool Equals(object obj)
    {
        return Equals(obj as C);
    }

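    // Comparing hashes only: objects whose hashes collide are treated as equal.
    // ReferenceEquals avoids recursing into the overloaded == operator.
    public bool Equals(C other) => !ReferenceEquals(other, null) && this.hash == other.hash;

    public override int GetHashCode() => hash;

    public static bool operator ==(C obj1, C obj2) =>
        ReferenceEquals(obj1, obj2) || (!ReferenceEquals(obj1, null) && obj1.Equals(obj2));

    public static bool operator !=(C obj1, C obj2) => !(obj1 == obj2);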
}

[Update 18.06.17]

Updated version:

void Main()
{
    var o1 = new C { Interesting = "Whatever", NotSoInterresting = "Blah.." };
    var o2 = new C { Interesting = "Whatever", NotSoInterresting = "Blah-blah.." }; 

    (o1 == o2).Dump("o1 == o2"); // False
    (o2 == o1).Dump("o2 == o1"); // False

    var o3 = o1.Clone();
    (o3 == o1).Dump("o3 == o1"); // True
    (object.ReferenceEquals(o1, o3)).Dump("R(o1) == R(o3)"); // False

    o3.NotSoInterresting = "Changed!";
    (o1 == o3).Dump("o1 == C(o3)"); // True

    o3.Interesting = "Changed!";
    (o1 == o3).Dump("o1 == C(o3)"); // False

    C o4 = null;
    (null == o4).Dump("o4 == null"); // True
}

[AttributeUsage(AttributeTargets.Field)]
public class IncludeAttribute : Attribute { }

public static class ObjectExtensions
{
    public static int GetHash(this object obj) => obj?.GetHashCode() ?? 1;
}

public abstract class EquatableBase : IEquatable<EquatableBase>
{
    // Cache the [Include] fields per concrete type. A single static array would be
    // shared by every subclass and hold the fields of whichever type was seen first.
    private static readonly System.Collections.Concurrent.ConcurrentDictionary<Type, FieldInfo[]> fieldCache =
        new System.Collections.Concurrent.ConcurrentDictionary<Type, FieldInfo[]>();

    private FieldInfo[] IncludedFields() => fieldCache.GetOrAdd(GetType(), t => t
        .GetFields(BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.DeclaredOnly /*or not*/)
        .Where(f => f.CustomAttributes.Any(x => x.AttributeType.Equals(typeof(IncludeAttribute))))
        .ToArray());

    private int CalculateHashFromProperties()
    {
        var result = 1;

        unchecked
        {
            foreach (var f in IncludedFields()) result ^= f.GetValue(this).GetHash();
        }

        return result;
    }

    private bool CheckDeepEqualityTo(EquatableBase other)
    {
        if (ReferenceEquals(other, null) || other.GetType() != GetType()) return false;

        foreach (var field in IncludedFields())
        {
            // object.Equals(a, b) is null-safe, unlike a.Equals(b).
            if (!object.Equals(field.GetValue(this), field.GetValue(other))) return false;
        }
        return true;
    }

    private int hash;

    protected int UpdateHash() => hash = this.CalculateHashFromProperties();

    protected void InvalidateHash() => hash = 0;

    public override bool Equals(object obj) => Equals(obj as EquatableBase);

    public bool Equals(EquatableBase other) => object.ReferenceEquals(this, other) || this.CheckDeepEqualityTo(other);

    public override int GetHashCode() => hash == 0 ? UpdateHash() : hash;

    public static bool operator ==(EquatableBase obj1, EquatableBase obj2) => ReferenceEquals(obj1, obj2) || obj1?.CheckDeepEqualityTo(obj2) == true;

    public static bool operator !=(EquatableBase obj1, EquatableBase obj2) => !(obj1 == obj2);
}

public partial class C: EquatableBase
{
    private static int _id = 1; // Some persistence is required instead

    public C()
    {
        Id = _id++;
    }

    public C Clone() => new C { Id = this.Id, Interesting = this.Interesting, NotSoInterresting = this.NotSoInterresting };

    [Include]
    private int id;
    public int Id { get => id; private set { id = value; InvalidateHash(); } }

    [Include]
    private string interesting;
    public string Interesting { get => interesting; set { interesting = value; InvalidateHash(); } }

    public string NotSoInterresting { get; set; }
}

One still can't get rid of calling something in the setter (and there is certainly still room for optimization), but these are the improvements so far:

  • Reusable base class instead of partial
  • The fields of interest are cached per type
  • Hash is recalculated only at the first request after it was invalidated, and invalidation is cheap
  • Deep equality check based on the fields of interest, instead of just comparing the hashes
