
How does IEnumerable.Except() work?

I'm trying to avoid adding entities to the database if they already exist there, so I decided that newBillInstances.Except(dbContext.BillInstances) would be the best approach. However, it doesn't work at all (no entities are excluded), although for List<string> it works perfectly.

I read this discussion and the actual description of .Except() on MSDN. As I understood it, the class used with .Except() should implement IEqualityComparer<T> for the default comparer to work.

The MSDN article doesn't fully describe how two instances are compared. I still don't get why both Equals() and GetHashCode() have to be overridden.

I implemented the IEqualityComparer<BillInstance> interface and put breakpoints in both methods, but they are not hit when calling .Except(IEnumerable). Only when I changed the call to .Except(IEnumerable, new BillInstanceComparer()) did I hit the breakpoint in GetHashCode(), and even then the one in Equals() was never hit.

Then I implemented IEqualityComparer<BillInstance> directly in the BillInstance class and expected it to be used by .Except(IEnumerable), but the breakpoints weren't hit in either method.

So I've got two questions:

  1. What has to be done to make .Except(IEnumerable) work?
  2. Why isn't Equals() used at all? Is it used only when the hash codes of two instances are the same?

Because Equals() is used only if two objects return the same GetHashCode(). If no two objects have the same hash code, Equals() is never called.

Internally, Except() uses a Set<> (you can see it here), an internal class that you should consider equivalent to HashSet<>. This class uses the hash of each object to "index" it, then uses Equals() to check whether two objects with the same hash are really the same or are different objects that merely share a hash.
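To make this visible, here is a minimal sketch with a counting comparer. CountingComparer and ExceptDemo are hypothetical names invented for this illustration, and the string length is used as the hash only to keep collisions under control. When every element has a different hash, Equals() is never reached; when all hashes collide, it is.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer that counts how often Equals() is called.
class CountingComparer : IEqualityComparer<string>
{
    private readonly bool _constantHash;
    public int EqualsCalls;

    public CountingComparer(bool constantHash)
    {
        _constantHash = constantHash;
    }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    // Either a deliberately colliding hash (always 0) or the string length.
    public int GetHashCode(string s)
    {
        return _constantHash ? 0 : s.Length;
    }
}

static class ExceptDemo
{
    public static int CountEqualsCalls(bool constantHash)
    {
        var comparer = new CountingComparer(constantHash);
        var first = new[] { "a", "bb" };        // lengths 1 and 2
        var second = new[] { "ccc", "dddd" };   // lengths 3 and 4

        // Except() is lazy, so ToList() forces the comparison to run.
        first.Except(second, comparer).ToList();
        return comparer.EqualsCalls;
    }

    static void Main()
    {
        Console.WriteLine(CountEqualsCalls(constantHash: false)); // 0: no hashes collide
        Console.WriteLine(CountEqualsCalls(constantHash: true));  // greater than 0
    }
}
```

This matches what the question observed: with a custom comparer the breakpoint in GetHashCode() was hit, but Equals() was not, presumably because no two BillInstance objects produced the same hash code.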

Link to other relevant answer: https://stackoverflow.com/a/371348/613130

Somewhere in the code a set or a map/dictionary is hidden.

These structures typically contain a number of buckets, which grows with the number of elements stored in the set. Each element is assigned to a bucket based on its hash code, and the actual equality comparison within the bucket is done using Equals.

So the hash code is used to find the correct bucket (which is why GetHashCode is needed), whereupon Equals is used to compare the element with the other elements in that bucket.

That's why you need to implement both.
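The bucket idea can be sketched as a toy set. This is an illustration only, not the framework's implementation: TinySet and TinyExcept are hypothetical names, and the bucket count is fixed at 8, whereas a real set grows and rehashes.

```csharp
using System;
using System.Collections.Generic;

// Toy hash set with a fixed number of buckets (illustration only).
class TinySet<T>
{
    private readonly List<T>[] _buckets = new List<T>[8];
    private readonly IEqualityComparer<T> _comparer;

    public TinySet(IEqualityComparer<T> comparer = null)
    {
        _comparer = comparer ?? EqualityComparer<T>.Default;
    }

    // Returns true if the item was added, false if an equal item was already stored.
    public bool Add(T item)
    {
        // Step 1: the hash code selects the bucket.
        int hash = _comparer.GetHashCode(item) & 0x7FFFFFFF;
        int index = hash % _buckets.Length;
        if (_buckets[index] == null)
        {
            _buckets[index] = new List<T>();
        }

        // Step 2: Equals() tells items apart inside that bucket.
        foreach (var existing in _buckets[index])
        {
            if (_comparer.Equals(existing, item))
            {
                return false;
            }
        }

        _buckets[index].Add(item);
        return true;
    }
}

static class TinyExcept
{
    // Except built on the toy set: load 'second', then keep whatever
    // from 'first' can still be added.
    public static List<T> Except<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        var set = new TinySet<T>();
        foreach (var item in second)
        {
            set.Add(item);
        }

        var result = new List<T>();
        foreach (var item in first)
        {
            if (set.Add(item))
            {
                result.Add(item);
            }
        }
        return result;
    }

    static void Main()
    {
        var result = Except(new[] { 1, 2, 3, 2 }, new[] { 2 });
        Console.WriteLine(string.Join(", ", result)); // 1, 3
    }
}
```

If GetHashCode() returned different values for two objects that Equals() considers equal, they would usually land in different buckets and never be compared, which is why the two methods have to agree.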

OK, from the Enumerable source (thanks to m0sa) I've understood what happens internally when Except(IEnumerable) is called:

  1. enumerable1.Except(enumerable2) calls ExceptIterator(enumerable1, enumerable2, null), where null is passed in place of an IEqualityComparer<T> instance.

  2. ExceptIterator() creates an instance of internal class Set passing null as comparer.

  3. Since the comparer is null, EqualityComparer<TElement>.Default is used.

  4. The Default property creates a comparer for TElement by calling CreateComparer(), unless one has already been created. Two points were especially interesting for me:

    • If TElement implements IEquatable<T>, a generic comparer for IEquatable<T> is created. As far as I understand, it uses IEquatable<T>.Equals() for equality and the type's own GetHashCode() override for hashing (IEquatable<T> itself does not declare GetHashCode()).

    • For the general case (not byte, not implementing IEquatable<T>, not Nullable, not an enum) an ObjectEqualityComparer instance is returned. ObjectEqualityComparer.GetHashCode() and ObjectEqualityComparer.Equals() essentially call the corresponding Object methods on TElement.
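The effect of this selection can be checked without reading the framework source. A small sketch follows; SelfComparer and Equatable are hypothetical types made up for the illustration. A class that implements IEqualityComparer<T> on itself, as tried in the question, is ignored by EqualityComparer<T>.Default, while a class that implements IEquatable<T> is picked up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type implementing IEqualityComparer<T> on itself, as in the question.
class SelfComparer : IEqualityComparer<SelfComparer>
{
    public int Id;

    public bool Equals(SelfComparer x, SelfComparer y)
    {
        return x.Id == y.Id;
    }

    public int GetHashCode(SelfComparer obj)
    {
        return obj.Id;
    }
}

// Hypothetical type implementing IEquatable<T> and overriding the Object methods.
class Equatable : IEquatable<Equatable>
{
    public int Id;

    public bool Equals(Equatable other)
    {
        return other != null && Id == other.Id;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Equatable);
    }

    public override int GetHashCode()
    {
        return Id;
    }
}

static class DefaultComparerDemo
{
    public static bool SelfComparerEqual()
    {
        // Default falls back to Object.Equals(), i.e. reference equality here.
        return EqualityComparer<SelfComparer>.Default.Equals(
            new SelfComparer { Id = 1 }, new SelfComparer { Id = 1 });
    }

    public static bool EquatableEqual()
    {
        // Default uses IEquatable<Equatable>.Equals().
        return EqualityComparer<Equatable>.Default.Equals(
            new Equatable { Id = 1 }, new Equatable { Id = 1 });
    }

    static void Main()
    {
        Console.WriteLine(SelfComparerEqual()); // False
        Console.WriteLine(EquatableEqual());    // True
    }
}
```

An IEqualityComparer<T> implementation only takes part when an instance of it is passed explicitly, for example as the second argument of Except().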

So this made it clear that in my case (each BillInstance is effectively immutable) it should be sufficient to override Object.GetHashCode() and Object.Equals().
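A minimal sketch of that conclusion is below. The real properties of BillInstance are not shown in the question, so BillId and DueDate are assumptions made for the example, as is the NewOnly helper. In-memory lists stand in for the database set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of BillInstance; the property names are assumptions.
sealed class BillInstance
{
    public BillInstance(int billId, DateTime dueDate)
    {
        BillId = billId;
        DueDate = dueDate;
    }

    public int BillId { get; }
    public DateTime DueDate { get; }

    // Two instances are equal when all identifying fields match.
    public override bool Equals(object obj)
    {
        var other = obj as BillInstance;
        return other != null && BillId == other.BillId && DueDate == other.DueDate;
    }

    // Must be built from the same fields as Equals(), so that equal
    // instances always produce the same hash code.
    public override int GetHashCode()
    {
        unchecked
        {
            return (BillId * 397) ^ DueDate.GetHashCode();
        }
    }
}

static class BillDemo
{
    // Hypothetical helper: keeps only the instances that are not in 'existing'.
    public static List<BillInstance> NewOnly(
        IEnumerable<BillInstance> incoming, IEnumerable<BillInstance> existing)
    {
        // No comparer argument: EqualityComparer<BillInstance>.Default is used,
        // which ends up calling the overrides above.
        return incoming.Except(existing).ToList();
    }

    static void Main()
    {
        var existing = new List<BillInstance>
        {
            new BillInstance(1, new DateTime(2024, 1, 1))
        };
        var incoming = new List<BillInstance>
        {
            new BillInstance(1, new DateTime(2024, 1, 1)),
            new BillInstance(2, new DateTime(2024, 2, 1))
        };

        var result = NewOnly(incoming, existing);
        Console.WriteLine(result.Count);     // 1
        Console.WriteLine(result[0].BillId); // 2
    }
}
```

The hash code is computed from get-only properties on purpose: if a field that feeds GetHashCode() changed while the object sat in a set, the object would be looked up in the wrong bucket.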


 