
Distinct() returns duplicates with a user-defined type

I'm trying to write a LINQ query that returns an array of objects, each constructed from a unique value. For integer types, Distinct() returns only one copy of each value, but when I try creating my list of objects, things fall apart. I suspect it's a problem with the equality operator for my class, but when I set a breakpoint on it, it's never hit.

Filtering out the duplicate int in a sub-expression solves the problem, and also saves me from constructing objects that will be immediately discarded, but I'm curious why this version doesn't work.

UPDATE: 11:04 PM Several folks have pointed out that MyType doesn't override GetHashCode(). I'm afraid I oversimplified the example. The original MyType does indeed implement it. I've added it below, modified only to put the hash code in a temp variable before returning it.

Running through the debugger, I see that all five invocations of GetHashCode return a different value. And since MyType only inherits from Object, this is presumably the same behavior Object would exhibit.

Would I be correct then to conclude that the hash should instead be based on the contents of Value? This was my first attempt at overriding operators, and at the time, it didn't appear that GetHashCode needed to be particularly fancy. (This is the first time one of my equality checks didn't seem to work properly.)

using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        int[] list = { 1, 3, 4, 4, 5 };
        int[] list2 =
            (from value in list
             select value).Distinct().ToArray();    // One copy of each value.
        MyType[] distinct =
            (from value in list
             select new MyType(value)).Distinct().ToArray(); // Two objects created with 4.

        Array.ForEach(distinct, value => Console.WriteLine(value));
    }
}

class MyType
{
    public int Value { get; private set; }

    public MyType(int arg)
    {
        Value = arg;
    }

    public override int GetHashCode()
    {
        int retval = base.GetHashCode();
        return retval;
    }

    public override bool Equals(object obj)
    {
        if (obj == null)
            return false;

        MyType rhs = obj as MyType;
        if ((Object)rhs == null)
            return false;

        return this == rhs;
    }

    public static bool operator ==(MyType lhs, MyType rhs)
    {
        bool result;

        if ((Object)lhs != null && (Object)rhs != null)
            result = lhs.Value == rhs.Value;
        else
            result = (Object)lhs == (Object)rhs;

        return result;
    }

    public static bool operator !=(MyType lhs, MyType rhs)
    {
        return !(lhs == rhs);
    }
}

You need to override GetHashCode() in your class so that equal objects return equal hash codes. GetHashCode must be implemented in tandem with Equals overrides. Hash-based code such as Distinct() commonly compares hash codes before calling Equals, which is why your Equals implementation is never called.
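As a minimal sketch of that advice, assuming Value is the only data that defines equality for MyType, the hash can be derived from the same field that Equals compares. The Demo helper is hypothetical and the == and != operators from the question are left out for brevity:

```csharp
using System;
using System.Linq;

// Simplified stand-in for the question's MyType (operators omitted).
class MyType
{
    public int Value { get; private set; }

    public MyType(int arg)
    {
        Value = arg;
    }

    // Equal objects must return equal hash codes, so hash the same
    // data that Equals compares instead of calling base.GetHashCode().
    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        MyType rhs = obj as MyType;
        return rhs != null && Value == rhs.Value;
    }

    public override string ToString()
    {
        return "MyType(" + Value + ")";
    }
}

// Hypothetical helper that mirrors the query in the question.
static class Demo
{
    public static MyType[] DistinctFrom(int[] list)
    {
        return list.Select(v => new MyType(v)).Distinct().ToArray();
    }
}
```

With this version, the two objects built from 4 land in the same hash bucket, so Distinct() goes on to call Equals and drops one of them.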

Your suspicion is correct: the problem is the equality check, which currently just compares object references. Your implementation does not add anything beyond that, so change it to this:

public override bool Equals(object obj)
{
    if (obj == null)
        return false;

    MyType rhs = obj as MyType;
    if ((Object)rhs == null)
        return false;

    return this.Value == rhs.Value;
}

In your equality method you are still testing for reference equality rather than semantic equality, e.g. on this line:

result = (Object)lhs == (Object)rhs;

you are just comparing two object references which, even if they hold exactly the same data, are still not the same object. Instead, your test for equality needs to compare one or more properties of your object. For instance, if your object had an ID property, and objects with the same ID should be considered semantically equivalent, then you could do this:

result = lhs.ID == rhs.ID;

Note that overriding Equals() means you should also override GetHashCode(), which is another kettle of fish, and can be quite difficult to do correctly.

You need to implement GetHashCode().

It seems that a simple Distinct operation can be implemented more elegantly as follows:

var distinct = items.GroupBy(x => x.ID).Select(x => x.First());

where ID is the property that determines if two objects are semantically equivalent. From the confusion here (including that of myself), the default implementation of Distinct() seems to be a little convoluted.
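A self-contained sketch of the GroupBy approach follows. The Item class, its Name property, and the DistinctById helper are hypothetical names used only for illustration; ID is assumed to be the key that defines semantic equality:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; it does not override Equals or GetHashCode.
class Item
{
    public int ID { get; set; }
    public string Name { get; set; }
}

static class GroupByDemo
{
    // Groups by the int key, so only int equality and hashing are used,
    // then keeps the first item encountered in each group.
    public static List<Item> DistinctById(IEnumerable<Item> items)
    {
        return items.GroupBy(x => x.ID)
                    .Select(g => g.First())
                    .ToList();
    }
}
```

The trade-off is that the notion of equality lives in the query rather than in the type, so every call site has to repeat the key selector.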

I think MyType needs to implement IEquatable for this to work.
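For reference, an IEquatable<MyType> version might look like the sketch below, again assuming Value alone defines equality. Implementing the interface does not remove the need for a consistent GetHashCode(), because Distinct() still hashes each element before comparing:

```csharp
using System;
using System.Linq;

class MyType : IEquatable<MyType>
{
    public int Value { get; private set; }

    public MyType(int arg)
    {
        Value = arg;
    }

    // Strongly typed comparison; the default equality comparer
    // prefers this method when the type implements IEquatable<T>.
    public bool Equals(MyType other)
    {
        return !ReferenceEquals(other, null) && Value == other.Value;
    }

    // Keep the object-based overload consistent with the typed one.
    public override bool Equals(object obj)
    {
        return Equals(obj as MyType);
    }

    // Still required: equal values must produce equal hash codes.
    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }
}
```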

The other answers have pretty much covered the fact that you need to implement Equals and GetHashCode correctly, but as a side note you may be interested to know that anonymous types have these methods implemented automatically:

var distinct =
        (from value in list
         select new {Value = value}).Distinct().ToArray();

So without ever having to define this class, you automatically get the Equals and GetHashCode behavior you're looking for. Cool, eh?

