
How to use IEqualityComparer<T>.Equals() in the ToLookup<T>() extension

I stumbled upon an article regarding the Birthday Paradox and its implications when overriding the GetHashCode method, and I find myself in a bind.

In tests, we found that in calls to the ToLookup() extension, only GetHashCode is used, despite our providing an implementation for Equals.

I think I understand why this happens: the internals of ToLookup, HashSet, Dictionary, etc. use the hash codes to store and/or index their elements?

Is there a way to make the equality comparison actually use the Equals method? Or should I not be concerned about collisions? I haven't done the maths myself, but according to the first article I linked, you would only need 77,163 elements in a list before reaching a 50% chance of a collision.

If I understand this correctly, an Equals() override that compares property by property, such as

Return (a.Property1 == b.Property1 && a.Property2 == b.Property2 && ...)

should have a zero chance of collision? So how can I get ToLookup() to compare for equality this way?


In case you need an example of what I mean:

C#

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{

    static void Main(string[] args)
    {
        DoStuff();
        Console.ReadKey();
    }

    public class AnEntity
    {
        public int KeyProperty1 { get; set; }
        public int KeyProperty2 { get; set; }
        public int KeyProperty3 { get; set; }
        public string OtherProperty1 { get; set; }
        public List<string> OtherProperty2 { get; set; }
    }

    public class KeyEntity
    {
        public int KeyProperty1 { get; set; }
        public int KeyProperty2 { get; set; }
        public int KeyProperty3 { get; set; }
    }

    public static void DoStuff()
    {
        var a = new AnEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3, OtherProperty1 = "foo"};
        var b = new AnEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3, OtherProperty1 = "bar"};
        var c = new AnEntity {KeyProperty1 = 999, KeyProperty2 = 999, KeyProperty3 = 999, OtherProperty1 = "yada"};

        var entityList = new List<AnEntity> { a, b, c };

        var lookup = entityList.ToLookup(n => new KeyEntity {KeyProperty1 = n.KeyProperty1, KeyProperty2 = n.KeyProperty2, KeyProperty3 = n.KeyProperty3});

        // I want these to all return true
        Debug.Assert(lookup.Count == 2);
        Debug.Assert(lookup[new KeyEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3}].First().OtherProperty1 == "foo");
        Debug.Assert(lookup[new KeyEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3}].Last().OtherProperty1 == "bar");
        Debug.Assert(lookup[new KeyEntity {KeyProperty1 = 999, KeyProperty2 = 999, KeyProperty3 = 999}].Single().OtherProperty1 == "yada");
    }

}

VB

Imports System
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.Linq

Module Program

    Public Sub Main(args As String())
        DoStuff()
        Console.ReadKey()
    End Sub

    Public Class AnEntity
        Public Property KeyProperty1 As Integer
        Public Property KeyProperty2 As Integer
        Public Property KeyProperty3 As Integer
        Public Property OtherProperty1 As String
        Public Property OtherProperty2 As List(Of String) 
    End Class

    Public Class KeyEntity
        Public Property KeyProperty1 As Integer
        Public Property KeyProperty2 As Integer
        Public Property KeyProperty3 As Integer
    End Class

    Public Sub DoStuff()
        Dim a = New AnEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3, .OtherProperty1 = "foo"}
        Dim b = New AnEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3, .OtherProperty1 = "bar"}
        Dim c = New AnEntity With {.KeyProperty1 = 999, .KeyProperty2 = 999, .KeyProperty3 = 999, .OtherProperty1 = "yada"}

        Dim entityList = New List(Of AnEntity) From {a, b, c}

        Dim lookup = entityList.ToLookup(Function(n) New KeyEntity With {.KeyProperty1 = n.KeyProperty1, .KeyProperty2 = n.KeyProperty2, .KeyProperty3 = n.KeyProperty3})

        ' I want these to all return true
        Debug.Assert(lookup.Count = 2)
        Debug.Assert(lookup(New KeyEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3}).First().OtherProperty1 = "foo")
        Debug.Assert(lookup(New KeyEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3}).Last().OtherProperty1 = "bar")
        Debug.Assert(lookup(New KeyEntity With {.KeyProperty1 = 999, .KeyProperty2 = 999, .KeyProperty3 = 999}).Single().OtherProperty1 = "yada")
    End Sub

End Module

I can get that to work with an override of GetHashCode(), no problems. But I don't want to rely on GetHashCode because if I have, for example, 109,125 elements in my list, apparently I'm already at a 75% chance of collision? If it used the aforementioned Equals() override, I think I'd be at 0%?

The article that you've linked to is completely misleading (and many of its comments highlight this).

GetHashCode is used where possible because it's fast; if there are hash collisions, then Equals is used to disambiguate between the colliding items. So long as you implement Equals and GetHashCode correctly (whether in the types themselves or in a custom IEqualityComparer<T> implementation), there won't be any problems.
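To see the fallback to Equals in action, here is a minimal sketch. CollidingComparer and CollisionDemo are hypothetical names made up for this illustration; the comparer deliberately returns the same hash code for every key, so ToLookup can only group correctly by consulting Equals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer for illustration only: every key gets the same
// hash code, so correct grouping is impossible without calling Equals.
public sealed class CollidingComparer : IEqualityComparer<int>
{
    // Counts how often the lookup had to fall back to Equals.
    public int EqualsCalls { get; private set; }

    public bool Equals(int x, int y)
    {
        EqualsCalls++;
        return x == y;
    }

    // Deliberately terrible hash: all values collide.
    public int GetHashCode(int obj) => 0;
}

public static class CollisionDemo
{
    public static (int Groups, int EqualsCalls) Run()
    {
        var comparer = new CollidingComparer();

        // Three distinct keys (1, 2, 3), all with hash code 0.
        var lookup = new[] { 1, 2, 3, 1, 2 }.ToLookup(n => n, comparer);

        return (lookup.Count, comparer.EqualsCalls);
    }
}
```

Run() should report three groups and a non-zero EqualsCalls count: collisions cost extra Equals calls, but they do not merge unequal keys.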

The problem with your example code is that you're not overriding Equals and GetHashCode at all. This means that the default implementations are used, and the default implementations use reference comparisons for reference types, not value comparisons.

This means that you're not getting hash collisions, because the objects you're comparing against are different from the original objects, even though they have the same values. This, in turn, means that Equals is never needed by your example code. Override Equals and GetHashCode correctly, or set up an IEqualityComparer<T> to do so, and everything will start working as you expect.
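As a rough sketch of the IEqualityComparer<T> route, assuming the KeyEntity class from the question: KeyEntityComparer and ComparerDemo are hypothetical names, and the 17/31 hash combination is just one common convention, not the only valid choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class KeyEntity
{
    public int KeyProperty1 { get; set; }
    public int KeyProperty2 { get; set; }
    public int KeyProperty3 { get; set; }
}

// Hypothetical comparer: two keys are equal when all three properties match.
public sealed class KeyEntityComparer : IEqualityComparer<KeyEntity>
{
    public bool Equals(KeyEntity x, KeyEntity y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.KeyProperty1 == y.KeyProperty1
            && x.KeyProperty2 == y.KeyProperty2
            && x.KeyProperty3 == y.KeyProperty3;
    }

    // Must agree with Equals: equal keys always produce equal hash codes.
    public int GetHashCode(KeyEntity obj)
    {
        if (obj is null) return 0;
        unchecked // integer overflow is fine when mixing hash values
        {
            int hash = 17;
            hash = hash * 31 + obj.KeyProperty1;
            hash = hash * 31 + obj.KeyProperty2;
            hash = hash * 31 + obj.KeyProperty3;
            return hash;
        }
    }
}

public static class ComparerDemo
{
    public static (int Groups, string[] Matches) Run()
    {
        var items = new[]
        {
            (Key: new KeyEntity { KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3 }, Other: "foo"),
            (Key: new KeyEntity { KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3 }, Other: "bar"),
            (Key: new KeyEntity { KeyProperty1 = 999, KeyProperty2 = 999, KeyProperty3 = 999 }, Other: "yada"),
        };

        // The comparer is passed to ToLookup and is also used by the indexer.
        var lookup = items.ToLookup(i => i.Key, i => i.Other, new KeyEntityComparer());

        var probe = new KeyEntity { KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3 };
        return (lookup.Count, lookup[probe].ToArray());
    }
}
```

With the comparer in place, the lookup should contain two groups, and indexing with a freshly constructed key finds "foo" and "bar" in insertion order. The same comparer instance can also be passed to Dictionary or HashSet constructors.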

The birthday paradox does not apply in this situation. The birthday paradox relates to non-deterministic random sets, whereas hash code computation is deterministic. The chance of two objects with different state sharing the same hash code is much closer to 1 in a billion or so, certainly not as low as 77 thousand, so I don't think you have anything to worry about.
