The order of returned items in the resulting List when using List.Distinct()

Which items are removed from the resulting list when Distinct().ToList() is applied in the example below? Is the first entry among duplicates (i.e. the one that was added to the list first) preserved in the new list being returned? If not, is there a way to make Distinct().ToList() preserve the first entry among duplicates in the new list?

    Dim values As List(Of Integer) = New List(Of Integer)
    values.Add(1)
    values.Add(5)
    values.Add(2)
    values.Add(3)
    values.Add(2)
    values.Add(3)
    values.Add(4)
    values.Add(2)
    values.Add(2)
    values.Add(3)
    values.Add(3)
    values.Add(3)

    Dim items As List(Of Integer) = values.Distinct().ToList

    ' Display result.
    For Each i As Integer In items
        Console.WriteLine(i)
    Next

Expected output:
1
5
2
3
4

This MSDN page says "The Distinct(Of TSource)(IEnumerable(Of TSource)) method returns an unordered sequence that contains no duplicate values". Is there a way to get around this?

No, you can't use Distinct to work around that. As it happens, it works exactly as you expect, but the documentation explicitly states that the order is not guaranteed. The implementation could therefore change in a future version of the framework, so you cannot rely on it. The method is trivial to write yourself. In fact, you can even copy the framework implementation.

Again, it currently works the way you want it to, but it is not guaranteed to keep doing so in the future.

On the other hand, I am pretty confident that this implementation will never change, as I cannot imagine that a more efficient implementation exists.

Here is an implementation for completeness (sorry, it's C# and not VB.NET):

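using System;
using System.Collections.Generic;

public static class MyEnumerable
{
    // Yields each element the first time it is seen, in source order.
    // Note: this is an iterator method, so the null check below only runs
    // once the caller starts enumerating the result.
    public static IEnumerable<T> Distinct<T>(this IEnumerable<T> source)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        // Tracks the values that have already been returned.
        var items = new HashSet<T>();

        foreach (T item in source)
        {
            // HashSet<T>.Add returns false when the value is already present.
            if (items.Add(item))
            {
                yield return item;
            }
        }
    }
}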
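
For reference, here is a rough sketch of how such a hand-written method could be applied to the values from the question. The names OrderedDistinctSketch, DistinctInOrder, Iterate and Demo are made up for this illustration (a distinct method name avoids any confusion with Enumerable.Distinct), and the null check is split out of the iterator so it runs immediately; that split is my own choice, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;

public static class OrderedDistinctSketch
{
    // Hypothetical name; same HashSet-based idea as the method above.
    public static IEnumerable<T> DistinctInOrder<T>(this IEnumerable<T> source)
    {
        // Checked eagerly because this wrapper is not an iterator method.
        if (source == null) throw new ArgumentNullException(nameof(source));
        return Iterate(source);
    }

    private static IEnumerable<T> Iterate<T>(IEnumerable<T> source)
    {
        var seen = new HashSet<T>();
        foreach (T item in source)
        {
            // Add returns true only for the first occurrence of a value.
            if (seen.Add(item)) yield return item;
        }
    }

    // Runs the sketch on the same values as the question.
    public static List<int> Demo()
    {
        var values = new List<int> { 1, 5, 2, 3, 2, 3, 4, 2, 2, 3, 3, 3 };
        return new List<int>(values.DistinctInOrder());
    }
}
```

Because the order here comes from your own loop rather than from Distinct, Demo() returns 1, 5, 2, 3, 4 regardless of how the framework method is implemented.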

No, you can't get around it with the standard methods provided by the framework. You can get around it by coding it yourself, as Stilgar suggested.

With the example you provided, selecting the first item by index is technically irrelevant: since Integer is a value type (a structure), you won't be able to tell whether the value that was kept was the first or the 100th occurrence in the list.
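
To make that point concrete: which duplicate survives only becomes observable once "equal" elements can still be told apart, for example strings compared case-insensitively. The sketch below is just an illustration under that assumption; FirstOccurrenceSketch and KeepFirst are hypothetical names, and the loop is hand-written so the result does not depend on Distinct at all.

```csharp
using System;
using System.Collections.Generic;

public static class FirstOccurrenceSketch
{
    // Hypothetical helper: keeps the first element of each group of
    // elements that the supplied comparer treats as equal.
    public static List<T> KeepFirst<T>(IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        var seen = new HashSet<T>(comparer);
        var result = new List<T>();
        foreach (T item in source)
        {
            // Add returns false for later elements equal to one already seen,
            // so only the earliest occurrence ends up in the result.
            if (seen.Add(item)) result.Add(item);
        }
        return result;
    }
}
```

Calling KeepFirst(new[] { "Apple", "apple", "PEAR", "Pear" }, StringComparer.OrdinalIgnoreCase) gives "Apple" and "PEAR", i.e. the first spelling of each word. With plain integers there is no such difference to observe.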

That said, I am guessing you are actually using a custom object. In that case your order comes from some sort of sorting, so I suggest you use GroupBy<> instead, then order the items of each group with your OrderBy<> statement and call First<> on the result.

GroupBy and Distinct are very close: Distinct can be replaced by a GroupBy followed by First on each group. It is indeed much slower than the real implementation, but the goal here is to explain how you can use it to customize the output if you eventually need more than simply the first item.
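
A minimal sketch of that GroupBy approach, assuming a custom object with a key and a sortable property. The Reading type, its Key and TakenAt members, and the EarliestPerKey method are all invented for this example and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDistinctSketch
{
    // Hypothetical item type used only for this illustration.
    public sealed class Reading
    {
        public int Key { get; }
        public DateTime TakenAt { get; }

        public Reading(int key, DateTime takenAt)
        {
            Key = key;
            TakenAt = takenAt;
        }
    }

    // Returns one element per key: the reading with the earliest TakenAt.
    public static List<Reading> EarliestPerKey(IEnumerable<Reading> readings)
    {
        return readings
            .GroupBy(r => r.Key)                             // one group per distinct key
            .Select(g => g.OrderBy(r => r.TakenAt).First())  // pick the representative
            .ToList();
    }
}
```

Unlike Distinct, GroupBy is documented to yield its groups in the order in which each key first appears in the source, and the elements inside a group in source order. So if all you need is the first occurrence, GroupBy(x => x).Select(g => g.First()) should give it to you with a documented ordering, at the cost of the extra grouping work mentioned above.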
