简体   繁体   English

如何通过 Lambda 或 LINQ 从列表中获取不同的实例

[英]How to get distinct instance from a list by Lambda or LINQ

I have a class like this:我有这样的课:

class MyClass<T> {
    public string value1 { get; set; }
    public T objT { get; set; }
}

and a list of this class.和这个类的列表。 I would like to use .net 3.5 lambda or linq to get a list of MyClass by distinct value1.我想使用 .net 3.5 lambda 或 linq 通过不同的 value1 获取 MyClass 列表。 I guess this is possible and much simpler than the way in .net 2.0 to cache a list like this:我想这是可能的,而且比 .net 2.0 中缓存这样的列表的方式简单得多:

List<MyClass<T>> list; 
...
List<MyClass<T>> listDistinct = new List<MyClass<T>>();
foreach (MyClass<T> instance in list)
{
    // some code to check if listDistinct does contain obj with intance.Value1
    // then listDistinct.Add(instance);
}

What is the lambda or LINQ way to do it? lambda 或 LINQ 的方法是什么?

Both Marc 's and dahlbyk 's answers seem to work very well. Marcdahlbyk的答案似乎都很好。 I have a much simpler solution though.我有一个更简单的解决方案。 Instead of using Distinct , you can use GroupBy .您可以使用GroupBy而不是使用Distinct It goes like this:它是这样的:

var listDistinct
    = list.GroupBy(
        i => i.value1,
        (key, group) => group.First()
    ).ToArray();

Notice that I've passed two functions to the GroupBy() .请注意,我已将两个函数传递给GroupBy() The first is a key selector.第一个是键选择器。 The second gets only one item from each group.第二个从每个组中只获得一个项目。 From your question, I assumed First() was the right one.根据您的问题,我认为First()是正确的。 You can write a different one, if you want to.如果你愿意,你可以写一个不同的。 You can try Last() to see what I mean.你可以试试Last()看看我的意思。

I ran a test with the following input:我使用以下输入进行了测试:

var list = new [] {
    new { value1 = "ABC", objT = 0 },
    new { value1 = "ABC", objT = 1 },
    new { value1 = "123", objT = 2 },
    new { value1 = "123", objT = 3 },
    new { value1 = "FOO", objT = 4 },
    new { value1 = "BAR", objT = 5 },
    new { value1 = "BAR", objT = 6 },
    new { value1 = "BAR", objT = 7 },
    new { value1 = "UGH", objT = 8 },
};

The result was:结果是:

//{ value1 = ABC, objT = 0 }
//{ value1 = 123, objT = 2 }
//{ value1 = FOO, objT = 4 }
//{ value1 = BAR, objT = 5 }
//{ value1 = UGH, objT = 8 }

I haven't tested it for performance.我还没有测试它的性能。 I believe that this solution is probably a little bit slower than one that uses Distinct .我相信这个解决方案可能比使用Distinct的解决方案慢一点。 Despite this disadvantage, there are two great advantages: simplicity and flexibility.尽管有这个缺点,但有两个很大的优点:简单性和灵活性。 Usually, it's better to favor simplicity over optimization, but it really depends on the problem you're trying to solve.通常,优先考虑简单性优于优化,但这实际上取决于您要解决的问题。

Hmm... I'd probably write a custom IEqualityComparer<T> so that I can use:嗯...我可能会编写一个自定义IEqualityComparer<T>以便我可以使用:

var listDistinct = list.Distinct(comparer).ToList();

and write the comparer via LINQ....并通过 LINQ 编写比较器....

Possibly a bit overkill, but reusable, at least:可能有点矫枉过正,但至少可以重复使用:

Usage first:用法先:

static class Program {
    static void Main() {
        var data = new[] {
            new { Foo = 1,Bar = "a"}, new { Foo = 2,Bar = "b"}, new {Foo = 1, Bar = "c"}
        };
        foreach (var item in data.DistinctBy(x => x.Foo))
            Console.WriteLine(item.Bar);
        }
    }
}

With utility methods:使用实用方法:

public static class ProjectionComparer
{
    public static IEnumerable<TSource> DistinctBy<TSource,TValue>(
        this IEnumerable<TSource> source,
        Func<TSource, TValue> selector)
    {
        var comparer = ProjectionComparer<TSource>.CompareBy<TValue>(
            selector, EqualityComparer<TValue>.Default);
        return new HashSet<TSource>(source, comparer);
    }
}
public static class ProjectionComparer<TSource>
{
    public static IEqualityComparer<TSource> CompareBy<TValue>(
        Func<TSource, TValue> selector)
    {
        return CompareBy<TValue>(selector, EqualityComparer<TValue>.Default);
    }
    public static IEqualityComparer<TSource> CompareBy<TValue>(
        Func<TSource, TValue> selector,
        IEqualityComparer<TValue> comparer)
    {
        return new ComparerImpl<TValue>(selector, comparer);
    }
    sealed class ComparerImpl<TValue> : IEqualityComparer<TSource>
    {
        private readonly Func<TSource, TValue> selector;
        private readonly IEqualityComparer<TValue> comparer;
        public ComparerImpl(
            Func<TSource, TValue> selector,
            IEqualityComparer<TValue> comparer)
        {
            if (selector == null) throw new ArgumentNullException("selector");
            if (comparer == null) throw new ArgumentNullException("comparer");
            this.selector = selector;
            this.comparer = comparer;
        }

        bool IEqualityComparer<TSource>.Equals(TSource x, TSource y)
        {
            if (x == null && y == null) return true;
            if (x == null || y == null) return false;
            return comparer.Equals(selector(x), selector(y));
        }

        int IEqualityComparer<TSource>.GetHashCode(TSource obj)
        {
            return obj == null ? 0 : comparer.GetHashCode(selector(obj));
        }
    }
}

You can use this extension method:您可以使用此扩展方法:

    IEnumerable<MyClass> distinctList = sourceList.DistinctBy(x => x.value1);

    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector)
    {
        var knownKeys = new HashSet<TKey>();
        return source.Where(element => knownKeys.Add(keySelector(element)));
    }

Check out Enumerable.Distinct() , which can accept an IEqualityComparer:查看Enumerable.Distinct() ,它可以接受 IEqualityComparer:

class MyClassComparer<T> : IEqualityComparer<MyClass<T>>
{
    // Products are equal if their names and product numbers are equal.
    public bool Equals(MyClass<T> x, MyClass<T>y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        // Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        // Check whether the products' properties are equal.
        return x.value1 == y.value1;
    }

    // If Equals() returns true for a pair of objects,
    // GetHashCode must return the same value for these objects.

    public int GetHashCode(MyClass<T> x)
    {
        // Check whether the object is null.
        if (Object.ReferenceEquals(x, null)) return 0;

        // Get the hash code for the Name field if it is not null.
        return (x.value1 ?? "").GetHashCode();
    }
}

Your code snippet could look like this:您的代码片段可能如下所示:

List<MyClass<T>> list; 
...
List<MyClass<T>> listDistinct = list.Distinct(new MyClassComparer<T>).ToList();

这会更简单...

var distinctList = list.GroupBy(l => l.value1, (key, c) => l.FirstOrDefault());

在 linq 中,这比分组更先进

list.GroupBy(li => li.value, (key, grp) => li.FirstOrDefault());

I took Marc's answer, fixed it to work with TSource being a value type (test for default(TSource) instead of null), cleaned up some redundant type specifications, and wrote some tests for it.我接受了 Marc 的回答,将其修复为使用 TSource 作为值类型(测试 default(TSource) 而不是 null),清理了一些冗余类型规范,并为它编写了一些测试。 Here is what I am using today.这是我今天使用的。 Thank you Marc for the great idea and implementation.感谢 Marc 的好主意和实施。

public static class LINQExtensions
{
    public static IEnumerable<TSource> DistinctBy<TSource, TValue>(
        this IEnumerable<TSource> source,
        Func<TSource, TValue> selector)
    {
        var comparer = ProjectionComparer<TSource>.CompareBy(
            selector, EqualityComparer<TValue>.Default);
        return new HashSet<TSource>(source, comparer);
    }
}
public static class ProjectionComparer<TSource>
{
    public static IEqualityComparer<TSource> CompareBy<TValue>(
        Func<TSource, TValue> selector)
    {
        return CompareBy(selector, EqualityComparer<TValue>.Default);
    }
    public static IEqualityComparer<TSource> CompareBy<TValue>(
        Func<TSource, TValue> selector,
        IEqualityComparer<TValue> comparer)
    {
        return new ComparerImpl<TValue>(selector, comparer);
    }
    sealed class ComparerImpl<TValue> : IEqualityComparer<TSource>
    {
        private readonly Func<TSource, TValue> _selector;
        private readonly IEqualityComparer<TValue> _comparer;
        public ComparerImpl(
            Func<TSource, TValue> selector,
            IEqualityComparer<TValue> comparer)
        {
            if (selector == null) throw new ArgumentNullException("selector");
            if (comparer == null) throw new ArgumentNullException("comparer");
            _selector = selector;
            _comparer = comparer;
        }

        bool IEqualityComparer<TSource>.Equals(TSource x, TSource y)
        {
            if (x.Equals(default(TSource)) && y.Equals(default(TSource)))
            {
                return true;
            }

            if (x.Equals(default(TSource)) || y.Equals(default(TSource)))
            {
                return false;
            }
            return _comparer.Equals(_selector(x), _selector(y));
        }

        int IEqualityComparer<TSource>.GetHashCode(TSource obj)
        {
            return obj.Equals(default(TSource)) ? 0 : _comparer.GetHashCode(_selector(obj));
        }
    }
}

And the test class:和测试类:

[TestClass]
public class LINQExtensionsTest
{
    [TestMethod]
    public void DistinctByTestDate()
    {
        var list = Enumerable.Range(0, 200).Select(i => new
        {
            Index = i,
            Date = DateTime.Today.AddDays(i%4)
        }).ToList();

        var distinctList = list.DistinctBy(l => l.Date).ToList();

        Assert.AreEqual(4, distinctList.Count);

        Assert.AreEqual(0, distinctList[0].Index);
        Assert.AreEqual(1, distinctList[1].Index);
        Assert.AreEqual(2, distinctList[2].Index);
        Assert.AreEqual(3, distinctList[3].Index);

        Assert.AreEqual(DateTime.Today, distinctList[0].Date);
        Assert.AreEqual(DateTime.Today.AddDays(1), distinctList[1].Date);
        Assert.AreEqual(DateTime.Today.AddDays(2), distinctList[2].Date);
        Assert.AreEqual(DateTime.Today.AddDays(3), distinctList[3].Date);

        Assert.AreEqual(200, list.Count);
    }

    [TestMethod]
    public void DistinctByTestInt()
    {
        var list = Enumerable.Range(0, 200).Select(i => new
        {
            Index = i % 4,
            Date = DateTime.Today.AddDays(i)
        }).ToList();

        var distinctList = list.DistinctBy(l => l.Index).ToList();

        Assert.AreEqual(4, distinctList.Count);

        Assert.AreEqual(0, distinctList[0].Index);
        Assert.AreEqual(1, distinctList[1].Index);
        Assert.AreEqual(2, distinctList[2].Index);
        Assert.AreEqual(3, distinctList[3].Index);

        Assert.AreEqual(DateTime.Today, distinctList[0].Date);
        Assert.AreEqual(DateTime.Today.AddDays(1), distinctList[1].Date);
        Assert.AreEqual(DateTime.Today.AddDays(2), distinctList[2].Date);
        Assert.AreEqual(DateTime.Today.AddDays(3), distinctList[3].Date);

        Assert.AreEqual(200, list.Count);
    }

    struct EqualityTester
    {
        public readonly int Index;
        public readonly DateTime Date;

        public EqualityTester(int index, DateTime date) : this()
        {
            Index = index;
            Date = date;
        }
    }

    [TestMethod]
    public void TestStruct()
    {
        var list = Enumerable.Range(0, 200)
            .Select(i => new EqualityTester(i, DateTime.Today.AddDays(i%4)))
            .ToList();

        var distinctDateList = list.DistinctBy(e => e.Date).ToList();
        var distinctIntList = list.DistinctBy(e => e.Index).ToList();

        Assert.AreEqual(4, distinctDateList.Count);
        Assert.AreEqual(200, distinctIntList.Count);
    }
}

As of .NET 6 a new DistinctBy operator has been introduced.从 .NET 6 开始,引入了新的DistinctBy运算符。 So now we can write:所以现在我们可以写:

var listDistinct = list.DistinctBy(x => x.value1).ToList();

Here's the implementation in the source code if anyone's interested.如果有人感兴趣,这是源代码中的实现

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM