
How to remove (via LINQ) duplicates from a List of objects

My main object has a property which is a List of tags:

    [SharedCosmosCollection("shared")]
    public class GlobalPageTemplate : ISharedCosmosEntity
    {
        /// <summary>
        /// Id
        /// </summary>
        [JsonProperty("Id")]
        public string Id { get; set; }

        /// <summary>
        /// Cosmos Entity name
        /// </summary>
        [CosmosPartitionKey]
        public string CosmosEntityName { get; set; }

        /// <summary>
        /// Page name
        /// </summary>
        public string ExtractedPageName { get; set; }

        /// <summary>
        /// Site collection Template Name
        /// </summary>
        public string ExtractedSitecollectionTemplateName { get; set; }

        /// <summary>
        /// GlobalDesignTenantId
        /// </summary>
        public string ExtractedGlobalDesignTenantId { get; set; }

        /// <summary>
        /// Global design tenant site collection url
        /// </summary>
        public string ExtractedGlobalDesigntenantSiteCollectionUrl { get; set; }


        /// <summary>
        /// Page template picture Url
        /// </summary>
        public string PageTemplatePictureUrl { get; set; }

        /// <summary>
        /// Base64 image of the page template
        /// </summary>
        public string Base64Image { get; set; }

        /// <summary>
        /// Name of the template
        /// </summary>
        public string PageTemplateName { get; set; }


        /// <summary>
        /// Page sections
        /// </summary>
        public List<Section> Sections { get; set; }

        /// <summary>
        /// Tags
        /// </summary>
        public List<Tag> Tags { get; set; }
    }

The Tag object is here:

    public class Tag : ISharedCosmosEntity
    {
        /// <summary>
        /// Id
        /// </summary>
        [JsonProperty("Id")]
        public string Id { get; set; }
        /// <summary>
        /// Tag name
        /// </summary>
        public string TagName { get; set; }
        /// <summary>
        /// cosmos entity name
        /// </summary>
        [CosmosPartitionKey]
        public string CosmosEntityName { get; set; }
    }

In my WebAPI, I might receive duplicate tags from the frontend. How do I remove them and leave a clean list of tags before saving?

Can I suggest changing the data structure that stores your tags to a HashSet? If so, you can then do something like this.

A HashSet is an unordered collection of unique elements. It is generally used when we want to prevent duplicate elements from being placed in a collection. For membership checks and duplicate detection, a HashSet generally performs much better than a List, because lookups are hash-based rather than linear scans.

Essentially, you supply a custom IEqualityComparer to your HashSet on initialization.

public class TagComparer : IEqualityComparer<Tag>
{
    public bool Equals(Tag x, Tag y)
    {
        return x.Id.Equals(y.Id, StringComparison.InvariantCultureIgnoreCase);
    }

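    public int GetHashCode(Tag obj)
    {
        // Must follow the same case-insensitive rule as Equals; otherwise
        // "abc" and "ABC" compare equal but land in different hash buckets.
        return StringComparer.InvariantCultureIgnoreCase.GetHashCode(obj.Id);
    }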
}

And then you can do:

HashSet<Tag> Tags = new HashSet<Tag>(new TagComparer());
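To see the effect of the comparer, here is a minimal self-contained sketch. The Tag class is a trimmed stand-in for the one in the question (attributes and interface omitted), and TagIdComparer and HashSetDemo are hypothetical names used only for this illustration. It assumes that tags with the same Id, ignoring case, should count as duplicates.

```csharp
using System;
using System.Collections.Generic;

// Trimmed stand-in for the question's Tag class (assumption: only these fields matter here).
public class Tag
{
    public string Id { get; set; }
    public string TagName { get; set; }
}

// Equality and hash code both use Id with the same case-insensitive rule.
public class TagIdComparer : IEqualityComparer<Tag>
{
    public bool Equals(Tag x, Tag y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.Id, y.Id, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(Tag obj) =>
        obj?.Id == null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Id);
}

public static class HashSetDemo
{
    public static void Main()
    {
        var tags = new HashSet<Tag>(new TagIdComparer());

        // Add returns false when an equal element is already in the set.
        Console.WriteLine(tags.Add(new Tag { Id = "a1", TagName = "news" }));   // True
        Console.WriteLine(tags.Add(new Tag { Id = "A1", TagName = "news" }));   // False
        Console.WriteLine(tags.Add(new Tag { Id = "b2", TagName = "sports" })); // True
        Console.WriteLine(tags.Count);                                          // 2
    }
}
```

Because the set silently rejects the second "a1" tag, no separate clean-up step is needed before saving.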

In general, I always try to use data structures that make sense for the problem at hand. If you know you'll always want this collection to have unique elements, then I suggest you use a HashSet.

If you can't use a HashSet and you want to stick with a list, you can use LINQ's Distinct method on your Tags list and pass in the TagComparer object from above.

List<Tag> DistinctTagList = Tags.Distinct(new TagComparer()).ToList();

What you are looking for is probably the Distinct method: https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.distinct?view=netframework-4.8

For that you would also need to write an IEqualityComparer, which can simply compare by a property: https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.iequalitycomparer-1?view=netframework-4.8

Then you could call it on your Enumerable:

var distinctTags = Tags.Distinct(new TagEqualityComparer());

And the equality comparer:

class TagEqualityComparer : IEqualityComparer<Tag>
{
    public bool Equals(Tag t1, Tag t2)
    {
        if (t2 == null && t1 == null)
           return true;
        else if (t1 == null || t2 == null)
           return false;
        else if(t1.Id == t2.Id)
            return true;
        else
            return false;
    }

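    public int GetHashCode(Tag t)
    {
        // Any custom hashing function works here, as long as tags that are
        // equal by Id produce the same hash code. Hashing Id is the simplest choice.
        return t?.Id?.GetHashCode() ?? 0;
    }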
}
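Putting the comparer and the Distinct call together, a clean-up step before saving might look like the sketch below. The Tag class is a trimmed stand-in for the one in the question, and TagCleaner and RemoveDuplicates are hypothetical names introduced for illustration. Skipping null entries is an assumption about what the API should tolerate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed stand-in for the question's Tag class (attributes and interface omitted).
public class Tag
{
    public string Id { get; set; }
    public string TagName { get; set; }
}

public class TagEqualityComparer : IEqualityComparer<Tag>
{
    public bool Equals(Tag t1, Tag t2)
    {
        if (ReferenceEquals(t1, t2)) return true;
        if (t1 is null || t2 is null) return false;
        return t1.Id == t2.Id;
    }

    // Hash the same property that Equals compares.
    public int GetHashCode(Tag t) => t?.Id?.GetHashCode() ?? 0;
}

public static class TagCleaner
{
    // Hypothetical helper: returns a new list with null entries and duplicate Ids removed.
    public static List<Tag> RemoveDuplicates(IEnumerable<Tag> tags) =>
        (tags ?? Enumerable.Empty<Tag>())
            .Where(t => t != null)
            .Distinct(new TagEqualityComparer())
            .ToList();

    public static void Main()
    {
        var incoming = new List<Tag>
        {
            new Tag { Id = "1", TagName = "news" },
            new Tag { Id = "1", TagName = "news" },
            new Tag { Id = "2", TagName = "sports" },
        };

        var clean = RemoveDuplicates(incoming);
        Console.WriteLine(clean.Count); // 2
    }
}
```

In the question's scenario, the result could be assigned back to the template's Tags property before the entity is saved.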

Using only LINQ, you can do this:

If tags have unique ids:

tags.GroupBy(x => x.Id).Select(x => x.First()).ToList();

If you need to compare all properties:

tags.GroupBy(x => new {x.Id, x.TagName, x.CosmosEntityName}).Select(x => x.First()).ToList();
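The difference between the two GroupBy keys shows up when two tags share an Id but differ in another property. The following sketch illustrates this with made-up sample data. The Tag class is a trimmed stand-in for the one in the question, and GroupByDemo, DistinctById, and DistinctByAllFields are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed stand-in for the question's Tag class (attributes and interface omitted).
public class Tag
{
    public string Id { get; set; }
    public string TagName { get; set; }
    public string CosmosEntityName { get; set; }
}

public static class GroupByDemo
{
    // Keeps the first tag seen for each Id.
    public static List<Tag> DistinctById(IEnumerable<Tag> tags) =>
        tags.GroupBy(x => x.Id).Select(g => g.First()).ToList();

    // Keeps the first tag seen for each combination of all three properties.
    // Anonymous types compare by value, so they work as composite keys.
    public static List<Tag> DistinctByAllFields(IEnumerable<Tag> tags) =>
        tags.GroupBy(x => new { x.Id, x.TagName, x.CosmosEntityName })
            .Select(g => g.First())
            .ToList();

    public static void Main()
    {
        var tags = new List<Tag>
        {
            new Tag { Id = "1", TagName = "news",     CosmosEntityName = "tag" },
            new Tag { Id = "1", TagName = "news",     CosmosEntityName = "tag" },
            new Tag { Id = "1", TagName = "breaking", CosmosEntityName = "tag" },
            new Tag { Id = "2", TagName = "sports",   CosmosEntityName = "tag" },
        };

        Console.WriteLine(DistinctById(tags).Count);        // 2
        Console.WriteLine(DistinctByAllFields(tags).Count); // 3
    }
}
```

Grouping by Id alone treats the "breaking" tag as a duplicate of "news", while the composite key keeps it as a separate entry.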

Not exactly an answer to your question (the other answers are all valid solutions for that), but if for some reason you want to actually extract your duplicate objects, such as for debugging or error processing, I wanted to offer the below.

var duplicates = someList
  .GroupBy(r => r.Id)
  .Where(g => g.Count() > 1)
  .ToList();

Then you have a slightly different way to manage your list compared with a plain Distinct:

// duplicates holds groups, so flatten them back to items before calling Except
someList = someList.Except(duplicates.SelectMany(g => g)).ToList();

This leaves a list of only those items whose Id had no duplicates.
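As a worked example of this approach, the sketch below reports which Ids are duplicated and then keeps only the items that were never duplicated. The Tag class is a trimmed stand-in for the one in the question, and DuplicateDemo, FindDuplicates, and WithoutDuplicated are hypothetical names. Note that GroupBy yields IGrouping objects, so this sketch flattens them with SelectMany before passing them to Except, which here relies on default reference equality.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed stand-in for the question's Tag class (attributes and interface omitted).
public class Tag
{
    public string Id { get; set; }
    public string TagName { get; set; }
}

public static class DuplicateDemo
{
    // Groups whose Id occurs more than once.
    public static List<IGrouping<string, Tag>> FindDuplicates(IEnumerable<Tag> items) =>
        items.GroupBy(r => r.Id).Where(g => g.Count() > 1).ToList();

    // Items whose Id occurs exactly once (every member of a duplicate group is dropped).
    public static List<Tag> WithoutDuplicated(List<Tag> items) =>
        items.Except(FindDuplicates(items).SelectMany(g => g)).ToList();

    public static void Main()
    {
        var someList = new List<Tag>
        {
            new Tag { Id = "1", TagName = "news" },
            new Tag { Id = "1", TagName = "news" },
            new Tag { Id = "2", TagName = "sports" },
            new Tag { Id = "3", TagName = "tech" },
            new Tag { Id = "3", TagName = "tech" },
        };

        foreach (var group in FindDuplicates(someList))
        {
            // Prints "1: 2 occurrences" and then "3: 2 occurrences".
            Console.WriteLine($"{group.Key}: {group.Count()} occurrences");
        }

        Console.WriteLine(WithoutDuplicated(someList).Count); // 1
    }
}
```

Unlike Distinct, this drops every copy of a duplicated tag rather than keeping one, so it suits diagnostics more than clean-up before saving.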
