C# How to filter a list and remove duplicates?

I have a list of type X. Each item contains several fields, and I need to return only the unique records from the list. I need to use one of the fields/properties (OIndex), which contains a timestamp, and filter on that property. The list looks like this:

> 2c55-Checked-branchDeb-20160501121315-05
> 2c60-Checked-branchDeb-20160506121315-06
> 2c55-Checked-branchDeb-20160601121315-07
> 2c55-Checked-branchDeb-20160601141315-07
> 2c60-Checked-branchDeb-20160720121315-08

In the example above the last field is the recordId, so we have a duplicate record "07". The timestamp is the fourth field. I want to get all the records except the third one, which is a duplicate. The latest version of record "07" is the fourth line.

I started writing the code but I am struggling. So far:

List<X> originalRecords = GetSomeMethod(); //this method returns our list above

var duplicateKeys = originalRecords.GroupBy(x => x.Record)  //x.Record is the record as shown above "05", "06" etc
                        .Where(g => g.Count() > 1)
                        .Select(y => y.Key);

What do I do now that I have the duplicate keys? I think I need to go through the originalRecords list again and check whether each record has one of the duplicate keys, then use Substring to extract the datetime, store it somewhere, and remove the records that are not the latest, leaving the original records filtered. Thanks

You don't need to find the duplicate keys explicitly. You can simply order each group by timestamp, newest first, and select the first record from each group:

var res = originalRecords
    .GroupBy(x => x.RecordId)
    .Select(g => g.OrderByDescending(x => x.DateTimeField).First());

There is no DateTimeField like the one in your code. I only have a string field that contains the datetime together with other data. The record does, however, have a Record Id field.

You can split your records on a dash, grab the date-time portion, and sort on it. Your date/time is in a format that lets you sort lexicographically, so you can skip parsing the date.
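As a quick check of the lexicographic claim, here is a minimal sketch that compares the two "07" timestamps from the question both as strings and as parsed dates. The format string "yyyyMMddHHmmss" is an assumed reading of the timestamp layout, not something stated in the question.

```csharp
using System;
using System.Globalization;

// The two timestamps of record "07" from the sample list in the question.
string older = "20160601121315";
string newer = "20160601141315";

// Ordinal string comparison: a negative result means `older` sorts first.
bool stringOrder = string.CompareOrdinal(older, newer) < 0;

// The same comparison after parsing. "yyyyMMddHHmmss" is an assumption
// about how the timestamp is laid out.
DateTime a = DateTime.ParseExact(older, "yyyyMMddHHmmss", CultureInfo.InvariantCulture);
DateTime b = DateTime.ParseExact(newer, "yyyyMMddHHmmss", CultureInfo.InvariantCulture);
bool dateOrder = a < b;

// Both orderings agree for fixed-width, most-significant-first timestamps.
Console.WriteLine(stringOrder == dateOrder);
```

This only holds because every timestamp has the same width and runs from year down to second. If the strings could differ in length or layout, parsing would be the safer choice.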

Assuming that the individual fields contain no dashes of their own, and that all strings are formatted in the same way, the expression x.TextString.Split('-')[3] will give you the timestamp portion of your record:

var res = originalRecords
    .GroupBy(x => x.RecordId)
    .Select(g => g.OrderByDescending(x => x.TextString.Split('-')[3]).First());
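To try the query against the sample list from the question, something like the following sketch could be used. The tuple with RecordId and TextString is only a hypothetical stand-in for the real type X, and passing StringComparer.Ordinal is an optional extra that makes the string ordering culture-independent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for List<X>: each item carries a RecordId and the
// full dash-separated string from the question.
var originalRecords = new List<(string RecordId, string TextString)>
{
    ("05", "2c55-Checked-branchDeb-20160501121315-05"),
    ("06", "2c60-Checked-branchDeb-20160506121315-06"),
    ("07", "2c55-Checked-branchDeb-20160601121315-07"),
    ("07", "2c55-Checked-branchDeb-20160601141315-07"),
    ("08", "2c60-Checked-branchDeb-20160720121315-08"),
};

// Group by record id, then keep the item with the greatest timestamp
// (the fourth dash-separated field) from each group.
var res = originalRecords
    .GroupBy(x => x.RecordId)
    .Select(g => g
        .OrderByDescending(x => x.TextString.Split('-')[3], StringComparer.Ordinal)
        .First())
    .ToList();

foreach (var r in res)
    Console.WriteLine(r.TextString);
```

With this data the result has four items: records "05", "06" and "08" unchanged, and for "07" only the 20160601141315 version.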

This should solve your problem:

List<X> originalRecords = GetSomeMethod();
Dictionary<int, X> records = new Dictionary<int, X>();

foreach (X record in originalRecords) {
    // Store the record if its id is new, or if it is newer than the one stored so far.
    // TryGetValue avoids the KeyNotFoundException that records[key] throws for a missing key.
    if (!records.TryGetValue(record.recordId, out X existing) || existing.stamp < record.stamp) {
        records[record.recordId] = record;
    }
}

Your answer is records.Values.

Hope it helps.
