
Remove all but 1 object in list based on grouping

I have a list of objects, each with several properties. Here is the class:

public class DataPoint
{
    private readonly string uniqueId;
    public DataPoint(string uid)
    {
        this.uniqueId = uid;
    }

    public string UniqueId
    {
        get
        {
            return this.uniqueId;
        }
    }

    public string ScannerID { get; set; }

    public DateTime ScanDate { get; set; }
}

In my code I have a large list of these: hundreds, maybe a few thousand.

Each data point object belongs to some type of scanner, and has a scan date. I want to remove any data points that were scanned on the same day except for the last one for a given machine.

I tried using LINQ as follows, but it did not work: I still have many duplicate data points.

this.allData = this.allData.GroupBy(g => g.ScannerID)
                   .Select(s => s.OrderByDescending(o => o.ScanDate))
                   .First()
                   .ToList();

I need to group the data points by scanner ID, because there could be data points scanned on the same day but on a different machine. I only need the last data point for a day if there are multiple.

Edit for clarification - By last data point I mean the last scanned data point for a given scan date for a given machine. I hope that helps. So when grouping by scanner ID, I then tried to order by scan date and then only keep the last scan date for days with multiple scans.

Here is some test data for 2 machines:

Unique ID      Scanner ID  Scan Date
A1JN221169H07  49374       2003-02-21 15:12:53.000
A1JN22116BK08  49374       2003-02-21 15:14:08.000
A1JN22116DN09  49374       2003-02-21 15:15:23.000
A1JN22116FP0A  49374       2003-02-21 15:16:37.000
A1JOA050U900J  80354       2004-10-05 10:53:24.000
A1JOA050UB30K  80354       2004-10-05 10:54:39.000
A1JOA050UD60L  80354       2004-10-05 10:55:54.000
A1JOA050UF80M  80354       2004-10-05 10:57:08.000
A1JOA0600O202  80354       2004-10-06 08:38:26.000

I want to remove any data points that were scanned on the same day except for the last one for a given machine.

So I assume you want to group by both ScanDate and ScannerID. Here is the code:

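var result = dataPoints.GroupBy(i => new { i.ScanDate.Date, i.ScannerID })
                       .OrderByDescending(g => g.Key.Date)
                       // Take the latest scan in each group, not just the first one encountered.
                       .Select(g => g.OrderByDescending(x => x.ScanDate).First())
                       .ToList();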

If I understand you correctly, this is what you want:

var result = dataPoints.GroupBy(i => new { i.ScanDate.Date, i.ScannerID })
                       .Select(i => i.OrderBy(x => x.ScanDate).Last())
                       .ToList();

This groups by the scanner ID and the day (ScanDate.Date zeroes out the time portion). Within each group it orders by ScanDate (since every item in a group falls on the same day, this orders by time of day) and takes the last item. So for each day you get one result per scanner: the data point with the latest ScanDate on that day.
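To make the grouping concrete, here is a minimal, self-contained sketch that runs the same idea against the sample rows from the question. The names DataPointFilter, LastScanPerDay and SampleData are made up for this illustration and are not part of the original code; ScannerID is kept as a string, as in the question's class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DataPoint
{
    public DataPoint(string uid) { UniqueId = uid; }

    public string UniqueId { get; private set; }
    public string ScannerID { get; set; }
    public DateTime ScanDate { get; set; }
}

// Hypothetical helper class, introduced only for this example.
public static class DataPointFilter
{
    // Keeps one data point per (scanner, calendar day): the one with the latest ScanDate.
    public static List<DataPoint> LastScanPerDay(IEnumerable<DataPoint> points)
    {
        return points
            // The anonymous type compares by value, so it works as a composite key.
            .GroupBy(p => new { p.ScannerID, Day = p.ScanDate.Date })
            // Within a group all scans share the same day, so this picks the latest time.
            .Select(g => g.OrderByDescending(p => p.ScanDate).First())
            .ToList();
    }

    // Rows copied from the test data in the question.
    public static List<DataPoint> SampleData()
    {
        return new List<DataPoint>
        {
            new DataPoint("A1JN221169H07") { ScannerID = "49374", ScanDate = new DateTime(2003, 2, 21, 15, 12, 53) },
            new DataPoint("A1JN22116BK08") { ScannerID = "49374", ScanDate = new DateTime(2003, 2, 21, 15, 14, 8) },
            new DataPoint("A1JN22116DN09") { ScannerID = "49374", ScanDate = new DateTime(2003, 2, 21, 15, 15, 23) },
            new DataPoint("A1JN22116FP0A") { ScannerID = "49374", ScanDate = new DateTime(2003, 2, 21, 15, 16, 37) },
            new DataPoint("A1JOA050U900J") { ScannerID = "80354", ScanDate = new DateTime(2004, 10, 5, 10, 53, 24) },
            new DataPoint("A1JOA050UB30K") { ScannerID = "80354", ScanDate = new DateTime(2004, 10, 5, 10, 54, 39) },
            new DataPoint("A1JOA050UD60L") { ScannerID = "80354", ScanDate = new DateTime(2004, 10, 5, 10, 55, 54) },
            new DataPoint("A1JOA050UF80M") { ScannerID = "80354", ScanDate = new DateTime(2004, 10, 5, 10, 57, 8) },
            new DataPoint("A1JOA0600O202") { ScannerID = "80354", ScanDate = new DateTime(2004, 10, 6, 8, 38, 26) },
        };
    }
}
```

Calling DataPointFilter.LastScanPerDay(DataPointFilter.SampleData()) should leave three data points: A1JN22116FP0A, A1JOA050UF80M and A1JOA0600O202. By contrast, the query in the question calls First() on the sequence of groups rather than on each group, so it returns every data point of the first scanner only (four rows for this sample), which would explain the remaining duplicates.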

Just as an aside, the class could be defined as

public class DataPoint
{
    public DataPoint(string uid)
    {
        UniqueId = uid;
    }

    public string UniqueId { get; private set; }
    public string ScannerID { get; set; }
    public DateTime ScanDate { get; set; }
}
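One caveat with the shorter form: a private setter lets other members of the class reassign UniqueId, whereas the readonly field in the question did not. Assuming C# 6 or later is available (the question does not say which version is in use), a getter-only auto-property keeps the read-only guarantee. A minimal sketch:

```csharp
using System;

public class DataPoint
{
    public DataPoint(string uid)
    {
        // A getter-only auto-property can be assigned only here or in an initializer.
        UniqueId = uid;
    }

    // Backed by a compiler-generated readonly field (requires C# 6 or later).
    public string UniqueId { get; }
    public string ScannerID { get; set; }
    public DateTime ScanDate { get; set; }
}
```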
