C#: Remove older duplicates from a DataTable based on a condition
I have a DataTable with duplicated rows. The rows are equal except for the date, and in some rows no date was inserted at all. I need to remove the older duplicates and also the duplicates where no date was inserted. I have the following code, but it does not work because it can't convert the fields to DateTime for the comparison.
public DataTable RemoveDuplicateRows(DataTable dTable, string colName)
{
    Hashtable hTable = new Hashtable();
    ArrayList duplicateList = new ArrayList();
    //Add list of all the unique item value to hashtable, which stores combination of key, value pair.
    //And add duplicate item value in arraylist.
    foreach (DataRow drow in dTable.Rows)
    {
        foreach (DataRow drow2 in dTable.Rows)
        {
            if (hTable.Contains(drow[colName]))
            {
                if (DateTime.TryParse(drow["date"].ToString(), out DateTime res))
                {
                    if (Convert.ToDateTime(drow["date"]) < Convert.ToDateTime(drow2["date"]))
                    {
                        duplicateList.Add(drow);
                    }
                    if (Convert.ToDateTime(drow["date"]) > Convert.ToDateTime(drow2["date"]))
                    {
                        duplicateList.Add(drow);
                    }
                }
            }
            else
                hTable.Add(drow[colName], string.Empty);
        }
    }
    //Removing a list of duplicate items from datatable.
    foreach (DataRow dRow in duplicateList)
        dTable.Rows.Remove(dRow);
    //Datatable which contains unique records will be return as output.
    return dTable;
}
So you want to remove duplicates according to the date column? You can use LINQ:
List<DataRow> duplicateRows = dTable.AsEnumerable()
    .GroupBy(r => r[colName])
    .SelectMany(g => g
        .Select(r => new
        {
            Row = r,
            Date = DateTime.TryParse(r.Field<string>("date"), out DateTime date)
                ? date : new DateTime?()
        })
        .OrderByDescending(x => x.Date.HasValue)
        .ThenByDescending(x => x.Date.GetValueOrDefault())
        .Skip(1)) // only the first row of the group will be retained
    .Select(x => x.Row)
    .ToList();
duplicateRows.ForEach(dTable.Rows.Remove);
Within each group, the rows that contain a parsable date value come first, ordered by the date itself descending, so the newest row is first. That first row is kept; all other rows in the group are treated as duplicates and removed.
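The query above reads the date with r.Field<string>("date"), which fits a column of type string. If the column might instead be typed as DateTime or contain DBNull, the same ordering idea can be kept while reading the cell more defensively. The following is only a sketch under those assumptions: the class name DataTableDedup and the helpers ReadDate and RemoveOlderDuplicates are hypothetical names introduced for illustration, and string values are parsed with the current culture, as in the code above. On .NET Framework, AsEnumerable requires a reference to System.Data.DataSetExtensions.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataTableDedup
{
    // Hypothetical helper: returns the cell as a DateTime, or null when the
    // cell is DBNull, empty, or not parsable as a date.
    public static DateTime? ReadDate(DataRow row, string dateCol)
    {
        object value = row[dateCol];
        if (value is DateTime typed)
            return typed;                       // column already holds DateTime values
        if (value == null || value == DBNull.Value)
            return null;                        // no date inserted
        // Assumption: string dates are in a format the current culture can parse.
        return DateTime.TryParse(Convert.ToString(value), out DateTime parsed)
            ? parsed
            : (DateTime?)null;
    }

    // Hypothetical helper: per key, keeps the row with the newest date and
    // removes every other row of that group from the table in place.
    public static void RemoveOlderDuplicates(DataTable table, string keyCol, string dateCol)
    {
        List<DataRow> toRemove = table.AsEnumerable()
            .GroupBy(r => r[keyCol])
            .SelectMany(g => g
                .Select(r => new { Row = r, Date = ReadDate(r, dateCol) })
                .OrderByDescending(x => x.Date.HasValue)            // dated rows first
                .ThenByDescending(x => x.Date.GetValueOrDefault())  // newest first
                .Skip(1))                                           // keep only the first row
            .Select(x => x.Row)
            .ToList();  // materialize before removing, so the table is not modified while enumerating

        foreach (DataRow row in toRemove)
            table.Rows.Remove(row);
    }
}
```

A call such as DataTableDedup.RemoveOlderDuplicates(dTable, colName, "date") would then replace the query and the ForEach line. Note that a key whose only row has no date keeps that row, because a single-row group has nothing to skip.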