Linq Distinct not bringing back the correct results

I'm trying to select distinct values from a DataTable using Linq. The DataTable gets populated from an Excel sheet whose columns are dynamic, except that every sheet has a mandatory column named SERIAL NUMBER.

For demo purposes I have a DataTable which consists of 4 serial numbers:

  • 12345
  • 12345
  • 98765
  • 98765

When I do

var distinctList = dt.AsEnumerable().Select(a => a).Distinct().ToList();

I get all four records instead of 2. If I do

var distinctList = dt.AsEnumerable().Select(a => a.Field<string>("SERIAL NUMBER")).Distinct().ToList();

then I get the correct results, but the list only contains that one column from dt and not all the other columns.

Can someone tell me where I'm going wrong, please?

The problem is that the Distinct method uses the default equality comparer, which for DataRow compares by reference. To get the desired result, use the Distinct overload that accepts an IEqualityComparer<T>, and pass DataRowComparer.Default:

The DataRowComparer<TRow> class is used to compare the values of the DataRow objects and does not compare the object references.

var distinctList = dt.AsEnumerable().Distinct(DataRowComparer.Default).ToList();

For more info, see Comparing DataRows (LINQ to DataSet).
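To make the difference between the two comparers concrete, here is a minimal, self-contained sketch. The MODEL column, the sample values, and the helper names are assumptions invented for the demo; only the SERIAL NUMBER column comes from the question.

```csharp
using System;
using System.Data;
using System.Linq;

public static class DistinctRowsDemo
{
    // Builds a small demo table; the MODEL column is hypothetical.
    public static DataTable BuildTable()
    {
        var dt = new DataTable();
        dt.Columns.Add("SERIAL NUMBER", typeof(string));
        dt.Columns.Add("MODEL", typeof(string));
        dt.Rows.Add("12345", "A");
        dt.Rows.Add("12345", "A");
        dt.Rows.Add("98765", "B");
        dt.Rows.Add("98765", "B");
        return dt;
    }

    // Default comparer: every DataRow instance is its own "distinct" value.
    public static int CountByReference(DataTable dt) =>
        dt.AsEnumerable().Distinct().Count();

    // DataRowComparer.Default: rows are equal when all column values match.
    public static int CountByValue(DataTable dt) =>
        dt.AsEnumerable().Distinct(DataRowComparer.Default).Count();

    public static void Main()
    {
        DataTable dt = BuildTable();
        Console.WriteLine(CountByReference(dt)); // 4
        Console.WriteLine(CountByValue(dt));     // 2
    }
}
```

Note that DataRowComparer.Default compares every column, so two rows that share a serial number but differ in any other column would both be kept.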

In the ToTable method, the first parameter specifies whether you want distinct records, and the remaining parameters name the columns to include in the returned table; distinctness is evaluated over those columns.

DataTable returnVals = dt.DefaultView.ToTable(true, "ColumnNameOnWhichYouWantDistinctRecords");

There is no need to use Linq for this task!
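As a rough illustration of what ToTable returns, the sketch below runs it on a small table. The MODEL column, the sample values, and the DistinctSerials helper are assumptions made up for the demo.

```csharp
using System;
using System.Data;

public static class ToTableDemo
{
    // Returns a new table holding only the distinct SERIAL NUMBER values.
    public static DataTable DistinctSerials(DataTable dt) =>
        dt.DefaultView.ToTable(true, "SERIAL NUMBER");

    public static void Main()
    {
        var dt = new DataTable();
        dt.Columns.Add("SERIAL NUMBER", typeof(string));
        dt.Columns.Add("MODEL", typeof(string)); // hypothetical extra column
        dt.Rows.Add("12345", "A");
        dt.Rows.Add("12345", "B");
        dt.Rows.Add("98765", "C");

        DataTable serials = DistinctSerials(dt);
        Console.WriteLine(serials.Rows.Count);    // 2
        Console.WriteLine(serials.Columns.Count); // 1
    }
}
```

The result contains only the columns that were named in the call, so this approach fits when the other columns are not needed.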

So, you want to group them by Serial Number and retrieve the full DataRow? Assuming that after grouping them we want to retrieve the first item:

var distinctList = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER"))
                       .Select(g => g.First()).ToList();

EDIT: As requested

var distinctValues = dt.AsEnumerable().Select(a => a.Field<string>("SERIAL NUMBER")).Distinct().ToList();
var duplicateValues = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER")).SelectMany(a => a.Skip(1)).Distinct().ToList();
var duplicatesRemoved = dt.AsEnumerable().Except(duplicateValues);
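The grouping approach above can be sketched end to end as follows. This is only an illustration: the MODEL column, the sample values, and the helper names FirstPerSerial and LaterDuplicates are assumptions, not part of the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class GroupBySerialDemo
{
    // Keeps the first full DataRow seen for each serial number.
    public static List<DataRow> FirstPerSerial(DataTable dt) =>
        dt.AsEnumerable()
          .GroupBy(r => r.Field<string>("SERIAL NUMBER"))
          .Select(g => g.First())
          .ToList();

    // Returns every row after the first one within each serial number group.
    public static List<DataRow> LaterDuplicates(DataTable dt) =>
        dt.AsEnumerable()
          .GroupBy(r => r.Field<string>("SERIAL NUMBER"))
          .SelectMany(g => g.Skip(1))
          .ToList();

    public static void Main()
    {
        var dt = new DataTable();
        dt.Columns.Add("SERIAL NUMBER", typeof(string));
        dt.Columns.Add("MODEL", typeof(string)); // hypothetical extra column
        dt.Rows.Add("12345", "A");
        dt.Rows.Add("12345", "B");
        dt.Rows.Add("98765", "C");
        dt.Rows.Add("98765", "D");

        // Prints "12345 A" then "98765 C".
        foreach (DataRow row in FirstPerSerial(dt))
            Console.WriteLine($"{row["SERIAL NUMBER"]} {row["MODEL"]}");

        Console.WriteLine(LaterDuplicates(dt).Count); // 2
    }
}
```

Unlike DataRowComparer.Default, this keeps one row per serial number even when the other columns differ, at the cost of silently picking whichever row comes first.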

By the sound of it, a Linq GroupBy would be better suited.

var groups = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER")).Select(g => new {Key = g.Key, Items = g});

This will then contain groupings based on the Serial Number: the items in each group share the same serial number but may differ in their other column values.

Try this:

List<string> distinctValues = (from row in dt.AsEnumerable() select row.Field<string>("SERIAL NUMBER")).Distinct().ToList();

However, for me this also works:

List<string> distinctValues = dt.AsEnumerable().Select(row => row.Field<string>("SERIAL NUMBER")).Distinct().ToList();
