Mark non-unique rows in a DataTable

I have a DataTable and want to check whether the combination of values in three of its columns is unique for each row. If it is not, the last column should be filled with the line number of the first occurrence of that combination.

For example, this table:

ID    Name    LastName    Age    Flag
-------------------------------------
1     Bart    Simpson     10      -
2     Lisa    Simpson      8      -
3     Bart    Simpson     10      -
4     Ned     Flanders    40      -
5     Bart    Simpson     10      -

should produce this result:

Line  Name    LastName    Age    Flag
-------------------------------------
1     Bart    Simpson     10      -
2     Lisa    Simpson      8      -
3     Bart    Simpson     10      1
4     Ned     Flanders    40      -
5     Bart    Simpson     10      1

I solved this by iterating over the DataTable with two nested for loops and comparing the values. While this works fine for a small amount of data, it gets pretty slow when the DataTable contains a lot of rows.

My question is: what is the best/fastest solution for this problem, given that the amount of data can vary between, say, 100 and 20,000 rows?
Is there a way to do this using LINQ? (I'm not too familiar with it, but I want to learn!)

I can't comment on how you might do this in C#/VB with a DataTable, but if you could move it all to SQL, your query would look like this:

-- Demo table holding the sample data from the question
declare @t table (ID int, Name varchar(10), LastName varchar(10), Age int)
insert into @t values (1, 'Bart', 'Simpson',  10)
insert into @t values (2, 'Lisa', 'Simpson',   8)
insert into @t values (3, 'Bart', 'Simpson',  10)
insert into @t values (4, 'Ned',  'Flanders', 40)
insert into @t values (5, 'Bart', 'Simpson',  10)

-- Flag is the smallest ID of an earlier row with the same
-- Name/LastName/Age combination, or NULL if there is none
select t.*,
    (select min(t2.ID)
     from @t t2
     where t2.Name = t.Name
       and t2.LastName = t.LastName
       and t2.Age = t.Age
       and t2.ID < t.ID) as Flag
from @t t

Here I've defined a table variable for demo purposes. I suppose you might be able to translate this into LINQ.
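As a rough idea of what such a translation could look like, here is a minimal VB.NET sketch, not a tested drop-in. It assumes a DataTable with the typed columns ID, Name, LastName and Age, no DBNull values, and a project reference that provides AsEnumerable (System.Data.DataSetExtensions on .NET Framework). The module name and the FirstName, Members and FirstId aliases are made up for illustration. Instead of a correlated subquery it groups the rows once and compares each row's ID with the smallest ID in its group, using 0 where the SQL would return NULL.

```vbnet
Imports System
Imports System.Data
Imports System.Linq

Module LinqFirstOccurrenceSketch
    Sub Main()
        ' Demo table with the sample data (column names and types are assumptions).
        Dim dt As New DataTable()
        dt.Columns.Add("ID", GetType(Integer))
        dt.Columns.Add("Name", GetType(String))
        dt.Columns.Add("LastName", GetType(String))
        dt.Columns.Add("Age", GetType(Integer))
        dt.Rows.Add(1, "Bart", "Simpson", 10)
        dt.Rows.Add(2, "Lisa", "Simpson", 8)
        dt.Rows.Add(3, "Bart", "Simpson", 10)
        dt.Rows.Add(4, "Ned", "Flanders", 40)
        dt.Rows.Add(5, "Bart", "Simpson", 10)

        ' Group by the three columns, remember the smallest ID per group,
        ' then flatten the groups again and flag every later member.
        Dim result = From row In dt.AsEnumerable()
                     Group row By FirstName = row.Field(Of String)("Name"),
                                  LastName = row.Field(Of String)("LastName"),
                                  Age = row.Field(Of Integer)("Age")
                     Into Members = Group, FirstId = Min(row.Field(Of Integer)("ID"))
                     From member In Members
                     Let rowId = member.Field(Of Integer)("ID")
                     Order By rowId
                     Select Id = rowId, FirstName, LastName, Age,
                            Flag = If(FirstId < rowId, FirstId, 0)

        For Each r In result
            Console.WriteLine("{0} {1} {2} {3} {4}", r.Id, r.FirstName, r.LastName, r.Age, r.Flag)
        Next
    End Sub
End Module
```

With the sample data, the rows with ID 3 and 5 get Flag 1 and all other rows get 0. Because the grouping is hash-based, each row is touched only a constant number of times, apart from the final sort by ID.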

Okay, I think I found an answer myself. Based on the suggestion in James Wiseman's answer, I tried something with LINQ.

' Returns the line of the first occurrence, or 0 if the current row is that first occurrence.
Dim myErrnrFnct = Function(current, first) If(first <> current, first, 0)

' For every row, find the first row with the same NAME/LASTNAME/AGE combination
' and pass its LINE value to myErrnrFnct.
Dim myQuery = From row As DataRow In myDt.AsEnumerable _
              Select New With { _
                .LINE = row.Item("LINE"), _
                .NAME = row.Item("NAME"), _
                .LASTNAME = row.Item("LASTNAME"), _
                .AGE = row.Item("AGE"), _
                .FLAG = myErrnrFnct(row.Item("LINE"), _
                    myDt.AsEnumerable.First(Function(rowToCheck) _
                        rowToCheck.Item("NAME") = row.Item("NAME") AndAlso _
                        rowToCheck.Item("LASTNAME") = row.Item("LASTNAME") AndAlso _
                        rowToCheck.Item("AGE") = row.Item("AGE")).Item("LINE")) _
              }

With this query I get exactly the result described in the question. The myErrnrFnct function is necessary because I want the Flag column to have the value 0 if the row is the first (or only) one with its combination of values.

To get a DataTable out of myQuery again, I had to add the extension methods described here:
How to: Implement CopyToDataTable Where the Generic Type T Is Not a DataRow
And then this line does the job:

Dim myNewDt As DataTable = myQuery.CopyToDataTable()

This seems to work just fine. Any suggestions for doing this better?
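One possible improvement, offered as a sketch rather than a definitive answer: the First(...) call in the query above rescans the table for every row, so the work can still grow quadratically when most rows are unique. A single pass with a Dictionary that remembers the first LINE for each combination avoids this. It also writes FLAG in place, so no CopyToDataTable extension is needed. The sketch assumes typed columns (LINE, AGE and FLAG as Integer, NAME and LASTNAME as String) and no DBNull values. MarkDuplicates and the module name are hypothetical names chosen for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data

Module FlagDuplicatesSketch
    ' Writes into FLAG the LINE of the first row with the same
    ' NAME/LASTNAME/AGE combination, or 0 for the first (or only) such row.
    Sub MarkDuplicates(dt As DataTable)
        ' Maps each combination to the LINE value of its first occurrence.
        Dim firstSeen As New Dictionary(Of Tuple(Of String, String, Integer), Integer)

        For Each row As DataRow In dt.Rows
            Dim key = Tuple.Create(CStr(row("NAME")), CStr(row("LASTNAME")), CInt(row("AGE")))
            Dim firstLine As Integer
            If firstSeen.TryGetValue(key, firstLine) Then
                ' Combination was seen before: point at its first occurrence.
                row("FLAG") = firstLine
            Else
                ' First occurrence: remember it and flag the row with 0.
                firstSeen.Add(key, CInt(row("LINE")))
                row("FLAG") = 0
            End If
        Next
    End Sub

    Sub Main()
        ' Demo table with the sample data from the question.
        Dim dt As New DataTable()
        dt.Columns.Add("LINE", GetType(Integer))
        dt.Columns.Add("NAME", GetType(String))
        dt.Columns.Add("LASTNAME", GetType(String))
        dt.Columns.Add("AGE", GetType(Integer))
        dt.Columns.Add("FLAG", GetType(Integer))
        dt.Rows.Add(1, "Bart", "Simpson", 10, 0)
        dt.Rows.Add(2, "Lisa", "Simpson", 8, 0)
        dt.Rows.Add(3, "Bart", "Simpson", 10, 0)
        dt.Rows.Add(4, "Ned", "Flanders", 40, 0)
        dt.Rows.Add(5, "Bart", "Simpson", 10, 0)

        MarkDuplicates(dt)

        For Each row As DataRow In dt.Rows
            Console.WriteLine("{0} {1} {2} {3} {4}",
                              row("LINE"), row("NAME"), row("LASTNAME"), row("AGE"), row("FLAG"))
        Next
    End Sub
End Module
```

With the sample table, rows 3 and 5 end up with FLAG = 1 and the remaining rows with 0. Each row costs one dictionary lookup, so 20,000 rows mean roughly 20,000 lookups rather than up to 20,000 scans of the table.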
