Using Linq to get the last N number of rows that have duplicated values in a field

Given a database table, a column name C, and a number N larger than 1, how can I get a group of rows with equal values of column C which has at least N rows? If more than one such group exists, I need to get the group which contains the newest entry (the one with the largest Id).

Is it possible to do this using LINQ to Entities?

Example:

> Id | Mycolumn
> - - - - - - -  
> 1 | name55555
> 2 | name22
> 3 | name22
> 4 | name22
> 5 | name55555
> 6 | name55555
> 7 | name1

Primary Key: ID
OrderBy: ID
Repeated column: Mycolumn

If N = 3 and C = Mycolumn, then we need to get the rows whose Mycolumn value is repeated at least 3 times.

For the example above, it should return rows 1, 5 and 6, because the last Id of name55555 is 6, while the last Id of name22 (which is also repeated 3 times) is 4.

data.Mytable
    .OrderByDescending(m => m.Id)
    .GroupBy(m => m.Mycolumn)
    .FirstOrDefault(group => group.Count() >= N)
    .Take(N)
    .Select(m => m.Id)

If the rows are identical (all columns), then frankly there's no point fetching more than one of each, because they will be indistinguishable. I don't know about LINQ, but you can do something like:

select id, name /* more cols */, count(1) from @foo
group by id, name /* more cols */ having count(1) > 1
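
As a rough illustration, the GROUP BY ... HAVING query above could be approximated in LINQ to Objects as follows. This is only a sketch: the in-memory list `foo` and its sample values are made up to stand in for `@foo`, and it has not been checked against any particular LINQ to Entities provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateDemo
{
    public static void Main()
    {
        // Hypothetical sample data standing in for @foo.
        var foo = new List<(int Id, string Name)>
        {
            (1, "a"), (1, "a"), (2, "b"), (3, "c"), (3, "c"), (3, "c")
        };

        // Group on all compared columns and keep groups with more than one row,
        // mirroring GROUP BY id, name HAVING COUNT(1) > 1.
        var duplicates = foo
            .GroupBy(r => new { r.Id, r.Name })
            .Where(g => g.Count() > 1)
            .Select(g => new { g.Key.Id, g.Key.Name, Count = g.Count() });

        foreach (var d in duplicates)
            Console.WriteLine($"{d.Id} {d.Name} {d.Count}");
        // prints:
        // 1 a 2
        // 3 c 3
    }
}
```

Grouping by an anonymous type works here because anonymous types compare by the values of their members.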

You can probably do that in LINQ using GroupBy etc. If they aren't entirely identical (for example, the IDENTITY is different, but the other columns are the same), it gets more difficult, and certainly there is no easy LINQ syntax for it. At the TSQL level, though:

select id, name /* more cols */
from (
select id, name /* more cols */,
    ROW_NUMBER() over (partition by name /* more cols */ order by id) as [_row] 
from @foo) x where x._row > 1

I have put this together in LINQPad, which should give you the wanted results:

int Border = 3;
var table = new List<table> 
{
  new table {Id = 1, Value = "Name1"},
  new table {Id = 2, Value = "Name2"},
  new table {Id = 3, Value = "Name5"},
  new table {Id = 4, Value = "Name5"},
  new table {Id = 5, Value = "Name2"},
  new table {Id = 6, Value = "Name5"},
  new table {Id = 7, Value = "Name5"},
};

var results = from p in table
              group p.Id by p.Value into g
              where g.Count() > Border
              select new {rows = g.ToList()};
// Dump() is only available in LINQPad
results.Dump();

This yields the rows 3, 4, 6, 7.

However, you only want the last occurrence, not all of them, so you have to query results again:

results.Skip(Math.Max(0, results.Count() - 1)).Take(1);
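
For the sample data in the question, one way to pick only the group that contains the largest Id might look like the sketch below. It is plain LINQ to Objects. `Row` and `NewestGroupIds` are hypothetical names introduced for illustration, and whether a given LINQ to Entities provider can translate this query shape to SQL has not been verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type mirroring the question's table (Id, Mycolumn).
public record Row(int Id, string Mycolumn);

public static class NewestGroupDemo
{
    // Returns the Ids of the group with at least n rows that contains the
    // largest Id, or an empty list when no group qualifies.
    public static List<int> NewestGroupIds(IEnumerable<Row> rows, int n) =>
        rows.GroupBy(r => r.Mycolumn)
            .Where(g => g.Count() >= n)                // at least n rows
            .OrderByDescending(g => g.Max(r => r.Id))  // newest group first
            .Take(1)                                   // no null to guard against
            .SelectMany(g => g.Select(r => r.Id))
            .OrderBy(id => id)
            .ToList();

    public static void Main()
    {
        var rows = new List<Row>
        {
            new Row(1, "name55555"), new Row(2, "name22"), new Row(3, "name22"),
            new Row(4, "name22"), new Row(5, "name55555"), new Row(6, "name55555"),
            new Row(7, "name1"),
        };

        Console.WriteLine(string.Join(", ", NewestGroupIds(rows, 3)));
        // prints: 1, 5, 6
    }
}
```

Using Take(1) followed by SelectMany instead of FirstOrDefault avoids a null reference when no group has at least n rows.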

Kind regards
