Linq group by optimization
I have a MySQL table with lots of data, which looks like this:
fk_string CHAR(6) NOT NULL
timestamp BIGINT(20) NOT NULL
Note: there is a primary key defined that includes these two fields, and there are other fields in the table that will not be used in this example.
For each fk_string, I would like to get the total entry count and the min and max timestamp. So I wrote the following Linq query:
var resQuery = db.Ticks.GroupBy(t => t.FkString)
.Select(t => new {
Key = t.Key
, Total = t.LongCount()
, First = t.Min(x => x.Timestamp)
, Last = t.Max(x => x.Timestamp)
});
The query generated by Linq is this one:
SELECT
1 AS `C1`,
`GroupBy1`.`K1` AS `fk_string`,
`GroupBy1`.`A1` AS `C2`,
`GroupBy1`.`A2` AS `C3`,
`GroupBy1`.`A3` AS `C4`
FROM (SELECT
`Project1`.`fk_string` AS `K1`,
COUNT(1) AS `A1`,
MIN(`Project1`.`C1`) AS `A2`,
MAX(`Project1`.`C1`) AS `A3`
FROM (SELECT
`Extent1`.`fk_string`,
`Extent1`.`timestamp` AS `C1`
FROM `data` AS `Extent1`) AS `Project1`
GROUP BY `Project1`.`fk_string`) AS `GroupBy1`
The problem is that, because I'm reusing the timestamp column, Linq adds another subselect instead of just selecting directly from the table. The query I was expecting is this:
SELECT
1 AS `C1`,
`GroupBy1`.`K1` AS `fk_string`,
`GroupBy1`.`A1` AS `C2`,
`GroupBy1`.`A2` AS `C3`,
`GroupBy1`.`A3` AS `C4`
FROM (SELECT
`Project1`.`fk_string` AS `K1`,
COUNT(1) AS `A1`,
MIN(`Project1`.`timestamp`) AS `A2`,
MAX(`Project1`.`timestamp`) AS `A3`
FROM `data` AS `Project1`
GROUP BY `Project1`.`fk_string`) AS `GroupBy1`
Instead, the one generated by Linq takes 3x longer. Is there any way to disable or control this behaviour?
Perhaps moving the GroupBy into the Select? And getting rid of Ticks?
Can you get LINQ to simply execute this:
SELECT fk_string AS FkString,
COUNT(*) AS Total,
MIN(timestamp) AS First,
MAX(timestamp) AS Last
FROM data
GROUP BY fk_string;
(without standing on your head to type the same stuff with extra keystrokes and function calls?)
Also, for performance:
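If the context is Entity Framework 6 (an assumption, based on the `Extent1`/`Project1` aliases in the generated SQL), one way to run a hand-written statement like this is `Database.SqlQuery<T>`. The following is only a sketch: `TickSummary` and `TickQueries` are hypothetical names, and the table name `data` is taken from the generated query in the question.

```csharp
using System.Collections.Generic;
using System.Data.Entity;   // Entity Framework 6 (assumed)
using System.Linq;

// Hypothetical result type; property names must match the column aliases in the SQL.
public class TickSummary
{
    public string FkString { get; set; }
    public long Total { get; set; }
    public long First { get; set; }
    public long Last { get; set; }
}

public static class TickQueries
{
    // Runs the hand-written GROUP BY directly, bypassing the LINQ query translation.
    public static List<TickSummary> GetSummaries(DbContext db)
    {
        // Table name `data` is assumed from the generated query shown in the question.
        const string sql = @"
            SELECT fk_string      AS FkString,
                   COUNT(*)       AS Total,
                   MIN(timestamp) AS First,
                   MAX(timestamp) AS Last
            FROM data
            GROUP BY fk_string";

        // SqlQuery<T> maps each row to T by matching column names to property names.
        return db.Database.SqlQuery<TickSummary>(sql).ToList();
    }
}
```

Calling `TickQueries.GetSummaries(db)` with the existing context would return one `TickSummary` per fk_string. The trade-off is that the SQL is no longer checked at compile time, so a renamed column only fails at runtime.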
INDEX(fk_string, timestamp)