How to optimize SELECT some_field, max(primary_key) FROM table GROUP BY some_field
I have an SQL query in SQL Azure:
SELECT some_field, max(primary_key) FROM table GROUP BY some_field
The table currently has over 6 million rows. An index on (some_field asc, primary_key desc) has been created. The primary_key field is incremental. There are about 700 distinct values of some_field. This select takes at least 30 seconds.
There are only inserts into this table, no updates or deletes.
I can create a separate table to store some_field and the maximum value of the primary key, and write a trigger to maintain it, but I am looking for a more elegant solution. Is there one?
I don't know if this will be performant, but you can give it a shot:
;WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY some_field ORDER BY primary_key DESC) AS rn
FROM table
)
SELECT *
FROM cte
WHERE rn = 1
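One thing that may be worth trying with this approach: `SELECT *` pulls every column, so the index on (some_field, primary_key) alone cannot answer the query. A minimal sketch that projects only the two indexed columns is below. `dbo.BigTable` is a hypothetical name standing in for the question's `table`, and whether this is actually faster is an assumption to check against the execution plan.

```sql
-- dbo.BigTable is a placeholder for the real table name.
WITH ranked AS (
    SELECT some_field,
           primary_key,
           -- Number the rows within each some_field group, highest primary_key first.
           ROW_NUMBER() OVER (PARTITION BY some_field
                              ORDER BY primary_key DESC) AS rn
    FROM dbo.BigTable
)
SELECT some_field,
       primary_key AS max_primary_key
FROM ranked
WHERE rn = 1;  -- keep only the top row of each group
```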
Definitely go with the secondary table: give it "somefield" and "highestPK" columns and index it on the "somefield" column. Build it once up front as a baseline and query that instead.
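A rough sketch of that baseline step could look like the following. The column types and the name `dbo.BigTable` (standing in for the 6 million row table) are assumptions, so adjust them to the real schema. It also assumes some_field is never NULL, since it is used as the primary key here.

```sql
-- Assumed types: match them to some_field and primary_key in the real table.
CREATE TABLE dbo.SecondaryTable (
    somefield NVARCHAR(100) NOT NULL PRIMARY KEY,  -- the primary key provides the index on somefield
    highestPK BIGINT        NOT NULL
);

-- One-time baseline: the same GROUP BY as the slow query, run once.
INSERT INTO dbo.SecondaryTable (somefield, highestPK)
SELECT some_field, MAX(primary_key)
FROM dbo.BigTable            -- hypothetical name for the large table
GROUP BY some_field;

-- From then on, read the small table (about 700 rows) instead.
SELECT somefield, highestPK
FROM dbo.SecondaryTable;
```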
Then, whenever any new records are inserted into your 6 million record table, have a simple trigger update your secondary table with something as simple as:
update SecondaryTable
set highestPK = newlyInsertedPKID
where somefield = newlyInsertedSomeFieldValue
This way it stays up to date with every insert: because the primary key is incremental, each newly inserted row carries the highest PK for its "somefield" value. If the update matches no row, insert a new row into the secondary table with the new "somefield" value.
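Put together, the update-then-insert logic might be sketched as an AFTER INSERT trigger like this. The trigger name and `dbo.BigTable` are hypothetical, and the sketch assumes `dbo.SecondaryTable` already exists with `somefield` and `highestPK` columns. It groups the `inserted` pseudo-table so that multi-row inserts are handled too.

```sql
CREATE TRIGGER dbo.trg_BigTable_AfterInsert   -- hypothetical trigger name
ON dbo.BigTable                               -- hypothetical name for the large table
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Raise the stored maximum for somefield values that already have a row.
    UPDATE s
    SET s.highestPK = n.highestPK
    FROM dbo.SecondaryTable AS s
    JOIN (SELECT some_field, MAX(primary_key) AS highestPK
          FROM inserted
          GROUP BY some_field) AS n
      ON n.some_field = s.somefield
    WHERE n.highestPK > s.highestPK;

    -- Add a row for each somefield value seen for the first time.
    INSERT INTO dbo.SecondaryTable (somefield, highestPK)
    SELECT i.some_field, MAX(i.primary_key)
    FROM inserted AS i
    WHERE NOT EXISTS (SELECT 1
                      FROM dbo.SecondaryTable AS s
                      WHERE s.somefield = i.some_field)
    GROUP BY i.some_field;
END;
```

The `WHERE n.highestPK > s.highestPK` guard is a precaution in case rows ever arrive with explicitly supplied, out-of-order key values. With a purely incremental key it should always be true.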