SQL Server - Performance of Querying Large Tables With GROUP BY

I have a table "TRANSACTION" in SQL Server 2008. Roughly 6 records per second are inserted into this table (it is a financial transactions table), so about 500,000 records are inserted per day. The table is partitioned weekly.

This table is heavily used for many kinds of select (with NOLOCK, of course), insert, and update operations.

Do you think the query below may slow down other critical select, insert, and update operations on the same table? I think that even if the query below runs for a long time, other select queries will not slow down, since this query does not lock the table. But I cannot be sure, so I am asking you.

Note that the columns in the select list are NOT indexed on the table.

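-- DATETIME is assumed here; it matches the return type of GETDATE()
DECLARE @START_DATE DATETIME, @END_DATE DATETIME

SET @END_DATE = GETDATE()

SET @START_DATE = DATEADD(HOUR, -24, @END_DATE)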

SELECT Column1, Column2, Column3, Column4, COUNT(*) FROM [TRANSACTION] WITH(NOLOCK)
WHERE TRANSACTION_DATE BETWEEN @START_DATE AND @END_DATE
GROUP BY Column1, Column2, Column3, Column4

Running any query on the server uses CPU, memory, and I/O, so in essence anything you run can have an impact on other queries being executed.

By your own figures you are definitely going to read in ~500k rows. You could calculate the row size and from that get a rough idea of how many pages this data is stored on. You would have to cross-check against the query plan to make sure it is at least not doing a full partition scan; otherwise about 3.5 million rows (a week at ~500k rows per day) would be scanned into memory.
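As a rough sketch of that page estimate: SQL Server stores rows in 8 KB pages, of which 8,096 bytes are available for row data, and each row costs 2 extra bytes in the page's slot array. The 200-byte average row size below is purely an assumed placeholder, not something taken from your table, so substitute your real figure.

```sql
-- Back-of-envelope page estimate for one day of rows.
DECLARE @Rows        BIGINT = 500000;  -- ~1 day of inserts, from the question
DECLARE @AvgRowBytes INT    = 200;     -- ASSUMPTION: replace with your actual average row size

-- 8096 usable bytes per page; +2 bytes per row for the slot array.
-- Integer division rounds down to whole rows per page.
DECLARE @RowsPerPage INT = 8096 / (@AvgRowBytes + 2);

SELECT @RowsPerPage                                     AS RowsPerPage,
       CEILING(1.0 * @Rows / @RowsPerPage)              AS EstimatedPages,
       CEILING(1.0 * @Rows / @RowsPerPage) * 8 / 1024.0 AS EstimatedMB;
```

With these placeholder numbers that comes to 40 rows per page and about 12,500 pages (roughly 98 MB) for one day. A full weekly partition would be about seven times that. If you want a measured row size instead of a guess, the avg_record_size_in_bytes column of sys.dm_db_index_physical_stats (in SAMPLED or DETAILED mode) should give you one.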

Will that put you outside of your SLAs? We have no way of telling; only you can determine that through suitable load testing.
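Before a full load test, one quick way to see what the query actually costs is to run it on a test copy with I/O and timing statistics switched on. This is only a sketch wrapped around the query from the question; figures from a quiet test server will not reflect contention in production.

```sql
-- Report page reads and CPU/elapsed time for each statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @START_DATE DATETIME, @END_DATE DATETIME;
SET @END_DATE = GETDATE();
SET @START_DATE = DATEADD(HOUR, -24, @END_DATE);

SELECT Column1, Column2, Column3, Column4, COUNT(*)
FROM [TRANSACTION] WITH (NOLOCK)
WHERE TRANSACTION_DATE BETWEEN @START_DATE AND @END_DATE
GROUP BY Column1, Column2, Column3, Column4;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The Messages output then lists logical and physical reads per table along with CPU and elapsed time, which you can compare against the page estimate and against the actual execution plan.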

Obviously it WILL slow down all other operations on the server to some degree.

The only queries that will be blocked while your query runs are schema change queries against your table.
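The reason is that a NOLOCK read still takes a schema stability (Sch-S) lock on the table, and that conflicts only with the schema modification (Sch-M) lock that DDL needs. If you want to check this yourself, you could look at the locks held by the session from a second connection while the query is running. The session id 53 below is a hypothetical placeholder.

```sql
-- Run in a second session while the GROUP BY query is executing.
-- ASSUMPTION: 53 is a placeholder; use the session id of the connection running the query.
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = 53;
```

You would expect to see a Sch-S lock on the object rather than shared row or page locks.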

Personally, I recommend creating an index on Column1, Column2, Column3, Column4, Transaction_date so the grouping runs faster, like this:

CREATE INDEX iName on [TRANSACTION](Column1, Column2, Column3, Column4, Transaction_date) 
