SQL query optimization: get extra column data from non-clustered index

I am trying to write a query on a table that receives millions of records per day. I can narrow my query down to a time slice (logdate), but I need additional column data from it (num). Here is a sample query I'm using to test it:

DECLARE @StartTimeStamp DATETIME = '12/6/2019 7:56:50.799'
DECLARE @EndTimeStamp DATETIME = '12/6/2019 7:56:50.8'

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT tx.num, tx.logdate 
FROM hsi.transactionxlog tx
WHERE tx.logdate BETWEEN @StartTimeStamp AND @EndTimeStamp

This particular test, with a time span of .001 seconds, takes over four minutes to run. If I change it to a timeframe in which no records are found, it takes about one second to run, even when the span is 24 hours.
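
One detail that may be worth checking about this test window: SQL Server's DATETIME type is documented as accurate only to increments of .000, .003, or .007 seconds, so the two literals above probably round to the same stored value, and the effective span may be zero rather than .001 seconds. A minimal sketch to verify this follows. The ISO 8601 literal format and the column aliases are my own choices, and it assumes the original literals are month/day/year:

```sql
-- Same instants as in the test query, written in unambiguous ISO 8601 form
-- (assumption: '12/6/2019' in the post means December 6, 2019).
DECLARE @StartTimeStamp DATETIME = '2019-12-06T07:56:50.799';
DECLARE @EndTimeStamp   DATETIME = '2019-12-06T07:56:50.800';

-- Show the values as actually stored, and the gap between them.
SELECT @StartTimeStamp AS StartStored,
       @EndTimeStamp   AS EndStored,
       DATEDIFF(MILLISECOND, @StartTimeStamp, @EndTimeStamp) AS SpanMs;
```

If SpanMs comes back as 0, the BETWEEN predicate is effectively an equality test on a single stored instant, which does not change the indexing problem but does change what the test measures.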

This table only has non-clustered indexes. One such index has the following columns: num, logdate, and action, in that order.
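
Key order matters here: an index whose leading key is num cannot be used for a direct seek on logdate alone. To confirm what each index on the table actually contains, a sketch using the standard catalog views sys.indexes, sys.index_columns, and sys.columns might look like this (the column aliases are arbitrary):

```sql
-- List every index on the table with its columns in key order.
SELECT i.name                AS index_name,
       i.type_desc           AS index_type,
       ic.key_ordinal        AS key_position,  -- 0 means INCLUDE-only column
       c.name                AS column_name,
       ic.is_included_column AS is_included
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id
 AND ic.index_id  = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id
 AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'hsi.transactionxlog')
ORDER BY i.name, ic.is_included_column, ic.key_ordinal;
```

Any index that lists logdate at key_position 1 would be a candidate for the range predicate without creating anything new.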

How can I quickly find the num corresponding to each record between @StartTimeStamp and @EndTimeStamp? I would strongly prefer not to create additional indexes on this table, since many other applications use it so often.

For this query:

select tx.num, tx.logdate
from hsi.transactionxlog tx
where tx.logdate BETWEEN @StartTimeStamp AND @EndTimeStamp;

The optimal index is transactionxlog(logdate, num). logdate should be the first key in the index so that it can be used for the where condition. Having num in the same index means the query can be answered from the index alone, without a lookup into the table.
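
If adding an index turns out to be acceptable, the suggestion above could be sketched as follows. The index name IX_transactionxlog_logdate_num is a placeholder, not something from the original post, and whether the extra write cost is tolerable on a table this busy would need to be tested first:

```sql
-- Hypothetical index name. logdate leads so the range predicate can seek;
-- num is the second key so the query is covered by the index.
CREATE NONCLUSTERED INDEX IX_transactionxlog_logdate_num
    ON hsi.transactionxlog (logdate, num);
```

A comparable covering variant would keep logdate as the only key column and list num in an INCLUDE clause instead.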

I found a temp table solution. Here is the essence of the solution:

DECLARE @StartTimeStamp DATETIME = '12/6/2019 7:56:50.799'
DECLARE @EndTimeStamp DATETIME = '12/6/2019 7:56:50.8'
DECLARE @TempTable TABLE (logdate DATETIME, action BIGINT)

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
INSERT INTO @TempTable
    SELECT logdate, action FROM hsi.transactionxlog
    WHERE logdate BETWEEN @StartTimeStamp AND @EndTimeStamp

SELECT tx.num, tx.logdate 
FROM hsi.transactionxlog tx
INNER JOIN @TempTable t ON t.logdate = tx.logdate AND t.action = tx.action
WHERE tx.logdate BETWEEN @StartTimeStamp AND @EndTimeStamp

I don't really have a good explanation for why this works, but it is much faster, and the run time scales properly with the difference between @StartTimeStamp and @EndTimeStamp. It simply selects the few thousand records, and then for some reason it's easier for SQL to find them in the large table.
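
To see where the time actually goes, one option is to run each version with I/O and timing statistics enabled and compare the logical reads reported for hsi.transactionxlog, along with the actual execution plans. A sketch for the original single-statement version, reusing the variables from the post:

```sql
SET STATISTICS IO ON;    -- reports logical and physical reads per table
SET STATISTICS TIME ON;  -- reports CPU and elapsed time per statement

DECLARE @StartTimeStamp DATETIME = '12/6/2019 7:56:50.799';
DECLARE @EndTimeStamp   DATETIME = '12/6/2019 7:56:50.8';

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- Original single-statement version of the query.
SELECT tx.num, tx.logdate
FROM hsi.transactionxlog tx
WHERE tx.logdate BETWEEN @StartTimeStamp AND @EndTimeStamp;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running the table-variable version the same way and comparing the two sets of numbers should show whether the difference comes from which index is used or from how it is accessed (scan versus seek).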

Thank you for looking at the question and trying to answer.
