简体   繁体   中英

SQL Server - wrong execution plan?

I have a very big table with a lot of rows and a lot of columns (I know it's bad but let's leave this aside).

Specifically, I had two columns - FinishTime, JobId . The first one is the finish time of the row and the second is its id (not unique, but almost unique - only few records exist with the same id).

I have index on jobid and index on finishtime.

We insert rows all the time, mostly ordered by the finish time. We also update statistics of each index periodically.

Now to the problem:

When I run query with filter jobid==<some id> AND finishtime > <now minus 1 hour> - this query takes a lot of time, and when showing the estimated execution plan I see that the plan is to go over the finishtime index, even though going over the jobid index should be a lot better. When looking at the index statistics, I see that the server "thinks" that the number of jobs in the last hour is 1 because we didn't update the statistics of this index.

When I run query with filter jobid==<some id> AND finishtime > <now minus 100 days> - this works great, because the SQL Server knows to go over the correct index - the job id index.

So basically my question is why if we don't update index statistics all the time (which is time consuming), the server assumes that the number of records past the last bucket is 1?

Thanks very much

You can get a histogram of what the statistics contains for an index using DBCC SHOW_STATISTICS , eg

DBCC SHOW_STATISTICS ( mytablename , myindexname ) 

For date-based records, queries will always be prone to incorrect statistics. Running this should show that the last bucket in the histogram has barely any records in the range [prior-to-today / after-today]. However, all else being equal, SQL Server should still prefer the job_id index to the finishtime index if both are single-column indexes with no included columns; this is due to job_id (int) being faster to lookup than finishtime (datetime).

Note: If your finishtime is covering for the query, this will heavily influence the query optimizer in selecting it since it eliminates a bookmark lookup operation.

To combat this, either

  1. update statistics regularly
  2. create multiple filtered indexes (2008+ feature) on the data, with one partition updated far more regularly being the tail end
  3. use index hints on sensitive queries

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM