
Optimizing SQL Server Query (CTE)

This query takes forever to run. Does anybody have any good tips on how I can optimize it?

WITH CTE (Lockindate, Before5, After5) AS (SELECT nl.Lockindate, 
(CASE WHEN CAST(RIGHT(FirstLockActivity,8) AS time(1)) <= '17:00' THEN
'Before 5 PM' END) AS before5,
(CASE WHEN CAST(RIGHT(FirstLockActivity,8) AS time(1)) >= '17:00' THEN
'After 5 PM' END) AS after5
FROM netlock nl WITH(NOLOCK)
JOIN rate rs WITH(NOLOCK)
ON nl.id=rs.id
WHERE nl.lockindate BETWEEN '2016-08-01' AND '2016-08-31')
SELECT lockindate, COUNT(After5), COUNT(Before5)
FROM CTE
GROUP BY lockindate

You don't need a CTE:

SELECT nl.Lockindate, 
   SUM(CASE WHEN CAST(RIGHT(FirstLockActivity,8) AS time(1)) <= '17:00' THEN 1 ELSE 0 END) AS before5,
   SUM(CASE WHEN CAST(RIGHT(FirstLockActivity,8) AS time(1)) > '17:00' THEN 1 ELSE 0 END) AS after5
FROM netlock nl WITH(NOLOCK)
JOIN rate rs WITH(NOLOCK)
ON nl.id=rs.id
WHERE nl.lockindate BETWEEN '2016-08-01' AND '2016-08-31'
GROUP BY nl.lockindate

But this might well result in exactly the same plan. In that case you need to look at the execution plan to find out why it's slow (missing indexes on the join/GROUP BY columns?).

Although it won't speed up things all that much IN THIS CASE (*), the conversions from one datatype to another and then another again are 'not optimal'.

(CASE WHEN CAST(RIGHT(FirstLockActivity,8) AS time(1)) <= '17:00' THEN 'Before 5 PM' END) AS before5,

First of all, you do an implicit conversion from a datetime [FirstLockActivity] to a string, because the Right() function expects a string. Implicit conversions can be dangerous: depending on the configuration of your server (and even of your connection, which might itself be influenced by the regional settings of your operating system), the resulting string can be surprising, and its last 8 characters are not always the ones you expect!

FYI, have a look here: https://msdn.microsoft.com/en-us/library/ms187928.aspx?f=255&MSPPError=-2147217396 . Because you can't explicitly pass the 'style' with CAST, I always suggest using Convert() instead, ESPECIALLY when converting from datetimes to strings and vice versa.
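To make the difference concrete, here is a minimal sketch. The variable @d and its value are made up for illustration; style 108 is the documented hh:mi:ss style.

```sql
-- Hypothetical sample value, written in the unambiguous ISO 8601 form.
DECLARE @d datetime = '2016-08-15T17:30:00';

SELECT Convert(char(8), @d, 108) AS time_as_text,   -- explicit style: always hh:mi:ss
       CAST(@d AS time(0))       AS time_as_value;  -- no string round-trip at all
```

With an explicit style, the text form no longer depends on server or connection settings. The second column skips the string entirely, which is usually what you want when the source column really is a datetime.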

After that you take the right-most 8 characters and then convert these into a time(1). I'm not sure why you want to use a time(1) over a time(0) since you're only interested in the hour part, but in the end it won't make much difference I guess.

Anyway, doing all these conversions takes CPU and thus time.

Assuming this thing isn't too dumbed down from what you really want to do, the goal of the query is to return how many entries there were before and after 5 PM for each lockindate. As such, all you need to do is calculate the hour of each FirstLockActivity and decide from there. DatePart(hour, xx) will return the required information for a fraction of the CPU cost of those conversions. Also, you can easily get the before/after in one CASE construction. Note that comparing the hour with 17 puts an entry at exactly 17:00:00 in the 'after' group, whereas your original <= '17:00' counted it as 'before'.

  WITH CTE (LockinDate, Before5PM)
    AS (SELECT nl.Lockindate,
               -- 1 when the activity happened before 17:00, otherwise 0
               (CASE WHEN DatePart(hour, FirstLockActivity) < 17 THEN 1 ELSE 0 END) AS Before5PM
          FROM netlock nl WITH (NOLOCK)
          JOIN rate rs WITH (NOLOCK)
            ON nl.id=rs.id
         -- style 120 (yyyy-mm-dd) matches the format of the date literals
         WHERE nl.lockindate BETWEEN Convert(datetime, '2016-08-01', 120) AND Convert(datetime, '2016-08-31', 120))
  SELECT LockinDate,
         After5  = SUM(1 - Before5PM),
         Before5 = SUM(Before5PM)
    FROM CTE
   GROUP BY LockinDate

Assuming the number of (relevant) records in the tables is huge, this will have some effect on the duration, but given the speed of modern processors it probably won't be shocking. Of course, when you're on a busy server and there isn't all that much (free) CPU to go around, the effect will be much more noticeable.

That said, as for performance I'd suggest checking the indexes on the netlock and rate tables. Ideally netlock has a clustered index (or PK) on the id field and a non-clustered index on lockindate. Additionally, the rate table ideally has a clustered index on the id field, possibly together with some other fields I'm not aware of. If that isn't the case, a non-clustered index on the id field with FirstLockActivity as an included column would be great too.
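As a rough sketch, the two non-clustered indexes could look like this. The index names are made up, and I'm assuming FirstLockActivity lives in the rate table; check what already exists before adding anything.

```sql
-- Hypothetical index names; adjust to your own naming convention.

-- Supports the date-range filter (and the GROUP BY) on netlock.
-- If id is the clustered key, it is carried along in this index automatically.
CREATE NONCLUSTERED INDEX IX_netlock_lockindate
    ON netlock (lockindate);

-- Supports the join on id and covers FirstLockActivity,
-- so the query doesn't have to go back to the base table for it.
CREATE NONCLUSTERED INDEX IX_rate_id_FirstLockActivity
    ON rate (id)
    INCLUDE (FirstLockActivity);
```

Whether the optimizer actually picks these up depends on the data distribution, so compare the plan before and after.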

If you want to have this query without the CTE, you can simply copy paste the CTE into a subquery/derived table like this:

 SELECT LockinDate,
        After5  = SUM(1 - Before5PM),
        Before5 = SUM(Before5PM)
   FROM (SELECT nl.Lockindate,
                -- 1 when the activity happened before 17:00, otherwise 0
                (CASE WHEN DatePart(hour, FirstLockActivity) < 17 THEN 1 ELSE 0 END) AS Before5PM
           FROM netlock nl WITH (NOLOCK)
           JOIN rate rs WITH (NOLOCK)
             ON nl.id=rs.id
          -- style 120 (yyyy-mm-dd) matches the format of the date literals
          WHERE nl.lockindate BETWEEN Convert(datetime, '2016-08-01', 120) AND Convert(datetime, '2016-08-31', 120)) A
  GROUP BY LockinDate

Or, worked out a little more you'd get

 SELECT nl.LockinDate,
        After5  = SUM(1 - (CASE WHEN DatePart(hour, FirstLockActivity) < 17 THEN 1 ELSE 0 END)),
        Before5 = SUM(    (CASE WHEN DatePart(hour, FirstLockActivity) < 17 THEN 1 ELSE 0 END))
   FROM netlock nl WITH (NOLOCK)
   JOIN rate rs WITH (NOLOCK)
     ON nl.id=rs.id
  -- style 120 (yyyy-mm-dd) matches the format of the date literals
  WHERE nl.lockindate BETWEEN Convert(datetime, '2016-08-01', 120) AND Convert(datetime, '2016-08-31', 120)
  GROUP BY nl.LockinDate

PS: I wrote all this in the browser, so there might be some typos and none of the code is tested; be prepared to fiddle around a bit to get it working =)

PS: if you can't get the query plan, you might still be able to use SET STATISTICS TIME to more easily compare one version of the query to another.
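For example, something along these lines (just a sketch; swap in whichever version of the query you want to measure):

```sql
SET STATISTICS TIME ON;

-- Version under test; replace with the variant you want to compare.
SELECT nl.lockindate,
       SUM(CASE WHEN DatePart(hour, FirstLockActivity) < 17 THEN 1 ELSE 0 END) AS before5
  FROM netlock nl
  JOIN rate rs
    ON nl.id = rs.id
 WHERE nl.lockindate BETWEEN '2016-08-01' AND '2016-08-31'
 GROUP BY nl.lockindate;

SET STATISTICS TIME OFF;
```

The CPU time and elapsed time show up in the Messages output rather than in the result grid. Run each version a few times, since the first run may include compilation and cold-cache reads.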

(*: If you do these kinds of conversions on a column inside the WHERE clause or a JOIN clause, it can prevent the optimizer from using an index on that column, and the results could be devastating for the performance of your query.)
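A small illustration of that footnote, using the same netlock table and assuming there is an index on lockindate. Both queries are meant to select August 2016:

```sql
-- Function wrapped around the column: the index on lockindate
-- can't be used for a seek, so every row has to be evaluated.
SELECT COUNT(*)
  FROM netlock nl
 WHERE YEAR(nl.lockindate) = 2016
   AND MONTH(nl.lockindate) = 8;

-- Bare column compared against constants: an index seek is possible.
SELECT COUNT(*)
  FROM netlock nl
 WHERE nl.lockindate >= '20160801'
   AND nl.lockindate <  '20160901';
```

The second form keeps the conversion on the constant side of the comparison, which is where you want it.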
