I have a problem with a SQL database query that suddenly (but regularly, about every three weeks) becomes slow.
The setup is the following:
The table Orders, which the query mainly selects from, has around 24,000 records; the five other joined tables are small (100 records or less). Orders has a varbinary(MAX) column Report that contains binary data (PDF documents) with an average size of about 200 to 300 kB, though occasionally up to 2 MB. More than 90% of those 24,000 orders have this column filled; for the others it is NULL. That means more than 90% of the 6 GB database size is binary data. The query in question has the following structure:
SELECT TOP (30) [Project2].[OrderID] AS [OrderID]
-- around 20 columns more
FROM ( SELECT [Project2].[OrderID] AS [OrderID],
-- around 20 columns more
row_number() OVER (ORDER BY [Project2].[OrderID] ASC) AS [row_number]
FROM ( SELECT [Filter1].[OrderID] AS [OrderID]
-- around 20 columns more
FROM ( SELECT [Extent1].[OrderID] AS [OrderID]
-- around 20 columns more
FROM [dbo].[Orders] AS [Extent1]
INNER JOIN -- small table
LEFT OUTER JOIN -- small table
LEFT OUTER JOIN -- small table
LEFT OUTER JOIN -- small table
LEFT OUTER JOIN -- small table
WHERE ([Extent1].[Status] IS NOT NULL)
AND (4 = CAST( [Extent1].[Status] AS int))
AND ([Extent1].[SomeDateTime] IS NULL)
AND ([Extent1].[Report] IS NULL)
) AS [Filter1]
OUTER APPLY (SELECT TOP (1) [Project1].[C1] AS [C1]
FROM ( SELECT CAST( [Extent7].[CreationDateTime] AS datetime2) AS [C1],
[Extent7].[CreationDateTime] AS [CreationDateTime]
FROM [dbo].[OtherTable] AS [Extent7]
WHERE [Filter1].[OrderID] = [Extent7].[OrderID]
) AS [Project1]
ORDER BY [Project1].[CreationDateTime] DESC
) AS [Limit1]
) AS [Project2]
) AS [Project2]
WHERE [Project2].[row_number] > 0
ORDER BY [Project2].[OrderID] ASC
It is generated from a LINQ-to-Entities query by Entity Framework. The query occurs in a few variations which differ only in the first WHERE clause.
The five variants
WHERE ([Extent1].[Status] IS NOT NULL) AND (X = CAST( [Extent1].[Status] AS int))
where X can be between 0 and 4. These queries are never a problem.
And the two variants (*)
WHERE ([Extent1].[Status] IS NOT NULL) AND (4 = CAST( [Extent1].[Status] AS int)) AND ([Extent1].[SomeDateTime] IS NULL) AND ([Extent1].[Report] IS NULL)
or with ... IS NOT NULL ... in the last line. The problem described below occurs only with those two queries.
The "phenomenon" is:
An additional observation:
Somehow I suspect the whole problem has to do with the memory limitation (1 GB) of the Express edition and the varbinary(MAX) column, although I only use that column in the WHERE clause, to check whether its value is NULL or NOT NULL. The Report column itself is not one of the selected columns.
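As a quick sanity check on that ratio, the size of the binary column can be measured directly. This is a sketch against the table and column names used above:

```sql
-- Sketch: how much of the database is binary report data?
-- DATALENGTH returns the stored size in bytes per row.
SELECT COUNT(*)                               AS OrdersWithReport,
       SUM(DATALENGTH([Report])) / 1048576    AS ReportMB
FROM   [dbo].[Orders]
WHERE  [Report] IS NOT NULL;
```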
Since I will be running against the limitations (10 GB mdf file size) of the Express edition next year at the latest, I am considering changes anyway, such as moving the binary data out of the Orders table.
Question: What could be the reason that the query is suddenly slow? Could one of the changes I am planning solve the problem, or are there other solutions?
Edit
Following bhamby's tip in the comments below, I ran SET STATISTICS TIME ON in SSMS before running the query again. When the query is slow, I get a high value for SQL Server parse and compile time, namely CPU time = 27.3 sec and elapsed time = 81.9 sec. The execution time for the query itself is only CPU time = 0.06 sec and elapsed time = 2.8 sec. Running the query a second time after that gives CPU time = 0.06 sec and elapsed time = 0.08 sec for the SQL Server parse and compile time.
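To test the suspicion that Express's memory limit is pushing the compiled plan out of cache, one could watch the plan cache size and check whether the slow query's plan survives between runs. This is a sketch using standard DMVs (SQL Server 2012 or later; the LIKE pattern is only illustrative):

```sql
-- Sketch: how large is the plan cache right now?
SELECT type, SUM(pages_kb) / 1024 AS size_mb
FROM   sys.dm_os_memory_clerks
WHERE  type IN ('CACHESTORE_SQLCP', 'CACHESTORE_OBJCP')
GROUP  BY type;

-- Is the slow query's plan still cached, and how often was it reused?
SELECT cp.usecounts, cp.size_in_bytes, st.text
FROM   sys.dm_exec_cached_plans AS cp
CROSS  APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE  st.text LIKE '%row_number() OVER (ORDER BY%';
```

If the plan disappears from the second result between the fast and the slow runs, the 80-second parse and compile time is the cost of rebuilding it.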
This just seems wasteful:
SELECT TOP (1) [Project1].[C1] AS [C1]
FROM ( SELECT CAST( [Extent7].[CreationDateTime] AS datetime2) AS [C1],
[Extent7].[CreationDateTime] AS [CreationDateTime]
FROM [dbo].[OtherTable] AS [Extent7]
WHERE [Filter1].[OrderID] = [Extent7].[OrderID]
) AS [Project1]
ORDER BY [Project1].[CreationDateTime] DESC
when all it really computes is:
SELECT max( CAST( [Extent7].[CreationDateTime] AS datetime2) ) AS [C1]
FROM [dbo].[OtherTable] AS [Extent7]
WHERE [Filter1].[OrderID] = [Extent7].[OrderID]
Why are you not storing dates as datetime?
I don't like that OUTER APPLY. I would create a #temp table that is populated once and join to it.
Make sure to declare [OrderID] as the primary key.
SELECT [Extent7].[OrderID], max( CAST( [Extent7].[CreationDateTime] AS datetime2) ) AS [C1]
FROM [dbo].[OtherTable] AS [Extent7]
GROUP BY [Extent7].[OrderID]
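Following that advice, the per-order maximum could be materialized once up front. A sketch (the constraint name is made up; adjust types to the actual schema):

```sql
-- Sketch: compute the newest CreationDateTime per order once, up front.
SELECT [Extent7].[OrderID],
       MAX(CAST([Extent7].[CreationDateTime] AS datetime2)) AS [C1]
INTO   #temp
FROM   [dbo].[OtherTable] AS [Extent7]
GROUP  BY [Extent7].[OrderID];

-- Declare the key so the later join is cheap (assumes OrderID is
-- non-nullable in OtherTable; otherwise alter the column first).
ALTER TABLE #temp
    ADD CONSTRAINT PK_temp PRIMARY KEY ([OrderID]);
```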
You could have a loop join going on there.
Next I would put this in a #temp2 table so that you are sure it is only run once.
Again, be sure to declare OrderID as the primary key.
SELECT [Extent1].[OrderID] AS [OrderID]
-- around 20 columns more
FROM [dbo].[Orders] AS [Extent1]
INNER JOIN -- small table
LEFT OUTER JOIN -- small table
LEFT OUTER JOIN -- small table
LEFT OUTER JOIN -- small table
LEFT OUTER JOIN -- small table
WHERE ([Extent1].[Status] IS NOT NULL)
AND (4 = CAST( [Extent1].[Status] AS int))
AND ([Extent1].[SomeDateTime] IS NULL)
AND ([Extent1].[Report] IS NULL)
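With both the #temp and #temp2 tables mentioned above in place, the final page becomes a plain join instead of a per-row OUTER APPLY. A sketch (column lists abbreviated as in the original; constraint name made up):

```sql
-- Sketch: materialize the filtered Orders rows once ...
SELECT [Extent1].[OrderID] AS [OrderID]
       -- around 20 columns more
INTO   #temp2
FROM   [dbo].[Orders] AS [Extent1]
       -- the five small-table joins from the original query go here
WHERE  [Extent1].[Status] IS NOT NULL
  AND  4 = CAST([Extent1].[Status] AS int)
  AND  [Extent1].[SomeDateTime] IS NULL
  AND  [Extent1].[Report] IS NULL;

ALTER TABLE #temp2
    ADD CONSTRAINT PK_temp2 PRIMARY KEY ([OrderID]);

-- ... then paging is a simple join between two keyed temp tables.
SELECT TOP (30) t2.[OrderID], t1.[C1]
       -- around 20 columns more
FROM   #temp2 AS t2
LEFT   JOIN #temp AS t1
       ON t1.[OrderID] = t2.[OrderID]
ORDER  BY t2.[OrderID] ASC;
```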
If Orders is only 24,000 rows, then something stupid is going on for queries to take more than a few seconds.
If it is a query which is run often, then I would suggest turning it into a Stored Procedure and using the results of the procedure.
In Entity Framework you should be able to import the procedure as a Function Import.
You can then take control of the stored procedure's execution plan by giving it query hints or combating parameter sniffing.
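A sketch of such a procedure (the name and parameter are made up for illustration; the OPTIMIZE FOR hint pins the plan to the value that matters, which is one way to combat parameter sniffing, while OPTION (RECOMPILE) would instead trade compile cost for a fresh plan each run):

```sql
-- Sketch: wrap the query in a procedure so its plan can be controlled.
CREATE PROCEDURE dbo.GetOpenOrders   -- hypothetical name
    @Status int
AS
BEGIN
    SET NOCOUNT ON;

    SELECT TOP (30) [Extent1].[OrderID]
           -- around 20 columns more
    FROM   [dbo].[Orders] AS [Extent1]
    WHERE  [Extent1].[Status] IS NOT NULL
      AND  @Status = CAST([Extent1].[Status] AS int)
      AND  [Extent1].[SomeDateTime] IS NULL
      AND  [Extent1].[Report] IS NULL
    ORDER  BY [Extent1].[OrderID] ASC
    OPTION (OPTIMIZE FOR (@Status = 4));  -- pin the plan to the hot value
END;
```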
It smells like your server's execution plans are going out-of-date every 3 weeks, hence the slow-down.
Also, you mentioned that you are using 64bit SQL. My experience has been that 64bit SQL does not tend to perform very efficiently with sub-queries. My advice would be to try and avoid them.