I'm trying to optimize the performance of the following update query:
UPDATE a
SET a.[qty] =
    (
        SELECT MAX(b.[qty])
        FROM [TableA] AS b
        WHERE b.[ID] = a.[ID]
          AND b.[Date] = a.[Date]
          AND b.[qty] <> 0
    )
FROM [TableA] a
WHERE a.[qty] = 0
  AND a.[status] = 'New'
It runs against a large table with over 200 million rows.
I've already tried creating an index on (qty, status), but it did not really help because of the index update at the end. In general it is not easy to create indexes on this table, because a lot of other update/insert queries run against it. So I'm thinking of reorganizing this query somehow. Any ideas?
TableA is a heap like this:
CREATE TABLE TableA (
ID INTEGER null,
qty INTEGER null,
date date null,
status VARCHAR(50) null,
);
Execution plan: https://www.brentozar.com/pastetheplan/?id=S1KLUWO15
It's difficult to answer without seeing execution plans and table definitions, but you can avoid the self-join by using an updatable CTE or derived table with a window function:
UPDATE a
SET qty = a.maxQty
FROM (
    SELECT *,
        MAX(CASE WHEN a.qty <> 0 THEN a.qty END) OVER (PARTITION BY a.ID, a.Date) AS maxQty
    FROM [TableA] a
) a
WHERE a.qty = 0
  AND a.status = 'New';
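The same update can also be written as an updatable CTE rather than a derived table. The following is a sketch of that equivalent form, assuming the table definition from the question; the CTE name WithMax is made up for illustration. One caveat worth checking against your data: PARTITION BY puts rows with a NULL ID or Date into the same group, whereas the = comparisons in the original correlated subquery never match NULLs, so the two queries can give different results for such rows.

```sql
-- Sketch: updatable CTE form of the window-function update.
-- "WithMax" is a hypothetical name chosen for illustration.
WITH WithMax AS (
    SELECT
        a.qty,
        a.status,
        -- Largest non-zero qty within each (ID, Date) group; NULL if there is none
        MAX(CASE WHEN a.qty <> 0 THEN a.qty END)
            OVER (PARTITION BY a.ID, a.[Date]) AS maxQty
    FROM [TableA] AS a
)
UPDATE WithMax
SET qty = maxQty
WHERE qty = 0
  AND status = 'New';
```

As in the original query, a row whose group has no non-zero qty ends up with qty set to NULL.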
To support this query, you will need the following index:
TableA (ID, Date) INCLUDE (qty, status)
The two key columns can be in either order. If you make it a clustered index instead, you do not need the INCLUDE list, because a clustered index already contains every column of the table.
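Written out as DDL, the nonclustered version of that index might look like the sketch below. The index name IX_TableA_ID_Date is made up for illustration, and the column names follow the table definition in the question.

```sql
-- Sketch: nonclustered covering index for the window-function update.
-- IX_TableA_ID_Date is a hypothetical name.
CREATE NONCLUSTERED INDEX IX_TableA_ID_Date
    ON TableA (ID, [Date])
    INCLUDE (qty, status);
```

Because qty is one of the included columns, the update still has to maintain this index for every row it changes, so it is worth measuring that cost against the other write traffic on the table.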