How to optimize the performance of an update query?

Question

I'm trying to optimize the performance of the following update query:

UPDATE a
 SET a.[qty] =
      (
       SELECT MAX(b.[qty])
       FROM [TableA] AS b
       WHERE b.[ID] = a.[ID]
         AND b.[Date] = a.[Date]
         AND b.[qty] <> 0
      )
 FROM [TableA] a
 WHERE a.[qty] = 0
  AND a.[status] = 'New'

It runs against a large table with over 200 million rows.

I've already tried creating an index on [qty, status], but it did not really help because the index itself has to be updated at the end. In general, it is not easy to add indexes to this table because many other update/insert queries run against it. So I'm thinking of reorganizing this query somehow. Any ideas?

TableA is a heap like this:

CREATE TABLE TableA (
    ID            INTEGER       null,
    qty           INTEGER       null,
    date          date          null,
    status        VARCHAR(50)   null
);

Execution plan: https://www.brentozar.com/pastetheplan/?id=S1KLUWO15

Answer 1

It's difficult to answer without seeing execution plans and table definitions, but you can avoid the self-join by using an updatable CTE or derived table with a window function:

UPDATE a
SET
  qty = a.maxQty
FROM (
    SELECT *,
      MAX(CASE WHEN a.qty <> 0 THEN a.qty END) OVER (PARTITION BY a.ID, a.Date) AS maxQty
    FROM [TableA] a
) a
WHERE a.qty = 0
  AND a.status = 'New';
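
For readers who prefer the CTE form mentioned above, the same update might be sketched as follows. This is only a restatement of the derived-table query, not a tested drop-in; the CTE name `cte` is an arbitrary choice. Like the original correlated subquery, it sets qty to NULL when an (ID, Date) group has no non-zero qty.

```sql
-- Sketch: the derived-table update above, rewritten as an updatable CTE.
-- "cte" is an arbitrary name chosen for illustration.
WITH cte AS (
    SELECT
        a.qty,
        a.status,
        -- Largest non-zero qty within each (ID, Date) group;
        -- NULL if the group contains no non-zero qty.
        MAX(CASE WHEN a.qty <> 0 THEN a.qty END)
            OVER (PARTITION BY a.ID, a.[Date]) AS maxQty
    FROM [TableA] AS a
)
UPDATE cte
SET qty = maxQty
WHERE qty = 0
  AND status = 'New';
```

Because the CTE reads from a single base table and only the base column qty is assigned, SQL Server can apply the update directly to TableA.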

To support this query, you will need the following index:

TableA (ID, Date) INCLUDE (qty, status)

The two key columns can be in either order. If you make it a clustered index instead, all other columns are part of the index automatically, so no INCLUDE clause is needed.
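
Spelled out as DDL, the index shorthand above could look something like this. The index name IX_TableA_ID_Date is hypothetical; nothing in the question prescribes it.

```sql
-- Sketch of the suggested covering index.
-- IX_TableA_ID_Date is a hypothetical name; pick one that matches your conventions.
CREATE NONCLUSTERED INDEX IX_TableA_ID_Date
    ON TableA (ID, [Date])      -- key columns: the PARTITION BY columns
    INCLUDE (qty, status);      -- carried at leaf level so the query need not touch the heap
```

Since the table already has a heavy update/insert load, it is worth measuring the extra write cost of this index before keeping it.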
