How to optimize performance of update query?

I'm trying to optimize the performance of the following update query:

UPDATE a
 SET a.[qty] =
      (
       SELECT MAX(b.[qty])
       FROM [TableA] AS b
       WHERE b.[ID] = a.[ID]
         AND b.[Date] = a.[Date]
         AND b.[qty] <> 0
      )
 FROM [TableA] a
 WHERE a.[qty] = 0
  AND a.[status] = 'New'

The table is large, with over 200 million rows.

I've already tried creating an index on (qty, status), but it did not really help, because the index itself has to be updated at the end of the query. In general it is not easy to add indexes to this table, because many other UPDATE and INSERT queries run against it. So I'm thinking of reorganizing the query somehow. Any ideas?

TableA is a heap like this:

CREATE TABLE TableA (
    ID            INTEGER       null,
    qty           INTEGER       null,
    date          date          null,
    status        VARCHAR(50)   null,
);

Execution plan: https://www.brentozar.com/pastetheplan/?id=S1KLUWO15

It's difficult to answer without seeing execution plans and table definitions, but you can avoid the self-join by using an updatable CTE or derived table with a window function:

UPDATE a
SET
  qty = a.maxQty
FROM (
    SELECT *,
      MAX(CASE WHEN a.qty <> 0 THEN a.qty END) OVER (PARTITION BY a.ID, a.Date) AS maxQty
    FROM [TableA] a
) a
WHERE a.qty = 0
  AND a.status = 'New';
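
The same update can also be written in the CTE form mentioned above. This is only a sketch of an equivalent statement, assuming the column names from the table definition in the question; the CTE name src is made up for illustration.

```sql
-- Sketch: CTE form of the derived-table update above.
-- "src" is a hypothetical name; the preceding statement must end with a semicolon.
WITH src AS (
    SELECT
        qty,
        status,
        -- Largest non-zero qty within each (ID, Date) group; NULL if there is none
        MAX(CASE WHEN qty <> 0 THEN qty END)
            OVER (PARTITION BY ID, [Date]) AS maxQty
    FROM [TableA]
)
UPDATE src
SET qty = maxQty
WHERE qty = 0
  AND status = 'New';
```

As with the original correlated subquery, a row whose (ID, Date) group contains no non-zero qty ends up with qty set to NULL.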

To support this query, you will need the following index:

TableA (ID, Date) INCLUDE (qty, status)

The two key columns can be in either order. If you make this the clustered index, every other column is part of the index automatically, so no INCLUDE list is needed.
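
As a rough sketch, the index described above could be created like this. The index names are hypothetical, and on a table of this size you would want to check build time and locking options for your edition before running either statement.

```sql
-- Option 1: nonclustered covering index (index name is hypothetical)
CREATE NONCLUSTERED INDEX IX_TableA_ID_Date
    ON TableA (ID, [Date])
    INCLUDE (qty, status);

-- Option 2 (instead of option 1): turn the heap into a clustered table.
-- All non-key columns are then available at the leaf level without INCLUDE.
-- CREATE CLUSTERED INDEX CX_TableA_ID_Date
--     ON TableA (ID, [Date]);
```

Either way, the window function can read rows already ordered by (ID, Date), so the plan can avoid a separate sort over the whole table.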
