
Optimize/Alternatives to this self-referencing update query

I have this update query:

UPDATE aggregate_usage_input t 
       JOIN (SELECT t2.id 
             FROM   aggregate_usage_input t2 
             WHERE  t2.is_excluded_total_gallons = 0 
                    AND t2.is_excluded_cohort = 0 
                    AND t2.is_excluded_outlier = 0 
             ORDER  BY t2.occupant_bucket_id, 
                       t2.residence_type_bucket_id, 
                       t2.reading_year, 
                       t2.nthreading, 
                       t2.total_gallons)t_sorted 
         ON t_sorted.id = t.id 
SET    t.rownum = @rownum := @rownum + 1 

which updates the rownum field (which is actually an ordering field) based on the sort order.

The SELECT on its own takes 9 seconds, which is acceptable given the ORDER BY.

The update part of this query takes a very long time: over 5 minutes on a table of 400,000 records. We need to get this down to about a minute or less.

How can we speed it up, or is there an alternative way to solve this?

The subquery is probably what is slowing you down here. In practice, I've found that moving the subquery's result into a temporary table first is often faster.

Try:

CREATE TEMPORARY TABLE Temp (id int);

INSERT INTO Temp
SELECT t2.id 
FROM   aggregate_usage_input t2 
WHERE  t2.is_excluded_total_gallons = 0 
    AND t2.is_excluded_cohort = 0 
    AND t2.is_excluded_outlier = 0 
ORDER  BY t2.occupant_bucket_id, 
    t2.residence_type_bucket_id, 
    t2.reading_year, 
    t2.nthreading, 
    t2.total_gallons;

SET @rownum := 0;  -- assumes @rownum is not already initialised elsewhere in the session

UPDATE aggregate_usage_input t
JOIN Temp t_sorted
     ON t_sorted.id = t.id 
SET t.rownum = @rownum := @rownum + 1;
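
A possible variation on the same idea, as a sketch rather than a tested fix: let the temporary table assign the row number itself through an AUTO_INCREMENT column, so the UPDATE becomes a plain join and no longer depends on the order in which the join happens to visit rows. The table name tmp_sorted, its column names, and the INT type of id are assumptions for illustration. It also assumes that INSERT ... SELECT ... ORDER BY hands out auto-increment values in the sorted order, which is worth spot-checking on your MySQL version.

```sql
-- Sketch only: tmp_sorted, its columns, and the INT type of id are assumptions.
DROP TEMPORARY TABLE IF EXISTS tmp_sorted;

CREATE TEMPORARY TABLE tmp_sorted (
    rownum INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, -- assigned at insert time
    id     INT NOT NULL,
    KEY idx_id (id)                                          -- lets the UPDATE join look rows up by id
);

-- Insert the ids in the desired order; rownum is filled in as 1, 2, 3, ...
INSERT INTO tmp_sorted (id)
SELECT t2.id
FROM   aggregate_usage_input t2
WHERE  t2.is_excluded_total_gallons = 0
       AND t2.is_excluded_cohort = 0
       AND t2.is_excluded_outlier = 0
ORDER  BY t2.occupant_bucket_id,
          t2.residence_type_bucket_id,
          t2.reading_year,
          t2.nthreading,
          t2.total_gallons;

-- Copy the precomputed numbers back; no user variable involved.
UPDATE aggregate_usage_input t
       JOIN tmp_sorted s
         ON s.id = t.id
SET    t.rownum = s.rownum;

DROP TEMPORARY TABLE IF EXISTS tmp_sorted;
```

This also avoids assigning a user variable inside an expression (@rownum := @rownum + 1), which is deprecated in recent MySQL 8.0 releases. Whether it is actually faster than the version above depends on the indexes on aggregate_usage_input, so compare timings on your own data.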
