Optimize/Alternatives to this self referencing update query

Question

I have this update query:

UPDATE aggregate_usage_input t 
       JOIN (SELECT t2.id 
             FROM   aggregate_usage_input t2 
             WHERE  t2.is_excluded_total_gallons = 0 
                    AND t2.is_excluded_cohort = 0 
                    AND t2.is_excluded_outlier = 0 
             ORDER  BY t2.occupant_bucket_id, 
                       t2.residence_type_bucket_id, 
                       t2.reading_year, 
                       t2.nthreading, 
                       t2.total_gallons)t_sorted 
         ON t_sorted.id = t.id 
SET    t.rownum = @rownum := @rownum + 1

which updates the rownum field (which is actually an ordering field) based on the sort order.

The SELECT on its own takes 9 seconds, which is acceptable given the ORDER BY.

The update part of this query takes a very long time: over 5 minutes on a table of 400,000 records. We need to get this under a minute or so.

How can I speed it up, or is there an alternative way to solve this?

Answer 1

The subquery is probably what is slowing you down here. In practice I've noticed that separating the subquery out into a temporary table (or a table variable, on engines that have them) is faster.

Try:

CREATE TEMPORARY TABLE Temp (id int);

INSERT INTO Temp
SELECT t2.id 
FROM   aggregate_usage_input t2 
WHERE  t2.is_excluded_total_gallons = 0 
    AND t2.is_excluded_cohort = 0 
    AND t2.is_excluded_outlier = 0 
ORDER  BY t2.occupant_bucket_id, 
    t2.residence_type_bucket_id, 
    t2.reading_year, 
    t2.nthreading, 
    t2.total_gallons;

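-- Assumption: numbering should start at 1; skip this if @rownum is initialised elsewhere.
SET @rownum := 0;

UPDATE aggregate_usage_input t
JOIN Temp t_sorted
     ON t_sorted.id = t.id 
SET t.rownum = @rownum := @rownum + 1;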
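
One thing to watch with the user-variable approach is that the order in which the joined UPDATE visits rows is not guaranteed to follow the ORDER BY used to fill the temporary table. A possible variation, sketched below, lets an AUTO_INCREMENT column in the temporary table record the sorted position and then copies that value across. The table name tmp_sorted is hypothetical, and the sketch assumes id is a unique INT and that rownum should start at 1. It relies on MySQL inserting the rows of an INSERT ... SELECT ... ORDER BY in the sorted order, so check the resulting numbering on your own version before depending on it.

```sql
-- Hypothetical temp table: rownum is assigned in insertion order.
CREATE TEMPORARY TABLE tmp_sorted (
    rownum INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    id     INT NOT NULL,
    UNIQUE KEY (id)   -- assumes id is unique; lets the UPDATE join use an index
);

-- Fill the temp table in the desired order; AUTO_INCREMENT captures the position.
INSERT INTO tmp_sorted (id)
SELECT t2.id
FROM   aggregate_usage_input t2
WHERE  t2.is_excluded_total_gallons = 0
       AND t2.is_excluded_cohort = 0
       AND t2.is_excluded_outlier = 0
ORDER  BY t2.occupant_bucket_id,
          t2.residence_type_bucket_id,
          t2.reading_year,
          t2.nthreading,
          t2.total_gallons;

-- Copy the stored position instead of incrementing a user variable during the join.
UPDATE aggregate_usage_input t
       JOIN tmp_sorted s
         ON s.id = t.id
SET    t.rownum = s.rownum;

DROP TEMPORARY TABLE tmp_sorted;
```

Because the UPDATE only copies a stored value, the result no longer depends on the row order the join happens to use, and the unique key on id should let each lookup hit an index.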
