简体   繁体   中英

Performance on an update/delete query

I want to update my table with values generated out of the same table.

My goal is to search all rows with proftxt like _NS and _WP and with the same ao, sum them up,
divide the value through the number of _H,_G,_L-Elements of that ao and add this value to the _H, _G and _L objects of that ao.

It is possible that there are only _NS and _WP rows for an ao. Than the routine should jump over this ao.

Example:

My data looks like:

an, ao, proftxt, value, year  
101 , 1, 'e_NSe', 5, 2006  
102 , 1, 'e_Ha', 1, 2006  
103 , 1, 'w_NSr', 4, 2006  

104 , 2, 'w_NSr', 2, 2006  
105 , 2, 'x_H05r', 4, 2006   
106 , 2, 'w_Gr', 2, 2006   
107 , 2, 'a_WPr', 4, 2006 

108 , 3, 'a_WPr', 4, 2006 

My data should be like:

an, ao, proftxt, value, year  
102 , 1, 'e_Ha', 10 2006  

103 , 2, 'x_H05r', 7, 2006  
103 , 2, 'w_Gr', 5, 2006  

108 , 3, 'a_WPr', 4, 2006  

My routine works for a small amount of testdata.

The update function ends, while working on the real database, after 13 hours successful.
But it only edited 5000 out of 210000 rows.

DECLARE @ENDYEAR INT
DECLARE @AO BIGINT
DECLARE @YEAR INT
DECLARE @ELEMENTS INT

--Parameter festlegen
SET @YEAR = 2006
SET @ENDYEAR = 2013 --Endyear+1  
SET @AO = 2  

WHILE(@YEAR<@ENDYEAR)
BEGIN     
WHILE (@AO >1)  --Do as long as Cursor is inside table
BEGIN  
    SET @AO = (SELECT TOP 1 ao FROM tbl_slp -- Search ao with _WP _NS
                 WHERE (proftxt LIKE '%[_]WP%' 
                         OR proftxt LIKE '%[_]NS%') 
                         AND year = @YEAR 
                         AND ao > @AO );

    SET @ELEMENTS = (SELECT COUNT(proftxt) --Count Number of _H, _G, _L elements
                            FROM tbl_SLP  
                            WHERE ao = @AO AND year = @YEAR AND  
                                (proftxt LIKE '%[_]H%' OR proftxt = NULL
                                  OR proftxt LIKE '%[_]G%'
                                  OR proftxt LIKE '%[_]L%'))

    IF (@ELEMENTS != 0)
    BEGIN
        UPDATE tbl_SLP --Update _H, _G, _L rows
        SET value = value + (SELECT SUM(CONVERT(float, value)) 
                            FROM tbl_SLP
                            WHERE (proftxt LIKE '%[_]WP%'
                                                            OR proftxt LIKE '%[_]NS%')
                                                            AND year = @YEAR 
                                                            AND ao = @AO)
                            /@ELEMENTS
        WHERE ao = @AO AND year = @YEAR 


        DELETE FROM tbl_SLP --delete_WP _NS rows
            WHERE ao= @AO 
                  AND year = @YEAR 
                  AND (proftxt LIKE '%[_]WP%' OR proftxt LIKE '%[_]NS%')

    END

    SET @AO = @AO +1
    END
    SET @YEAR = @YEAR +1
END

I know that the routine is super slow, but what can I do?

SQL is designed for set-based operations, not procedural flow-of-control style logic like your routine. Here's a set-based way of doing it, which I'm guessing will be much faster than the procedural way:

SET XACT_ABORT ON
SET NOCOUNT ON

BEGIN TRANSACTION

-- Create a temp table with each ao-year's sums and counts (sums of _NS and _WP record values and counts of _H, _G, and _L records)
SELECT T.ao, T.year, SUM(T.value) AS SumVals, (SELECT COUNT(*) FROM tbl_slp A WHERE A.ao = T.ao AND A.year = T.year AND (A.proftxt = NULL OR A.proftxt LIKE '%[_]H%' OR A.proftxt LIKE '%[_]G%' OR A.proftxt LIKE '%[_]L%')) AS CountOther
INTO #temp1 
FROM tbl_slp T
WHERE (T.proftxt LIKE '%[_]WP%' OR T.proftxt LIKE '%[_]NS%')
GROUP BY T.ao, T.year

-- Add "sum/count" for each ao-year to the _H, _G, and _L records for that year
UPDATE A
SET value = value + CONVERT(FLOAT, T.SumVals) / T.CountOther
FROM tbl_slp A 
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftxt = NULL OR A.proftxt LIKE '%[_]H%' OR A.proftxt LIKE '%[_]G%' OR A.proftxt LIKE '%[_]L%')

-- Now that we've distributed the _WP and _NS values, delete those records
DELETE A
FROM tbl_slp A
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftxt LIKE '%[_]WP%' OR A.proftxt LIKE '%[_]NS%')
 AND T.CountOther > 0

COMMIT TRANSACTION

For the sample set you gave, this produces the exact same results (except for the an column which I assume was a typo).

Full disclosure, this takes longer on the sample set than your routine does (17 ms compared to your 3 ms), but it should scale up to large data a whole lot better. I put it in a transaction for correctness but I'm not sure what your exact use case is, so that may be a disadvantage of my way since it will lock the pages (and may escalate to the whole table) for the entire time. Your routine didn't have any transactions, though, which could lead to bad data so if you keep your way make sure to put each update-delete pair in its own transaction.

Also, if you don't have an index on proftxt , add one! This will make a huge difference for both solutions.

Good luck. Here's the SQL Fiddle I used .

First, I see a couple of NULL-related problems. For instance, your inner loop is apparently waiting for @AO to become NULL before it will finish:

WHILE (@AO >1)

This will work when you set @AO to something that isn't there, but it is hard to read, and you probably want to write more explicit logic.

Next, this condition will always be false:

OR proftxt = NULL

The NULL value is not equal to itself. To test for this condition you would have to write:

OR proftxt IS NULL

Also, any NULL values will be omitted from your COUNT(proftxt). Try running the following sample query. It returns 1, along with the message "Warning: Null value is eliminated by an aggregate or other SET operation".

SELECT COUNT(fieldname) FROM (SELECT 1 AS fieldname UNION SELECT NULL AS fieldname) AS tablename

Finally, indexing the proftxt column won't fix your performance problems , because a LIKE condition with a leading wildcard can't use the index. You can think of an index like a telephone book that is alphabetized by last name. If you are looking for LastName LIKE '%mann', the index won't help you. You will still have to read through every entry in the telephone book to find all the last names ending in "mann". In database terms, that is called a "table scan", and is slow.

I would add a new column, which you could call proftxttype.

UPDATE tbl_SLP
SET proftxttype = 1
WHERE proftxt LIKE '%[_]WP%' 
OR proftxt LIKE '%[_]NS%'

UPDATE tbl_SLP
SET proftxttype = 2
WHERE proftxt LIKE '%[_]H%'
OR proftxt LIKE '%[_]G%'
OR proftxt LIKE '%[_]L%'
OR proftxt IS NULL

Then index this column:

CREATE NONCLUSTERED INDEX [IX_PROFTXTTYPE] ON [dbo].[TBL_SLP] (PROFTXTTYPE ASC) ON [PRIMARY]

Now rewrite your update in terms of proftxttype. Of course, whenever you insert or update proftxt, you will also have to update proftxttype. That is inavoidable, but SQL Server will take care of keeping the index up to date, so you don't have to worry about the index.

I know this sounds like a lot of work, but the core of your problem is that you're scanning through the entire table every time you want to find a proftxt value with a leading wildcard.

I combined both (really helpful!) answers. I added, as criticalfix told me, a coloum proftype to set an index on the table:

ALTER TABLE
ADD proftype CHAR(1)
GO

UPDATE tbl_SLPverrechnetWPNSP 
SET proftype = 'W'
WHERE proftxt LIKE '%[_]WP%' 

UPDATE tbl_SLP 
SET proftype = 'N'
WHERE proftxt LIKE '%[_]NS%'

UPDATE tbl_SLP
SET proftype = 'H'
WHERE proftxt LIKE '%[_]H%'
    OR proftxt IS NULL

UPDATE tbl_SLP 
SET proftype = 'G'
WHERE proftxt LIKE '%[_]G%'

UPDATE tbl_SLP
SET proftype = 'L'
WHERE proftxt LIKE '%[_]L%'

--set index on proftype
CREATE NONCLUSTERED INDEX [IX_PROFTYPE] ON [dbo].[tbl_SLP] (proftype ASC) ON [PRIMARY]
GO

Next I used the code from bob to edit my table.

SET XACT_ABORT ON
SET NOCOUNT ON

BEGIN TRANSACTION

-- Create a temp table with each ao-year's sums and counts (sums of N and W record values and counts of H, G, and L records)
SELECT T.ao, T.year, SUM(CONVERT(float, T.value)) AS SumVals, (SELECT COUNT(*) 
                                                FROM tbl_slp A 
                                                WHERE A.ao = T.ao 
                                                    AND A.year = T.year 
                                                    AND (A.proftype ='G' OR A.proftype = 'H' OR A.proftype = 'L' )) 
                                                AS CountOther
INTO #temp1 
FROM tbl_slp T
WHERE (T.proftype = 'W' OR T.proftype = 'N')
GROUP BY T.ao, T.year

-- Add "sum/count" for each ao-year to the H, G, and L records for that year
UPDATE A
SET value = value + CONVERT(FLOAT, T.SumVals) / T.CountOther
FROM tbl_slp A 
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftype = 'H' OR A.proftype = 'G' OR A.proftype LIKE 'L')

-- Now that we've distributed the W and N values, delete those records
DELETE A
FROM tbl_slp A
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftype = 'W' OR A.proftype = 'N')
 AND T.CountOther > 0

 DROP TABLE #temp1

COMMIT TRANSACTION

Thank you so much for the help! The routine ran only 3,5 minutes!!!

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM