Performance of update/delete query

Question

I want to update my table with values generated from the same table.

My goal is to find all rows whose proftxt contains _NS or _WP and that share the same ao, sum their values,
divide that sum by the number of _H, _G and _L elements of that ao, and add the result to each _H, _G and _L row of that ao.

It is possible that an ao has only _NS and _WP rows. In that case the routine should skip this ao.

Example:

My data looks like:

an, ao, proftxt, value, year  
101 , 1, 'e_NSe', 5, 2006  
102 , 1, 'e_Ha', 1, 2006  
103 , 1, 'w_NSr', 4, 2006  

104 , 2, 'w_NSr', 2, 2006  
105 , 2, 'x_H05r', 4, 2006   
106 , 2, 'w_Gr', 2, 2006   
107 , 2, 'a_WPr', 4, 2006 

108 , 3, 'a_WPr', 4, 2006

My data should look like:

an, ao, proftxt, value, year  
102 , 1, 'e_Ha', 10, 2006  

103 , 2, 'x_H05r', 7, 2006  
103 , 2, 'w_Gr', 5, 2006  

108 , 3, 'a_WPr', 4, 2006

My routine works for a small amount of test data.

On the real database, the update finished without errors after 13 hours,
but it had edited only 5,000 of 210,000 rows.

DECLARE @ENDYEAR INT
DECLARE @AO BIGINT
DECLARE @YEAR INT
DECLARE @ELEMENTS INT

--Set parameters
SET @YEAR = 2006
SET @ENDYEAR = 2013 --Endyear+1  
SET @AO = 2  

WHILE(@YEAR<@ENDYEAR)
BEGIN     
WHILE (@AO >1)  --Do as long as Cursor is inside table
BEGIN  
    SET @AO = (SELECT TOP 1 ao FROM tbl_slp -- Search ao with _WP _NS
                 WHERE (proftxt LIKE '%[_]WP%' 
                         OR proftxt LIKE '%[_]NS%') 
                         AND year = @YEAR 
                         AND ao > @AO );

    SET @ELEMENTS = (SELECT COUNT(proftxt) --Count Number of _H, _G, _L elements
                            FROM tbl_SLP  
                            WHERE ao = @AO AND year = @YEAR AND  
                                (proftxt LIKE '%[_]H%' OR proftxt = NULL
                                  OR proftxt LIKE '%[_]G%'
                                  OR proftxt LIKE '%[_]L%'))

    IF (@ELEMENTS != 0)
    BEGIN
        UPDATE tbl_SLP --Update _H, _G, _L rows
        SET value = value + (SELECT SUM(CONVERT(float, value)) 
                            FROM tbl_SLP
                            WHERE (proftxt LIKE '%[_]WP%'
                                                            OR proftxt LIKE '%[_]NS%')
                                                            AND year = @YEAR 
                                                            AND ao = @AO)
                            /@ELEMENTS
        WHERE ao = @AO AND year = @YEAR 


        DELETE FROM tbl_SLP --delete_WP _NS rows
            WHERE ao= @AO 
                  AND year = @YEAR 
                  AND (proftxt LIKE '%[_]WP%' OR proftxt LIKE '%[_]NS%')

    END

    SET @AO = @AO +1
    END
    SET @YEAR = @YEAR +1
END

I know that the routine is super slow, but what can I do?

Answer 1

SQL is designed for set-based operations, not for procedural flow-of-control logic like your routine. Here is a set-based way of doing it, which I expect to be much faster than the procedural way:

SET XACT_ABORT ON
SET NOCOUNT ON

BEGIN TRANSACTION

-- Create a temp table with each ao-year's sums and counts
-- (sums of _NS and _WP record values and counts of _H, _G, and _L records)
SELECT T.ao, T.year, SUM(T.value) AS SumVals,
       (SELECT COUNT(*)
        FROM tbl_slp A
        WHERE A.ao = T.ao
          AND A.year = T.year
          AND (A.proftxt IS NULL  -- IS NULL, because "= NULL" is never true
               OR A.proftxt LIKE '%[_]H%'
               OR A.proftxt LIKE '%[_]G%'
               OR A.proftxt LIKE '%[_]L%')) AS CountOther
INTO #temp1 
FROM tbl_slp T
WHERE (T.proftxt LIKE '%[_]WP%' OR T.proftxt LIKE '%[_]NS%')
GROUP BY T.ao, T.year

-- Add "sum/count" for each ao-year to the _H, _G, and _L records for that year
UPDATE A
SET value = value + CONVERT(FLOAT, T.SumVals) / T.CountOther
FROM tbl_slp A 
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftxt IS NULL OR A.proftxt LIKE '%[_]H%' OR A.proftxt LIKE '%[_]G%' OR A.proftxt LIKE '%[_]L%')

-- Now that we've distributed the _WP and _NS values, delete those records
DELETE A
FROM tbl_slp A
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftxt LIKE '%[_]WP%' OR A.proftxt LIKE '%[_]NS%')
 AND T.CountOther > 0

COMMIT TRANSACTION

For the sample set you gave, this produces exactly the same results (except for the an column, which I assume was a typo).
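The claim about the sample set can be checked without a SQL Server instance. The following is only a rough sketch, assuming SQLite (via Python's sqlite3 module) as a stand-in: SQLite has no `[_]` bracket syntax, so the literal underscore is matched with `ESCAPE '!'`, correlated subqueries replace `UPDATE ... FROM`, and the names `like_any`, `redistribute`, `run_sample` and `totals` are made up for this illustration.

```python
import sqlite3

# Sample rows from the question: (an, ao, proftxt, value, year)
SAMPLE = [
    (101, 1, "e_NSe", 5, 2006), (102, 1, "e_Ha", 1, 2006),
    (103, 1, "w_NSr", 4, 2006), (104, 2, "w_NSr", 2, 2006),
    (105, 2, "x_H05r", 4, 2006), (106, 2, "w_Gr", 2, 2006),
    (107, 2, "a_WPr", 4, 2006), (108, 3, "a_WPr", 4, 2006),
]


def like_any(column, tags):
    """Build "(col LIKE '%!_TAG%' ESCAPE '!' OR ...)" with a literal underscore."""
    parts = [f"{column} LIKE '%!_{tag}%' ESCAPE '!'" for tag in tags]
    return "(" + " OR ".join(parts) + ")"


def redistribute(conn):
    """Set-based version: one aggregate pass, one UPDATE, one DELETE."""
    src_s = like_any("s.proftxt", ["WP", "NS"])
    dst_d = "(d.proftxt IS NULL OR " + like_any("d.proftxt", ["H", "G", "L"]) + ")"
    src = like_any("proftxt", ["WP", "NS"])
    dst = "(proftxt IS NULL OR " + like_any("proftxt", ["H", "G", "L"]) + ")"
    match = "t.ao = tbl_slp.ao AND t.year = tbl_slp.year"
    with conn:  # commits on success, rolls back if a statement raises
        # Per (ao, year): sum of source values and number of target rows.
        conn.execute(f"""
            CREATE TEMP TABLE totals AS
            SELECT s.ao AS ao, s.year AS year, SUM(s.value) AS sum_vals,
                   (SELECT COUNT(*) FROM tbl_slp d
                     WHERE d.ao = s.ao AND d.year = s.year AND {dst_d})
                   AS count_other
            FROM tbl_slp s WHERE {src_s} GROUP BY s.ao, s.year""")
        # Add sum/count to every target row that has a matching total.
        conn.execute(f"""
            UPDATE tbl_slp
            SET value = value + (SELECT t.sum_vals * 1.0 / t.count_other
                                 FROM totals t WHERE {match})
            WHERE {dst} AND EXISTS (SELECT 1 FROM totals t WHERE {match})""")
        # Delete source rows only where their value was actually distributed.
        conn.execute(f"""
            DELETE FROM tbl_slp
            WHERE {src} AND EXISTS (SELECT 1 FROM totals t
                                    WHERE {match} AND t.count_other > 0)""")
        conn.execute("DROP TABLE totals")


def run_sample():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl_slp (an INTEGER, ao INTEGER, "
                 "proftxt TEXT, value REAL, year INTEGER)")
    conn.executemany("INSERT INTO tbl_slp VALUES (?, ?, ?, ?, ?)", SAMPLE)
    redistribute(conn)
    return conn.execute("SELECT an, ao, proftxt, value, year "
                        "FROM tbl_slp ORDER BY an").fetchall()


for row in run_sample():
    print(row)
```

Rows 102, 105 and 106 end up with the values 10, 7 and 5, and row 108 stays untouched because its ao has no _H, _G or _L rows to receive the value.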

Full disclosure: this takes longer on the sample set than your routine does (17 ms compared to your 3 ms), but it should scale to large data much better. I put it in a transaction for correctness, but I'm not sure what your exact use case is, so that may be a disadvantage of my approach, since it will lock the pages (and may escalate to the whole table) for the entire run. Your routine didn't use any transactions, though, which could lead to bad data, so if you keep your approach, make sure to put each update-delete pair in its own transaction.

Also, if you don't have an index on proftxt, add one! This will make a huge difference for both solutions.

Good luck. Here's the SQL Fiddle I used.

Answer 2

First, I see a couple of NULL-related problems. For instance, your inner loop apparently waits for @AO to become NULL before it finishes:

WHILE (@AO >1)

This works because @AO becomes NULL once the subquery finds no further matching row, but it is hard to read, and you probably want to write more explicit logic.

Next, this condition will never be true:

OR proftxt = NULL

The NULL value is not equal to anything, not even to itself. To test for this condition, you have to write:

OR proftxt IS NULL

Also, any NULL values will be omitted from your COUNT(proftxt). Try running the following sample query. It returns 1, along with the message "Warning: Null value is eliminated by an aggregate or other SET operation".

SELECT COUNT(fieldname) FROM (SELECT 1 AS fieldname UNION SELECT NULL AS fieldname) AS tablename
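The same NULL behaviour can be reproduced in a few lines. This is a small sketch using SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption of this example; SQLite does not print the warning message, but the comparison and counting results follow the same standard SQL rules):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "= NULL" yields NULL (unknown), never true; "IS NULL" yields true (1).
eq_null, is_null = conn.execute("SELECT NULL = NULL, NULL IS NULL").fetchone()

# COUNT(column) skips NULLs, COUNT(*) counts every row.
count_col, count_all = conn.execute(
    "SELECT COUNT(fieldname), COUNT(*) "
    "FROM (SELECT 1 AS fieldname UNION SELECT NULL AS fieldname) AS tablename"
).fetchone()

print(eq_null, is_null)      # None 1
print(count_col, count_all)  # 1 2
```

So if rows with a NULL proftxt are meant to be counted, both the comparison (IS NULL) and the aggregate (COUNT(*) rather than COUNT(proftxt)) have to be written accordingly.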

Finally, indexing the proftxt column won't fix your performance problems, because a LIKE condition with a leading wildcard can't use the index. You can think of an index as a telephone book that is alphabetized by last name. If you are looking for LastName LIKE '%mann', the index won't help you. You will still have to read through every entry in the telephone book to find all the last names ending in "mann". In database terms, that is called a "table scan", and it is slow.
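The telephone-book argument can be made visible by asking a database for its query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, purely as an illustration; SQL Server has its own execution-plan tooling and its own optimizer, and the table `phone_book`, the index `ix_last_name` and the helper `plan` are hypothetical names.

```python
import sqlite3


def plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phone_book (last_name TEXT, first_name TEXT)")
conn.execute("CREATE INDEX ix_last_name ON phone_book (last_name)")

# Leading wildcard: the index order is useless, so every row is visited.
suffix_plan = plan(
    conn, "SELECT * FROM phone_book WHERE last_name LIKE '%mann'")

# Exact match: the index can be searched directly.
exact_plan = plan(
    conn, "SELECT * FROM phone_book WHERE last_name = 'Hartmann'")

print(suffix_plan)
print(exact_plan)
```

The first plan reports a scan of the table, the second a search using ix_last_name; the exact wording of the plan text differs between SQLite versions.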

I would add a new column, which you could call proftxttype.

UPDATE tbl_SLP
SET proftxttype = 1
WHERE proftxt LIKE '%[_]WP%' 
OR proftxt LIKE '%[_]NS%'

UPDATE tbl_SLP
SET proftxttype = 2
WHERE proftxt LIKE '%[_]H%'
OR proftxt LIKE '%[_]G%'
OR proftxt LIKE '%[_]L%'
OR proftxt IS NULL

Then index this column:

CREATE NONCLUSTERED INDEX [IX_PROFTXTTYPE] ON [dbo].[TBL_SLP] (PROFTXTTYPE ASC) ON [PRIMARY]

Now rewrite your update in terms of proftxttype. Of course, whenever you insert or update proftxt, you will also have to update proftxttype. That is unavoidable, but SQL Server will take care of keeping the index up to date, so you don't have to worry about the index.
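One way to avoid forgetting that extra update is to let the database maintain the column. The following is a minimal sketch of the idea with triggers, again assuming SQLite via Python's sqlite3 module rather than SQL Server (whose trigger syntax differs). The trigger names, the helpers `make_db` and `type_of`, and the rule that a row is checked as a _WP/_NS source before it is checked as a target are assumptions of this example, not something stated above.

```python
import sqlite3

# Assumed precedence: source rows (_WP/_NS) get type 1 first; NULL proftxt
# and _H/_G/_L rows get type 2; anything else stays NULL.
CLASSIFY = """CASE
    WHEN NEW.proftxt LIKE '%!_WP%' ESCAPE '!'
      OR NEW.proftxt LIKE '%!_NS%' ESCAPE '!' THEN 1
    WHEN NEW.proftxt IS NULL
      OR NEW.proftxt LIKE '%!_H%' ESCAPE '!'
      OR NEW.proftxt LIKE '%!_G%' ESCAPE '!'
      OR NEW.proftxt LIKE '%!_L%' ESCAPE '!' THEN 2
END"""


def make_db():
    """Create the table, the index and two triggers that set proftxttype."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(f"""
        CREATE TABLE tbl_slp (
            an INTEGER PRIMARY KEY, ao INTEGER, proftxt TEXT,
            value REAL, year INTEGER, proftxttype INTEGER);
        CREATE INDEX ix_proftxttype ON tbl_slp (proftxttype);

        CREATE TRIGGER trg_slp_insert AFTER INSERT ON tbl_slp
        BEGIN
            UPDATE tbl_slp SET proftxttype = {CLASSIFY} WHERE an = NEW.an;
        END;

        CREATE TRIGGER trg_slp_update AFTER UPDATE OF proftxt ON tbl_slp
        BEGIN
            UPDATE tbl_slp SET proftxttype = {CLASSIFY} WHERE an = NEW.an;
        END;
    """)
    return conn


def type_of(conn, an):
    """Look up the derived type of one row."""
    return conn.execute(
        "SELECT proftxttype FROM tbl_slp WHERE an = ?", (an,)).fetchone()[0]


conn = make_db()
conn.executemany(
    "INSERT INTO tbl_slp (an, ao, proftxt, value, year) VALUES (?, ?, ?, ?, ?)",
    [(101, 1, "e_NSe", 5, 2006), (102, 1, "e_Ha", 1, 2006),
     (109, 4, None, 3, 2006)])
types_after_insert = [type_of(conn, an) for an in (101, 102, 109)]

# Changing proftxt re-classifies the row without touching proftxttype by hand.
conn.execute("UPDATE tbl_slp SET proftxt = 'a_WPr' WHERE an = 102")
type_after_update = type_of(conn, 102)
print(types_after_insert, type_after_update)
```

The trade-off is a little extra work on every insert and on every change of proftxt, in exchange for queries that can filter on a small indexed column instead of scanning with a leading wildcard.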

I know this sounds like a lot of work, but the core of your problem is that you're scanning through the entire table every time you want to find a proftxt value with a leading wildcard.

Answer 3

I combined both (really helpful!) answers. As criticalfix told me, I added a column proftype so that I could put an index on it:

ALTER TABLE tbl_SLP
ADD proftype CHAR(1)
GO

UPDATE tbl_SLP
SET proftype = 'W'
WHERE proftxt LIKE '%[_]WP%' 

UPDATE tbl_SLP 
SET proftype = 'N'
WHERE proftxt LIKE '%[_]NS%'

UPDATE tbl_SLP
SET proftype = 'H'
WHERE proftxt LIKE '%[_]H%'
    OR proftxt IS NULL

UPDATE tbl_SLP 
SET proftype = 'G'
WHERE proftxt LIKE '%[_]G%'

UPDATE tbl_SLP
SET proftype = 'L'
WHERE proftxt LIKE '%[_]L%'

--set index on proftype
CREATE NONCLUSTERED INDEX [IX_PROFTYPE] ON [dbo].[tbl_SLP] (proftype ASC) ON [PRIMARY]
GO

Next I used the code from bob to edit my table.

SET XACT_ABORT ON
SET NOCOUNT ON

BEGIN TRANSACTION

-- Create a temp table with each ao-year's sums and counts (sums of N and W record values and counts of H, G, and L records)
SELECT T.ao, T.year, SUM(CONVERT(float, T.value)) AS SumVals, (SELECT COUNT(*) 
                                                FROM tbl_slp A 
                                                WHERE A.ao = T.ao 
                                                    AND A.year = T.year 
                                                    AND (A.proftype ='G' OR A.proftype = 'H' OR A.proftype = 'L' )) 
                                                AS CountOther
INTO #temp1 
FROM tbl_slp T
WHERE (T.proftype = 'W' OR T.proftype = 'N')
GROUP BY T.ao, T.year

-- Add "sum/count" for each ao-year to the H, G, and L records for that year
UPDATE A
SET value = value + CONVERT(FLOAT, T.SumVals) / T.CountOther
FROM tbl_slp A 
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftype = 'H' OR A.proftype = 'G' OR A.proftype = 'L')

-- Now that we've distributed the W and N values, delete those records
DELETE A
FROM tbl_slp A
INNER JOIN #temp1 T ON A.ao = T.ao AND A.year = T.year
WHERE (A.proftype = 'W' OR A.proftype = 'N')
 AND T.CountOther > 0

 DROP TABLE #temp1

COMMIT TRANSACTION

Thank you so much for the help! The routine ran for only 3.5 minutes!!!

Answer 1
5 2013-05-06 15:24:33

Answer 2
1 2013-05-06 15:34:24

Answer 3
1 accepted 2013-05-08 18:21:36
