SQL UPDATE query - value depends on other rows
There is a SQL Server temporary table; call it TableA. The table structure is as follows:
CREATE TABLE #TableA
(
ID BIGINT IDENTITY (1, 1) PRIMARY KEY,
MapVal1 BIGINT NOT NULL,
MapVal2 BIGINT NOT NULL,
IsActual BIT NULL
)
The table is already filled with some mappings of MapVal1 to MapVal2. The issue is that not all of the mappings should be flagged as actual; the IsActual column exists for this purpose. Currently IsActual is NULL for every row. The task is to write a query that updates the IsActual column. The UPDATE query should follow these conditions:
To make it clear, here is an example of the result I want to obtain:
The result should be that every MapVal1 is mapped to just one MapVal2 and, vice versa, every MapVal2 is mapped to just one MapVal1.
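One way to check this requirement once any candidate UPDATE has run is to look for values that still have more than one actual mapping. This is only a sketch and not part of the original post; the output column names (ColumnName, MappedValue, ActualCount) are made up for illustration. If the requirement holds, it should return no rows.

```sql
-- Hypothetical sanity check: lists every MapVal1 or MapVal2 value
-- that is flagged as actual in more than one row (expect an empty result).
SELECT 'MapVal1' AS ColumnName, MapVal1 AS MappedValue, COUNT(*) AS ActualCount
FROM #TableA
WHERE IsActual = 1
GROUP BY MapVal1
HAVING COUNT(*) > 1
UNION ALL
SELECT 'MapVal2', MapVal2, COUNT(*)
FROM #TableA
WHERE IsActual = 1
GROUP BY MapVal2
HAVING COUNT(*) > 1;
```

Note that this only verifies uniqueness among the flagged rows; it does not tell you whether more mappings could have been flagged as actual.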
I have written a SQL query to solve my task:
IF OBJECT_ID('tempdb..#TableA') IS NOT NULL
BEGIN
DROP TABLE #TableA
END
CREATE TABLE #TableA
(
ID BIGINT IDENTITY (1, 1) PRIMARY KEY,
MapVal1 BIGINT NOT NULL,
MapVal2 BIGINT NOT NULL,
IsActual BIT NULL
)
-- insert input data
INSERT INTO #TableA (MapVal1, MapVal2)
SELECT 1, 1
UNION ALL SELECT 1, 3
UNION ALL SELECT 1, 4
UNION ALL SELECT 2, 1
UNION ALL SELECT 2, 3
UNION ALL SELECT 2, 4
UNION ALL SELECT 3, 3
UNION ALL SELECT 3, 4
UNION ALL SELECT 4, 3
UNION ALL SELECT 4, 4
UNION ALL SELECT 6, 7
UNION ALL SELECT 7, 8
UNION ALL SELECT 7, 9
UNION ALL SELECT 8, 8
UNION ALL SELECT 8, 9
UNION ALL SELECT 9, 8
UNION ALL SELECT 9, 9
CREATE NONCLUSTERED INDEX IX_Mapping_MapVal1 ON #TableA (MapVal1);
CREATE NONCLUSTERED INDEX IX_Mapping_MapVal2 ON #TableA (MapVal2);
-- UPDATE of #TableA is starting here
-- every one-to-one mapping should be actual
UPDATE m1 SET
m1.IsActual = 1
FROM #TableA m1
LEFT JOIN #TableA m2
ON m1.MapVal1 = m2.MapVal1 AND m1.ID <> m2.ID
LEFT JOIN #TableA m3
ON m1.MapVal2 = m3.MapVal2 AND m1.ID <> m3.ID
WHERE m2.ID IS NULL AND m3.ID IS NULL
-- the update for every one-to-many or many-to-many mapping is more complicated
-- it would be great to rewrite this part of the query without any loop
DECLARE @MapVal1 BIGINT
DECLARE @MapVal2 BIGINT
DECLARE @i BIGINT
DECLARE @iMax BIGINT
DECLARE @LoopCount INT = 0
SELECT
@iMax = MAX (m.ID)
FROM #TableA m
SELECT
@i = MIN (m.ID)
FROM #TableA m
WHERE m.IsActual IS NULL
WHILE @i <= @iMax
BEGIN
SELECT @LoopCount = @LoopCount + 1
SELECT
@MapVal1 = m.MapVal1,
@MapVal2 = m.MapVal2
FROM #TableA m
WHERE m.ID = @i
IF EXISTS
(
SELECT *
FROM #TableA m
WHERE
m.ID < @i
AND
(m.MapVal1 = @MapVal1
OR m.MapVal2 = @MapVal2)
AND m.IsActual IS NULL
)
BEGIN
UPDATE m SET
m.IsActual = 0
FROM #TableA m
WHERE m.ID = @i
END
SELECT @i = MIN (m.ID)
FROM #TableA m
WHERE
m.ID > @i
AND m.IsActual IS NULL
END
UPDATE m SET
m.IsActual = 1
FROM #TableA m
WHERE m.IsActual IS NULL
SELECT * FROM #TableA
But, as expected, the performance of the query with the loop is very bad, especially when the input table holds millions of rows. I spent a lot of time trying to produce a query without a loop to reduce the execution time, but without success.
Could anybody advise me how to improve the performance of my query? It would be great to get a query without a loop.
Using a loop does not imply you need to update the table one record at a time. It may help if each individual UPDATE statement updates multiple records.
Consider all possible combinations of MapVal1 and MapVal2 as a matrix. Every time you flag a cell as 'actual', you can flag an entire row and an entire column as 'not actual'.
The simplest way to do this is by following these steps.
Here's an implementation:
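DECLARE @Rows INT = 1 -- start at 1 so the loop body runs at least once
WHILE @Rows > 0
BEGIN
    ;WITH MakeActual AS (
        -- pick the next mapping that has not been flagged yet
        SELECT TOP (1) MapVal1, MapVal2
        FROM #TableA
        WHERE IsActual IS NULL
        ORDER BY MapVal1, MapVal2
    )
    -- flag the picked mapping as actual, and every other unflagged mapping
    -- that shares its MapVal1 or MapVal2 as not actual
    UPDATE a
    SET IsActual = CASE WHEN a.MapVal1 = m.MapVal1 AND a.MapVal2 = m.MapVal2 THEN 1 ELSE 0 END
    FROM #TableA a
    INNER JOIN MakeActual m ON a.MapVal1 = m.MapVal1 OR a.MapVal2 = m.MapVal2
    WHERE a.IsActual IS NULL;
    SET @Rows = @@ROWCOUNT -- becomes 0 once no unflagged mappings remain
END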
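To see what a single pass would change before running the loop, the same pick can be turned into a read-only preview. This is a sketch under the assumption that #TableA from the question exists; the NewIsActual column name is made up for illustration.

```sql
-- Read-only preview of one iteration: no rows are modified.
WITH MakeActual AS (
    -- same pick as in the loop: the first unflagged mapping
    SELECT TOP (1) MapVal1, MapVal2
    FROM #TableA
    WHERE IsActual IS NULL
    ORDER BY MapVal1, MapVal2
)
SELECT a.ID, a.MapVal1, a.MapVal2,
       CASE WHEN a.MapVal1 = m.MapVal1 AND a.MapVal2 = m.MapVal2
            THEN 1 ELSE 0 END AS NewIsActual -- hypothetical column name
FROM #TableA a
INNER JOIN MakeActual m ON a.MapVal1 = m.MapVal1 OR a.MapVal2 = m.MapVal2
WHERE a.IsActual IS NULL
ORDER BY a.ID;
```

On the question's sample data, with every IsActual still NULL, the first pick is (1, 1), so the preview should list that row as actual and the rows (1, 3), (1, 4) and (2, 1) as not actual.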
The number of loop iterations equals the number of 'actual' mappings. The actual performance gain depends a lot on the data. If the majority of mappings are one-to-one (i.e. there are hardly any non-actual mappings), then my algorithm will make little difference. Therefore, it may be wise to keep the initial UPDATE statement from your own code sample (the one with the comment "every one-to-one mapping should be actual").
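As an aside, that initial one-to-one UPDATE could probably also be written without the two self-joins, by counting rows per MapVal1 and per MapVal2 with windowed aggregates. This is only a sketch, not something from the question: the names Counted, Val1Count and Val2Count are made up for illustration, and whether it is actually faster than the LEFT JOIN version would have to be measured on your data.

```sql
-- Sketch: flag mappings whose MapVal1 and MapVal2 each occur in exactly one row.
WITH Counted AS (
    SELECT IsActual,
           COUNT(*) OVER (PARTITION BY MapVal1) AS Val1Count, -- rows sharing this MapVal1
           COUNT(*) OVER (PARTITION BY MapVal2) AS Val2Count  -- rows sharing this MapVal2
    FROM #TableA
)
UPDATE Counted
SET IsActual = 1
WHERE Val1Count = 1 AND Val2Count = 1;
```

On the question's sample data only the mapping (6, 7) satisfies both conditions, which matches what the original LEFT JOIN version flags.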
It may also help to play around with the indexes. This one should help to further optimize the ORDER BY MapVal1, MapVal2 clause:
CREATE NONCLUSTERED INDEX IX_MapVals ON #TableA (MapVal1, MapVal2)