UPDATE FROM in Azure SQL DW?
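I'm getting an error in Azure SQL DW when trying to execute an UPDATE FROM query. The error is "FROM clause in UPDATE and DELETE statements cannot contain subquery sources or joins".
Is this specific to SQL DW? Otherwise I don't see anything wrong with this query. If it is a limitation of SQL DW, what is the alternative?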
-- Permanent fact table with 5 billion rows
CREATE TABLE FactTable (Id1 INT, Id2 INT, EmailAddress NVARCHAR(100), Value1 INT)
WITH (DISTRIBUTION = HASH(EmailAddress));
-- Staging fact table with 10 million rows
CREATE TABLE StageTable (Id1 INT, Id2 INT, EmailAddress NVARCHAR(100), Value1 INT)
WITH (DISTRIBUTION = HASH(EmailAddress), HEAP);
-- Add a secondary index that should help with joining to StageTable
CREATE NONCLUSTERED INDEX ix ON FactTable (Id1, Id2);
UPDATE fact
SET
Value1 = CASE WHEN stage.Value1 > fact.Value1 THEN stage.Value1 ELSE fact.Value1 END
FROM FactTable AS fact
INNER JOIN StageTable AS stage ON fact.Id1 = stage.Id1 AND fact.Id2 = stage.Id2
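According to the documentation, Azure SQL Data Warehouse does support UPDATE, but it does not support ANSI joins in the FROM clause. You can use CTAS to work around this. A simple two-table update: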
UPDATE dbo.FactTable
SET
Value1 = CASE WHEN stage.Value1 > dbo.FactTable.Value1 THEN stage.Value1 ELSE dbo.FactTable.Value1 END
FROM dbo.StageTable AS stage
WHERE dbo.FactTable.Id1 = stage.Id1
AND dbo.FactTable.Id2 = stage.Id2;
A more complex example using CTAS, copied wholesale from the main UPDATE documentation page:
-- Create an interim table
CREATE TABLE CTAS_acs
WITH (DISTRIBUTION = ROUND_ROBIN)
AS
SELECT ISNULL(CAST([EnglishProductCategoryName] AS NVARCHAR(50)),0) AS [EnglishProductCategoryName]
, ISNULL(CAST([CalendarYear] AS SMALLINT),0) AS [CalendarYear]
, ISNULL(CAST(SUM([SalesAmount]) AS MONEY),0) AS [TotalSalesAmount]
FROM [dbo].[FactInternetSales] AS s
JOIN [dbo].[DimDate] AS d ON s.[OrderDateKey] = d.[DateKey]
JOIN [dbo].[DimProduct] AS p ON s.[ProductKey] = p.[ProductKey]
JOIN [dbo].[DimProductSubCategory] AS u ON p.[ProductSubcategoryKey] = u.[ProductSubcategoryKey]
JOIN [dbo].[DimProductCategory] AS c ON u.[ProductCategoryKey] = c.[ProductCategoryKey]
WHERE [CalendarYear] = 2004
GROUP BY
[EnglishProductCategoryName]
, [CalendarYear]
;
-- Use an implicit join to perform the update
UPDATE AnnualCategorySales
SET AnnualCategorySales.TotalSalesAmount = CTAS_ACS.TotalSalesAmount
FROM CTAS_acs
WHERE CTAS_acs.[EnglishProductCategoryName] = AnnualCategorySales.[EnglishProductCategoryName]
AND CTAS_acs.[CalendarYear] = AnnualCategorySales.[CalendarYear]
;
--Drop the interim table
DROP TABLE CTAS_acs
;
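Applied to the tables in the question, the same interim-table pattern might look like the sketch below. It assumes StageTable can hold several rows per (Id1, Id2) pair, so the interim table keeps only the largest Value1 for each pair. The name CTAS_stage and the ROUND_ROBIN distribution are illustrative choices, not something taken from the question or the documentation.

```sql
-- Interim table: one row per (Id1, Id2) carrying the largest staged value.
-- CTAS_stage is a hypothetical name; ROUND_ROBIN simply mirrors the example above.
CREATE TABLE CTAS_stage
WITH (DISTRIBUTION = ROUND_ROBIN)
AS
SELECT Id1
,      Id2
,      MAX(Value1) AS Value1
FROM   dbo.StageTable
GROUP BY
       Id1
,      Id2
;

-- Implicit join: the FROM clause names a single table, with no JOIN and no subquery.
-- Only rows whose staged value is larger are touched, which matches the CASE in the question.
UPDATE dbo.FactTable
SET    Value1 = CTAS_stage.Value1
FROM   CTAS_stage
WHERE  dbo.FactTable.Id1 = CTAS_stage.Id1
AND    dbo.FactTable.Id2 = CTAS_stage.Id2
AND    CTAS_stage.Value1 > dbo.FactTable.Value1
;

-- Drop the interim table
DROP TABLE CTAS_stage
;
```

Moving the comparison into the WHERE clause leaves rows with a smaller or equal staged value untouched rather than rewriting them with their existing value.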
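I've found it good practice with ASDW (and APS/PDW) to avoid large-batch updates like the plague.
Here is a pure CTAS alternative that will be faster when you are updating a large number of rows.
It assumes id1 is a reasonably good distribution key and that the stage table has far fewer rows than the fact table, which makes replicating it feasible. This strategy should eliminate data movement between nodes.
If you have a very large staging table, you will get better performance by creating a surrogate column in each table composed of id1 and id2, and then distributing both tables on the hash of that column.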
create table FactTable (
id1 int,
id2 int,
value1 int)
with (distribution = hash(id1));
create table StageTable (
id1 int,
id2 int,
value1 int)
with (distribution = replicate);
create table UpdatedFact
with (distribution = hash(id1))
as
select f.id1,
f.id2,
case when s.id1 is not null and s.value1 > f.value1
then s.value1
else f.value1
end as value1
from FactTable f
left outer join StageTable s
on s.id1 = f.id1
and s.id2 = f.id2;
truncate table FactTable;
alter table UpdatedFact switch to FactTable;
drop table UpdatedFact;
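For the very large staging table case mentioned above, the surrogate-column idea could be sketched roughly as follows. The names IdKey, FactTableKeyed and StageTableKeyed are hypothetical, and the sketch assumes id1 and id2 are never NULL (CONCAT would turn a NULL into an empty string and could make unrelated rows match).

```sql
-- IdKey combines id1 and id2 into one string so both tables share a distribution key.
-- varchar(23) fits two signed 32-bit integers plus the separator.
create table FactTableKeyed
with (distribution = hash(IdKey))
as
select cast(concat(id1, '-', id2) as varchar(23)) as IdKey,
       id1,
       id2,
       value1
from FactTable;

create table StageTableKeyed
with (distribution = hash(IdKey))
as
select cast(concat(id1, '-', id2) as varchar(23)) as IdKey,
       id1,
       id2,
       value1
from StageTable;

-- Both sides are distributed on IdKey, so matching rows sit on the same distribution.
create table UpdatedFact
with (distribution = hash(IdKey))
as
select f.IdKey,
       f.id1,
       f.id2,
       case when s.value1 > f.value1
            then s.value1
            else f.value1
       end as value1
from FactTableKeyed f
left outer join StageTableKeyed s
  on s.IdKey = f.IdKey;
```

Because UpdatedFact now carries the extra IdKey column, the final swap would have to target FactTableKeyed (or a fact table that already has that column) rather than the original three-column FactTable.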
A simplified version of your attempt will work. Just get rid of the join and update one table from the other.
update FactTable
set this = that
from StageTable s where s.something = FactTable.something
Whether this is the best approach depends on your situation, but it will execute without raising an error.