T-SQL UDF vs full expression run-time
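I'm trying to make my query more readable by using a UDF in SQL Server, but the run time increases drastically when I use the function.
Here is the function I'm using: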
create function DL.trim_all(@input varchar(max))
returns varchar(max)
as begin
set @input=replace(replace(replace(@input,' ',''),')',''),'(','')
return @input
end
instead of writing:
SELECT
CASE WHEN replace(replace(replace([FULL_NAME_1],' ',''),')',''),'(','')=replace(replace(replace([FULL_NAME_2],' ',''),')',''),'(','') THEN 1 ELSE 0 END AS [name_match],
CASE WHEN replace(replace(replace([ADDRESS_1],' ',''),')',''),'(','')=replace(replace(replace([ADDRESS_2],' ',''),')',''),'(','') THEN 1 ELSE 0 END AS [adrs_match]
.
.
.
FROM
TABLE_1
for 20 different fields.
With the function the run time is 12.5 minutes; without it the run time is 45 seconds.
Any ideas?
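Taking John's idea one step further: convert the scalar function into an inline table-valued function and invoke it with CROSS APPLY for each pair of columns. You will probably get better performance, at the cost of a more cumbersome query: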
CREATE function DL.DoesItMatch(@s1 varchar(500),@s2 varchar(500))
returns table -- returns a table with a single row and a single column
as return
SELECT
CASE WHEN replace(replace(replace(@s1,' ',''),')',''),'(','') =
replace(replace(replace(@s2,' ',''),')',''),'(','') THEN 1 ELSE 0 END As IsMatch;
And the query:
SELECT NameMatch.IsMatch AS [name_match],
AddressMatch.IsMatch AS adrs_match
.
.
.
FROM TABLE_1
CROSS APPLY DL.DoesItMatch(FULL_NAME_1, FULL_NAME_2) As NameMatch
CROSS APPLY DL.DoesItMatch(ADDRESS_1, ADDRESS_2) As AddressMatch
I can't imagine a huge boost, but how about a different approach:
create function DL.DoesItMatch(@s1 varchar(500),@s2 varchar(500))
returns bit
as begin
return CASE WHEN replace(replace(replace(@s1,' ',''),')',''),'(','')=replace(replace(replace(@s2,' ',''),')',''),'(','') THEN 1 ELSE 0 END
end
Then call the function like this:
SELECT
DL.DoesItMatch([FULL_NAME_1],[FULL_NAME_2]) AS [name_match],
...
FROM
TABLE_1
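Inline is always the way to go. Period. Even setting aside the fact that T-SQL scalar UDFs inhibit parallelism, iTVFs are faster, need fewer resources (CPU, memory and IO), are easier to maintain, and are easier to troubleshoot, analyze, profile and trace. For fun I put together a performance test comparing Zohar's iTVF to John's scalar UDF. I created 250K rows, tested a basic SELECT against both, then ran a second test with an ORDER BY against the heap to force a sort.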
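The test scripts call DL.DoesItMatch_ITVF, which is never defined in this thread. Presumably it is the inline function from the earlier answer, created under a second name so that it can coexist with the scalar DL.DoesItMatch. A minimal sketch under that assumption (the DL schema is assumed to exist already):

```sql
-- Assumption: same body as the inline DL.DoesItMatch shown earlier in the thread,
-- only the name differs so that both versions can be installed side by side.
CREATE FUNCTION DL.DoesItMatch_ITVF(@s1 varchar(500), @s2 varchar(500))
RETURNS TABLE -- one row, one column
AS RETURN
SELECT
    CASE WHEN replace(replace(replace(@s1, ' ', ''), ')', ''), '(', '') =
              replace(replace(replace(@s2, ' ', ''), ')', ''), '(', '')
         THEN 1 ELSE 0 END AS IsMatch;
```

The tests reference the column as f.isMatch, which resolves to IsMatch only when the database uses a case-insensitive collation.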
Sample data:
-- Sample Data
BEGIN
SET NOCOUNT ON;
IF OBJECT_ID('tempdb..#tmp','U') IS NOT NULL DROP TABLE #tmp;
SELECT TOP (250000) col1 = '('+LEFT(NEWID(),10)+')', col2 = '('+LEFT(NEWID(),10)+')'
INTO #tmp
FROM sys.all_columns a, sys.all_columns;
UPDATE #tmp SET col1 = col2 WHERE LEFT(col1,2) = LEFT(col2,2)
END
Performance test:
PRINT 'scalar, no sort'+CHAR(10)+REPLICATE('-',60);
GO
DECLARE @st DATETIME = GETDATE(), @isMatch BIT;
SELECT @isMatch = DL.DoesItMatch(t.col1,t.col2)
FROM #tmp AS t;
PRINT DATEDIFF(MS,@st,GETDATE())
GO 3
PRINT CHAR(10)+'ITVF, no sort'+CHAR(10)+REPLICATE('-',60);
GO
DECLARE @st DATETIME = GETDATE(), @isMatch BIT;
SELECT @isMatch = f.isMatch
FROM #tmp AS t
CROSS APPLY DL.DoesItMatch_ITVF(t.col1,t.col2) AS f;
PRINT DATEDIFF(MS,@st,GETDATE())
GO 3
PRINT CHAR(10)+'scalar, sorted set'+CHAR(10)+REPLICATE('-',60);
GO
DECLARE @st DATETIME = GETDATE(), @isMatch BIT;
SELECT @isMatch = DL.DoesItMatch(t.col1,t.col2)
FROM #tmp AS t
ORDER BY DL.DoesItMatch(t.col1,t.col2);
PRINT DATEDIFF(MS,@st,GETDATE())
GO 3
PRINT CHAR(10)+'ITVF, sorted set'+CHAR(10)+REPLICATE('-',60);
GO
DECLARE @st DATETIME = GETDATE(), @isMatch BIT;
SELECT @isMatch = f.isMatch
FROM #tmp AS t
CROSS APPLY DL.DoesItMatch_ITVF(t.col1,t.col2) AS f
ORDER BY f.isMatch;
PRINT DATEDIFF(MS,@st,GETDATE())
GO 3
Test results (elapsed milliseconds per run):
scalar, no sort
------------------------------------------------------------
Beginning execution loop
844
843
840
Batch execution completed 3 times.
ITVF, no sort
------------------------------------------------------------
Beginning execution loop
270
270
270
Batch execution completed 3 times.
scalar, sorted set
------------------------------------------------------------
Beginning execution loop
937
930
936
Batch execution completed 3 times.
ITVF, sorted set
------------------------------------------------------------
Beginning execution loop
196
190
190
Batch execution completed 3 times.
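So the iTVF is about 3 times faster when a parallel plan is not needed (roughly 840 ms vs 270 ms) and about 5 times faster when a parallel plan is needed (roughly 935 ms vs 190 ms). I have also tested iTVFs against scalar and multi-statement table-valued UDFs in other posts.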
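You can use Scalar UDF Inlining in SQL Server 2019. With it you can keep the same UDF you wrote and automatically get performance on par with the query that does not use the UDF.
The UDF you provided meets the criteria for inlineability, so you are in good shape. The documentation for the UDF inlining feature is here: https://docs.microsoft.com/zh-cn/sql/relational-databases/user-defined-functions/scalar-udf-inlining?view=azuresqldb-current
Pro tip: I would suggest making a small modification to your UDF before using Scalar UDF Inlining. Make it a single-statement scalar UDF by avoiding the local variable. With that you should be better off than using an inline TVF with CROSS APPLY.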
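Whether a particular function qualifies can be checked in the catalog rather than taken on trust. The sketch below assumes SQL Server 2019 or later, where sys.sql_modules exposes the is_inlineable and inline_type columns, and assumes the function from the question is installed as DL.trim_all:

```sql
-- is_inlineable = 1: the engine considers the UDF eligible for inlining.
-- inline_type   = 1: inlining is currently turned on for this UDF.
SELECT OBJECT_SCHEMA_NAME(m.object_id) AS udf_schema,
       OBJECT_NAME(m.object_id)        AS udf_name,
       m.is_inlineable,
       m.inline_type
FROM sys.sql_modules AS m
WHERE m.object_id = OBJECT_ID(N'DL.trim_all');
```

Inlining also depends on the database running at compatibility level 150 or higher, so an eligible function may still run as a regular scalar UDF on an older compatibility level.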
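One way to read that tip, as a sketch rather than the answerer's exact code: return the expression directly instead of assigning it back to @input first. The name and signature are kept from the question, so existing queries would not need to change.

```sql
-- Single-statement version of the UDF from the question:
-- the SET is gone and the body is one RETURN.
ALTER FUNCTION DL.trim_all(@input varchar(max))
RETURNS varchar(max)
AS
BEGIN
    RETURN replace(replace(replace(@input, ' ', ''), ')', ''), '(', '');
END
```

ALTER FUNCTION is used on the assumption that DL.trim_all already exists as created in the question; on a fresh database, CREATE FUNCTION would be needed instead.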