
Database Query Slower with View

I have a rough idea of why the view is slower: the WHERE clause is probably not applied at the same point in the query. The results do seem to be the same, though. I am not sure what I can do about this, short of not using a view, which is not ideal: I added the view to avoid code repetition, and I don't want to remove it unless that is necessary.

Any suggestions for how to change the way I am doing this, so that I can use a view as in Command 1 but still have my query execute as quickly as it does in Command 2?

declare @foo varchar(50)
set @foo = 'be%'

ALTER VIEW [dbo].[wpvw_v]
AS
select distinct [name]
from kvgs kvg left join cdes cde
on kvg.kvgi = cde.kgi
group by [name], cde.kgi, kvg.mU
having count(cde.kgi) >= 2 or kvg.mU = 1 or 
   exists (select [name] from FP x where x.name = kvg.name) 

--Command 1: Takes 7 seconds
select [name] from wpvw_v where name like @foo

--Command 2: Takes 1 second
SELECT DISTINCT kvg.name
FROM         dbo.kvgs AS kvg LEFT JOIN
                      dbo.cdes AS cde ON kvg.kvgi = cde.kgi
where name like @foo
GROUP BY kvg.name, cde.kgi, kvg.mU
HAVING      (COUNT(cde.kgi) >= 2) OR
                      (kvg.mU = 1) OR
                      EXISTS
                          (SELECT     Name
                            FROM          dbo.FP AS x
                            WHERE      (Name = kvg.name))

Your query against the view is like this:

SELECT name FROM (SELECT DISTINCT name FROM ...) WHERE name = @name;

while the second one is:

SELECT DISTINCT name FROM ... WHERE name = @name;

The two queries are very different. Even though they produce the same result, the first one can be answered only by scanning the entire table to produce the distinct names, while the second one can scan only the names you're interested in.

The gist of the problem is that the presence of DISTINCT places a barrier that does not allow the filtering predicate to move down the query tree to a place where it is effective.

Update

Even if DISTINCT is not a barrier, on second look there is an even more powerful barrier: the GROUP BY/HAVING clause. One query filters after the GROUP BY and HAVING conditions are applied, the other one before. And the HAVING condition has a subquery that references name again. I doubt the QO (query optimizer) can prove that filtering before the aggregate is equivalent to filtering after it.

I didn't think the HAVING clause could accommodate what you'd posted, but I believe your view should be written to use UNIONs instead. Here's my take on it:

ALTER VIEW [dbo].[wpvw_v] AS
WITH names AS(
  SELECT k.name
    FROM KVGS k 
   WHERE EXISTS(SELECT NULL
                  FROM CDES c
                 WHERE c.kgi = k.kvgi
              GROUP BY c.kgi
                HAVING COUNT(c.kgi) > 1)
  UNION ALL
  SELECT k.name
    FROM KVGS k 
   WHERE k.mu = 1
GROUP BY k.name
  UNION ALL
  SELECT k.name
    FROM KVGS k 
    JOIN FP x ON x.name = k.name
GROUP BY k.name)
SELECT n.name
  FROM names n

If you want to filter out duplicates between the 3 SQL statements, change UNION ALL to UNION. Then you can use:

SELECT n.name
  FROM wpvw_v n
 WHERE CHARINDEX(@name, n.name) > 0

As far as I know, the full result set of the view is collected and then further whittled down by the SELECT statement that uses it. This is very different from your second SELECT statement, which doesn't collect any more than it needs.

You could try an inline table-valued function ( http://www.sqlhacks.com/index.php/Retrieve/Parameterized-View ), but to be honest I see that as a bit of a hack.
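For reference, a minimal sketch of what such an inline table-valued function could look like for the query in the question. The function name dbo.wpfn_names and the parameter name @pattern are made up for illustration, and the body is simply Command 2 with the variable turned into a parameter. Because the LIKE filter sits inside the function body, it is part of the same query as the GROUP BY/HAVING, just as in Command 2.

```sql
-- Hypothetical function name and parameter; the body mirrors Command 2 from the question.
CREATE FUNCTION dbo.wpfn_names (@pattern varchar(50))
RETURNS TABLE
AS
RETURN
(
    SELECT DISTINCT kvg.name
    FROM dbo.kvgs AS kvg
    LEFT JOIN dbo.cdes AS cde ON kvg.kvgi = cde.kgi
    WHERE kvg.name LIKE @pattern          -- filter applied before the aggregate
    GROUP BY kvg.name, cde.kgi, kvg.mU
    HAVING COUNT(cde.kgi) >= 2
        OR kvg.mU = 1
        OR EXISTS (SELECT 1 FROM dbo.FP AS x WHERE x.name = kvg.name)
);
GO

-- Usage: the pattern is passed as an argument instead of a WHERE clause on a view.
DECLARE @foo varchar(50);
SET @foo = 'be%';
SELECT name FROM dbo.wpfn_names(@foo);
```

Whether this actually matches the 1-second timing of Command 2 would need to be checked against the real tables.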

Honestly I'd probably go for the code repetition. I don't really see SQL in the same way that I see other code: I keep on seeing vast differences in performance between otherwise logically equivalent statements.

Without seeing more (the CREATE TABLE, INDEX, and CONSTRAINT statements for each table, for example), and preferably the query plans and some sample data representative of the cardinality of the join as well, it's hard to say.
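One way to collect that information in SQL Server is sketched below. It assumes the view and variable names from the question and only switches on the standard session options for I/O, timing, and the actual plan; the same wrapper can be put around Command 2 to compare the two.

```sql
-- Report I/O and CPU/elapsed time for each statement in the messages output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Return the actual execution plan as XML alongside each result set.
SET STATISTICS XML ON;

DECLARE @foo varchar(50);
SET @foo = 'be%';

-- Command 1 from the question: the filter is applied to the view.
SELECT [name] FROM dbo.wpvw_v WHERE [name] LIKE @foo;

SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```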

Possibly, there is a semantic difference between the queries that has to do with the collation under which the LIKE expression is evaluated, and it might be impossible to coax the optimizer into the same plan.

However, there is probably plenty of room for query tuning here. It seems unlikely you need to fully aggregate all the COUNT()s. You have three rather distinct conditions under which you want to see a "name" in your result. With UNION you might be able to make one or more of them simpler to calculate, and if concurrency isn't an issue, you might even write this as a multi-step user-defined table-valued function that accumulates the names in separate steps.
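A rough sketch of the multi-step idea, under several assumptions that are not in the question: the function name dbo.wpfn_names_steps and the parameter @pattern are hypothetical, name is assumed to fit in varchar(50) and to use the database default collation, and kvgi is assumed to be unique in kvgs (so that counting cdes rows per kvgs row matches the grouped COUNT in the original query).

```sql
-- Hypothetical multi-statement function; column type and key assumptions are noted above.
CREATE FUNCTION dbo.wpfn_names_steps (@pattern varchar(50))
RETURNS @names TABLE (name varchar(50))
AS
BEGIN
    -- Step 1: names flagged with mU = 1.
    INSERT INTO @names (name)
    SELECT DISTINCT k.name
    FROM dbo.kvgs AS k
    WHERE k.mU = 1
      AND k.name LIKE @pattern;

    -- Step 2: names that also appear in FP and were not added yet.
    INSERT INTO @names (name)
    SELECT DISTINCT k.name
    FROM dbo.kvgs AS k
    WHERE k.name LIKE @pattern
      AND EXISTS (SELECT 1 FROM dbo.FP AS x WHERE x.name = k.name)
      AND NOT EXISTS (SELECT 1 FROM @names AS n WHERE n.name = k.name);

    -- Step 3: names with at least two matching cdes rows and not added yet.
    INSERT INTO @names (name)
    SELECT DISTINCT k.name
    FROM dbo.kvgs AS k
    WHERE k.name LIKE @pattern
      AND (SELECT COUNT(*) FROM dbo.cdes AS c WHERE c.kgi = k.kvgi) >= 2
      AND NOT EXISTS (SELECT 1 FROM @names AS n WHERE n.name = k.name);

    RETURN;
END;
```

Each step filters on the pattern first, so none of them has to aggregate over names outside the requested range.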

I believe the following reproduces your problem:

create table tbl (idx int identity(1,1), name varchar(50), val float)

declare @cnt int
set @cnt=0
while @cnt < 10000
begin
insert tbl select char(CAST(rand()*256 AS INT)), rand()
set @cnt = @cnt + 1
end
go
create view tbl_view as select distinct name from tbl group by name having sum(val) > 1

Then if you run the following queries:

SET STATISTICS IO ON
declare @n varchar(50)
set @n='w%'
select * from tbl_view where name like @n
SET STATISTICS IO OFF
GO
SET STATISTICS IO ON
declare @n varchar(50)
set @n='w%'
select distinct name from tbl where name like @n group by name having sum(val) > 1
SET STATISTICS IO OFF

You get the following:

(1 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tbl'. Scan count 1, logical reads 338, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)
Table 'tbl'. Scan count 1, logical reads 338, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The view forces it to work off a sub-table first and only then apply the filter. Now, if you modify the view and remove the DISTINCT, this does not change. But if you modify the view to remove the HAVING, replacing it with a WHERE condition:

create view tbl_view as select name from tbl where val > 0.8 group by name 
go
SET STATISTICS IO ON
declare @n varchar(50)
set @n='w%'
select * from tbl_view where name like @n
SET STATISTICS IO OFF
GO
SET STATISTICS IO ON
declare @n varchar(50)
set @n='w%'
select name from tbl where val > 0.8 and name like @n group by name
SET STATISTICS IO OFF

Then you get the same I/O statistics for both queries:

(1 row(s) affected)
Table 'tbl'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)
Table 'tbl'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

So it does seem like the HAVING is the barrier.

Table-valued functions tend to be quicker than views, assuming your WHERE conditions are known and can be supplied as parameters.

One advantage of table-valued functions is that you can have multiple statements, so you can convert OUTER JOINs to quicker INNER JOINs in subsequent statements. So instead of this:

-- Body of a multi-statement table-valued function (placeholder names)
INSERT INTO @resultTable (
    table1_id,
    table1_column,
    table2_column,
    table3_column
)
SELECT
    table1.id,
    table1.[column],
    table2.[column],
    table3.[column]
FROM
    table1
    INNER JOIN table2 ON table2.table1_id = table1.id
    LEFT OUTER JOIN table3 ON table3.table1_id = table1.id

RETURN

... you can do this, which I find is always faster:

-- Body of a multi-statement table-valued function (placeholder names)
INSERT INTO @resultTable (
    table1_id,
    table1_column,
    table2_column
)
SELECT
    table1.id,
    table1.[column],
    table2.[column]
FROM
    table1
    INNER JOIN table2 ON table2.table1_id = table1.id

-- table3_column stays NULL for rows without a match in table3
UPDATE result SET
    table3_column = table3.[column]
FROM @resultTable AS result
    INNER JOIN table3 ON table3.table1_id = result.table1_id

RETURN
