Why does the inclusion of an XML column in a SELECT query have such a radically negative effect on query performance?

I've been struggling with a query performance issue for a few weeks now. At this point I've squeezed absolutely everything out of the query in terms of JOIN types, indexing, keeping statistics up to date, and so on, but then I stumbled on something by accident.

A little background.

The table in question represents a Record:

Id INT PK
Name NVARCHAR(50)
Status INT FK 
Created DATETIME
Version NVARCHAR(10)
Data XML

After some performance benchmarking, I realised that including the final column in the SELECT outweighs things like indexing, join complexity and network considerations by a factor of somewhere between 10x and 20x.

The following comparisons were run from SSMS on a local dev machine connected to SQL Azure.

SELECT Id FROM Records -- ~10 secs for 300,000 rows
SELECT Id, Name, Status, Created, Version FROM Records -- ~20 sec for 300,000 rows
SELECT * FROM Records -- ~350 sec for 300,000 rows

To be clear, I'm not doing anything crazy with the XML column (no XML DML or XPath queries), just including it in or excluding it from the SELECT.

At this point, I think I've solved my problem by creating a RecordLight entity, NHibernate map and MVC controller stack, purely for the purposes of searching and listing in our app.

But I'd like to understand why the inclusion of the XML column is having such a negative effect on query performance.

One thing to consider is the size in bytes of your XML data.

If you're connecting to a remote DB server, for example, all that data needs to be downloaded to your client (even if the client is SSMS).

I've seen the same thing with blob columns that contain megabytes of data.
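To put a number on how much data the XML column adds, something like the following could be run against the table from the question. It is only a rough sketch: DATALENGTH reports the size of SQL Server's internal representation of the XML, so the bytes actually sent over the wire may differ, and the column aliases are made up for illustration.

```sql
-- Rough size of the XML column across the whole table.
-- DATALENGTH measures the stored (internal) representation,
-- so treat the result as an estimate of the transfer size.
SELECT
    COUNT(*)                              AS RecordCount,
    SUM(CAST(DATALENGTH(Data) AS BIGINT)) AS TotalXmlBytes,
    AVG(CAST(DATALENGTH(Data) AS BIGINT)) AS AvgXmlBytes,
    MAX(CAST(DATALENGTH(Data) AS BIGINT)) AS MaxXmlBytes
FROM Records;
```

If TotalXmlBytes turns out to be hundreds of megabytes or more, the extra seconds for SELECT * are probably mostly transfer time rather than query execution time.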

If you do something like:

SELECT Id, LEFT(CAST(Data AS NVARCHAR(MAX)), 10) FROM Records -- LEFT does not accept xml directly, so cast first

Does it take the same time to return the data?

Is it something to do with how the XML data is stored in the files that SQL Server uses? Would there be similar performance problems with other large data types such as BLOBs? If the actual content of the XML column, which could be very large, is spread across other files, then I can imagine it takes time for SQL Server to 'stitch' it together.
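One way to check how much of the table actually lives outside the regular data rows is to compare the in-row and large-object page counts. The query below is a sketch, not a diagnosis: it assumes the table is dbo.Records (the schema name is a guess) and that you have permission to read the sys.dm_db_partition_stats view.

```sql
-- Compare in-row storage with large-object (LOB) storage for the table.
-- Pages are 8 KB each; 'dbo.Records' is an assumed name.
SELECT
    SUM(in_row_used_page_count) * 8       AS InRowKB,
    SUM(lob_used_page_count) * 8          AS LobKB,
    SUM(row_overflow_used_page_count) * 8 AS RowOverflowKB
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('dbo.Records')
  AND index_id IN (0, 1);  -- heap (0) or clustered index (1) only
```

A LobKB figure far larger than InRowKB would suggest that most of the XML is stored in separate LOB pages, which the server has to read in addition to the row data whenever the column is selected.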
