
Does the SQL Server 2008 XML data type have performance issues?

Hi, I need to store hundreds, if not thousands, of elements in the database as XML. I will not index anything in the XML field; I will simply select certain elements within the XML. I would like to know whether there is any performance penalty for simply selecting fields in the XML. Here is an example of the XML that will be stored in the database:

<fields>
    <field name="FirstName" type="text" value="Gary" sort="2" />
    <field name="LastName" type="text" value="Smith" sort="3" />
    <field name="City" type="text" value="Los Angeles" sort="4" />
    <field name="Age" type="number" value="12" sort="6" />
    <field name="Address" type="text" sort="2">
        <streetnumber value="1234" />
        <streetname value="sail" />
    </field>
</fields>

I will probably have more than 3000 field tags in one record, and I simply want to get 10 fields in a single query. The table will have a primary key, and I will select records by that key while getting the fields from the XML column. I am afraid that the more field elements I put in the XML, the worse performance will get. Will there be a performance penalty for simply selecting 10 or more fields from the XML column? Also, I will not use the XML column in the WHERE clause; I will filter on the primary key there and then select fields from the XML column. Will there be a performance penalty?
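
For concreteness, the access pattern described above might look like the following sketch. The table dbo.Records, its columns Id and Data, and the key value 42 are hypothetical names chosen for illustration; only the XPath expressions are taken from the sample XML.

```sql
-- Hypothetical table (not from the original post):
--   dbo.Records (Id INT PRIMARY KEY CLUSTERED, Data XML)
DECLARE @Id INT = 42;  -- example key value

SELECT
    r.Id,
    -- value() requires a singleton expression, hence the (...)[1] wrapper
    r.Data.value('(/fields/field[@name="FirstName"]/@value)[1]', 'nvarchar(100)') AS FirstName,
    r.Data.value('(/fields/field[@name="LastName"]/@value)[1]',  'nvarchar(100)') AS LastName,
    r.Data.value('(/fields/field[@name="City"]/@value)[1]',      'nvarchar(100)') AS City,
    r.Data.value('(/fields/field[@name="Age"]/@value)[1]',       'int')           AS Age,
    -- nested element under the Address field
    r.Data.value('(/fields/field[@name="Address"]/streetname/@value)[1]', 'nvarchar(100)') AS StreetName
FROM dbo.Records AS r
WHERE r.Id = @Id;      -- the WHERE clause touches only the primary key
```

Each value() call is evaluated against the XML of the single row found through the primary key, so the cost in question is the per-row XML processing, not the row lookup.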

Based on my experience with the xml data type in SQL Server, and on Indexes on XML Data Type Columns (the whole section deserves thorough reading):

Will there be a performance penalty for simply selecting 10 or more fields from the XML column?

Yes, because your XML document is stored as a blob. Without a primary XML index, this blob has to be exploded (shredded) at query time for processing, both for filtering and for projection. An XML index can be seen as a relational representation of your document, that is, the blob exploded in advance.

Without an index, these binary large objects are shredded at run time to evaluate a query. This shredding can be time-consuming.
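
As a minimal sketch of what "pre-exploding" looks like in practice, a primary XML index can be created as below. The table dbo.Records, the column Data, and the index name are hypothetical; the statement itself is the standard CREATE PRIMARY XML INDEX syntax.

```sql
-- Assumes a hypothetical table dbo.Records (Id INT PRIMARY KEY CLUSTERED, Data XML).
-- A primary XML index requires a clustered primary key on the base table.
CREATE PRIMARY XML INDEX PXML_Records_Data
ON dbo.Records (Data);
```

The index stores a shredded, row-per-node representation of every document, so it can take considerably more space than the XML itself and adds work on insert and update. Whether that trade is worth it for a read-by-primary-key workload is something to measure rather than assume.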

As to your second question:

Also, I will not be using the xml column in a where clause I will use the primary in the where clause then I will select fields from the XML column. Will there be a performance penalty?

If you are going to project a few values out of 3000 field tags, you might benefit from a secondary XML index, though I'm not sure which one. The PROPERTY secondary index seems fit for projection, but it seems to apply only to value() calls (the French documentation seems to imply more than just value() calls, but that may be a translation mistake).

For my part, I ended up setting all three kinds of secondary indexes on my XML column (1 million documents across 30 different schemas, 50-100 elements each). But my app requires a lot more filtering than projection.
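
For reference, the three kinds of secondary XML index (PATH, VALUE and PROPERTY) are created roughly as follows. This is a sketch under assumptions: the table dbo.Records, the column Data, and all index names are hypothetical, and a primary XML index named PXML_Records_Data is assumed to exist already, since secondary XML indexes are built on top of one.

```sql
-- Assumes dbo.Records (Id INT PRIMARY KEY CLUSTERED, Data XML) and an existing
-- primary XML index PXML_Records_Data on dbo.Records (Data). All names are hypothetical.

-- PATH: aimed at path-based lookups, such as exist() predicates in a WHERE clause
CREATE XML INDEX IXML_Records_Data_Path
ON dbo.Records (Data)
USING XML INDEX PXML_Records_Data FOR PATH;

-- VALUE: aimed at value-based lookups where the path is not fully specified
CREATE XML INDEX IXML_Records_Data_Value
ON dbo.Records (Data)
USING XML INDEX PXML_Records_Data FOR VALUE;

-- PROPERTY: aimed at retrieving values from individual rows with value(),
-- when the row is located through the primary key
CREATE XML INDEX IXML_Records_Data_Property
ON dbo.Records (Data)
USING XML INDEX PXML_Records_Data FOR PROPERTY;
```

Each secondary index adds storage and write overhead, so for a workload that only projects values by primary key it may be worth testing PROPERTY alone before adding the others.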

[BEGIN EDIT]
jbl's direct answers to your questions, and Terror.Blade's answer about XML being better than NVARCHAR(MAX), both make sense (I upvoted them :).

My experience was without storing an XML schema in SQL Server (Terror.Blade's tip) and without indexing (jbl covered that best)... but I'm leaving my answer, because I think my links could be very helpful... and it's still an example of the worst case ;)
[END EDIT]

From experience, I'd say that loading data into an XML column is quick, but using it I found to be slow. The personal example that comes to mind involved updating and XQuery, and those may have been factors in my slowdown. In that example, it took 1 hr 55 min to process only 127,861 rows. (Terror.Blade's tip of storing an XML schema in SQL Server, and jbl's link and notes on XML indexing, both sound pretty slick ;) and might address that slowdown.)

RELATED: Here are some tips on optimizing XML in SQL Server... though some of them only apply if you have control over the format of the XML:
http://msdn.microsoft.com/en-us/library/ms345118.aspx

If you're using XQuery, check out these docs:
http://download.microsoft.com/download/0/F/B/0FBFAA46-2BFD-478F-8E56-7BF3C672DF9D/XQuery%20Language%20Reference.pdf

((And if you're using SQLXMLBulkLoad at all, consider using "overflow-field"s to capture whatever is not defined in your schema. There are some useful tips in this tangentially related TechNet thread:
http://social.technet.microsoft.com/Forums/sqlserver/en-US/393cf604-bf6e-488b-a1ea-2e984aa14500/how-do-i-redirect-xml-comments-that-sqlxmlbulkload-is-storing-in-the-overflowfield?forum=sqlxml ))

HTH.

I realize that this is not a direct answer to the OP's question (although it's related), but I think it could really help the many people who have been redirected here looking for ideas on how to deal with the poor performance of the XML data type in SQL Server. After many years of struggling with this issue, I finally found a solution that, for some reason, is not that easy to come by:

SXI - Selective XML Indexes (introduced in SQL Server 2012 SP1, so not available on SQL Server 2008)

MS Docs link: https://docs.microsoft.com/en-us/sql/relational-databases/xml/selective-xml-indexes-sxi?view=sql-server-2017

In my local tests with tables containing 3MM+ records, it worked amazingly well!
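
A selective XML index for the sample document in the question might be sketched like this. The table dbo.Records, the column Data, the index name, the path names, and the lengths in the type hints are all assumptions made for illustration. Which paths and hints actually help depends on the queries being run, so treat this as a starting point to test, not a recommendation.

```sql
-- Assumes a hypothetical table dbo.Records (Id INT PRIMARY KEY CLUSTERED, Data XML).
-- Only the listed paths are indexed; the rest of each document is left as-is.
CREATE SELECTIVE XML INDEX SXI_Records_Data
ON dbo.Records (Data)
FOR
(
    -- used by XQuery predicates such as field[@name="FirstName"]
    pathFieldName  = '/fields/field/@name'  AS XQUERY 'xs:string' MAXLENGTH(100),
    -- retrieved through value(), so it is mapped to a SQL type
    pathFieldValue = '/fields/field/@value' AS SQL NVARCHAR(400)
);
```

Because only the promoted paths are stored, the index is usually much smaller than a primary XML index over documents with thousands of nodes, which is the appeal for a case like the 3000-field record described in the question.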

