SQL Server XQuery nodes() / value() performance
I have a table with 25,000 rows:
Table Audit (Id int identity(1,1), AdditionalInfo xml)
The sample data in the AdditionalInfo column for one row looks like this:
<Audit version="1">
<Context name="Event">
<Action name="OrganizationEventReceived">
<Input>
<Source type="SourceOrganizationId">77d2678b-ea4a-43ad-816b-c63edf206b08</Source>
<Target type="TargetOrganizationId">b98fd3ae-dbcb-4826-9d92-7e445ad61273,b98fd3ae-dbcb-4826-9d92-7e445ad61273,b98fd3ae-dbcb-4826-9d92-7e445ad61273</Target>
</Input>
</Action>
</Context>
</Audit>
I would like to shred the XML and collect the data in an output dataset with the following query.
SELECT Id,
p.value('(@name)[1]', 'nvarchar (100)') AS TargetAction,
p.value('(Input/Source/text())[1]', 'nvarchar (500)') AS Source,
p.value('(Input/Target/text())[1]', 'nvarchar (max)') AS Target
FROM dbo.Audit CROSS APPLY AdditionalInfo.nodes('/Audit/Context/Action') AS AdditionalInfo(p)
The performance of the query is bad. It takes 15 seconds to return the result set for just 25,000 rows. Is there a better way of doing this? I even tried putting primary and secondary XML indexes on the AdditionalInfo column. Please help and let me know how to use better SQL Server XQuery techniques.
Thanks,
Great question.
My recent task required parsing about 35,000 XML documents, with a valid document being ~20 kB.
More and larger XML files tend to fill memory exponentially.
Try to distribute your work:
- The variable target stores unstructured data, which eats most of the computing power because of its data type and the varying length of its values.
- Node depth in CROSS APPLY matters: avoid three-level paths in nodes(); consider two levels and recursion instead.
- Batch mode: process several documents at once, e.g. WHERE id IN (1,2,3).
- Loop over the list of documents (the answer suggests FOR; in T-SQL this would be a WHILE loop or a cursor).
- Parse through a local variable, e.g. DECLARE @xml_doc XML; SET @xml_doc = (SELECT xmldata FROM xmlsource WHERE id = 1);
- Parse all elements separately: use the ROW_NUMBER() function to preserve element order, then LEFT JOIN all parts to the list of XML documents using some identifier, such as xml_id.
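A few of these suggestions can be sketched against the table from the question. This is only an illustration under stated assumptions, not a measured fix: dbo.Audit and its columns come from the question, while the temp table #Shredded, the column ActionNo, and the Id list (1, 2, 3) are hypothetical names chosen for the example. Whether it is actually faster should be checked against the real data.

```sql
-- Sketch only. dbo.Audit, Id and AdditionalInfo come from the question;
-- #Shredded, ActionNo and the Id list (1, 2, 3) are hypothetical.

-- 1) Parse a single document through a local XML variable.
DECLARE @xml_doc xml;
SET @xml_doc = (SELECT AdditionalInfo FROM dbo.Audit WHERE Id = 1);

SELECT
    a.p.value('(@name)[1]', 'nvarchar(100)')               AS TargetAction,
    a.p.value('(Input/Source/text())[1]', 'nvarchar(500)') AS Source,
    a.p.value('(Input/Target/text())[1]', 'nvarchar(max)') AS Target
FROM @xml_doc.nodes('/Audit/Context/Action') AS a(p);

-- 2) Batch mode: shred a few documents at a time into a temp table,
--    numbering the Action elements within each document.
CREATE TABLE #Shredded (
    Id           int           NOT NULL,
    ActionNo     int           NOT NULL,
    TargetAction nvarchar(100) NULL,
    Source       nvarchar(500) NULL,
    Target       nvarchar(max) NULL
);

INSERT INTO #Shredded (Id, ActionNo, TargetAction, Source, Target)
SELECT
    au.Id,
    -- ORDER BY (SELECT NULL) yields an arbitrary numbering; it is not a
    -- documented guarantee of document order.
    ROW_NUMBER() OVER (PARTITION BY au.Id ORDER BY (SELECT NULL)),
    a.p.value('(@name)[1]', 'nvarchar(100)'),
    a.p.value('(Input/Source/text())[1]', 'nvarchar(500)'),
    a.p.value('(Input/Target/text())[1]', 'nvarchar(max)')
FROM dbo.Audit AS au
CROSS APPLY au.AdditionalInfo.nodes('/Audit/Context/Action') AS a(p)
WHERE au.Id IN (1, 2, 3);

-- 3) LEFT JOIN the shredded parts back to the document list by Id.
SELECT au.Id, s.ActionNo, s.TargetAction, s.Source, s.Target
FROM dbo.Audit AS au
LEFT JOIN #Shredded AS s
    ON s.Id = au.Id
WHERE au.Id IN (1, 2, 3)
ORDER BY au.Id, s.ActionNo;

DROP TABLE #Shredded;
```

Step 2 can be repeated over successive ranges of Id values from a WHILE loop, so that only a small batch of documents is shredded at a time.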