Optimizing multiple selects from the same table
I want to optimize a query that is consumed by a report. Unfortunately, I cannot modify the report, so I have to provide a specifically formatted dataset.
Say I have a table that looks like this (in practice, it has 25 columns and 20k rows):
Name    Description       Price  MiscColumn1  MiscColumn2
Tea     test description  10     misc1        misc2
Coffee  test desc         20     misc3        misc4
Water   test              20     misc1        misc2
I need to transform this dataset to look like this:
Type  Name    Description       Price  MiscColumn1  MiscColumn2
1     Tea     test description  NULL   NULL         NULL
1     Coffee  test desc         NULL   NULL         NULL
1     Water   test              NULL   NULL         NULL
2     NULL    NULL              10     NULL         NULL
2     NULL    NULL              20     NULL         NULL
3     NULL    NULL              NULL   misc1        misc2
3     NULL    NULL              NULL   misc3        misc4
Basically, I need to select three groups of distinct records back into one dataset.
What I currently do is:
Create #tempTable
And then do three separate distinct selects like this:
insert into #tempTable (Name, Description)
select distinct Name, Description from myTable
insert into #tempTable (Price)
select distinct Price from myTable
But it is really slow: it can take up to 5 seconds to complete with my data.
I also tried UNION, but I didn't gain any performance improvement.
You can do this in a single statement, which should involve a single scan, like this:
SELECT DISTINCT
X.*
FROM
dbo.MyTable T
CROSS APPLY (VALUES
(1, T.Name, T.Description, NULL, NULL, NULL),
(2, NULL, NULL, T.Price, NULL, NULL),
(3, NULL, NULL, NULL, T.MiscColumn1, T.MiscColumn2)
) X (Type, Name, Description, Price, MiscColumn1, MiscColumn2)
;
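To try the query against the sample rows from the question, here is a minimal self-contained sketch. The temp table #MyTable and its column types are assumptions standing in for the real dbo.MyTable, and the ORDER BY is added only to make the output easier to read.

```sql
-- Hypothetical stand-in for dbo.MyTable; column types are assumed.
CREATE TABLE #MyTable (
    Name        varchar(50),
    Description varchar(100),
    Price       int,
    MiscColumn1 varchar(50),
    MiscColumn2 varchar(50)
);

-- Sample rows copied from the question.
INSERT INTO #MyTable (Name, Description, Price, MiscColumn1, MiscColumn2)
VALUES
    ('Tea',    'test description', 10, 'misc1', 'misc2'),
    ('Coffee', 'test desc',        20, 'misc3', 'misc4'),
    ('Water',  'test',             20, 'misc1', 'misc2');

-- Each source row is expanded into three rows (one per Type),
-- then DISTINCT collapses the duplicates within each Type.
SELECT DISTINCT X.*
FROM #MyTable T
CROSS APPLY (VALUES
    (1, T.Name, T.Description, NULL, NULL, NULL),
    (2, NULL, NULL, T.Price, NULL, NULL),
    (3, NULL, NULL, NULL, T.MiscColumn1, T.MiscColumn2)
) X (Type, Name, Description, Price, MiscColumn1, MiscColumn2)
ORDER BY X.Type;

DROP TABLE #MyTable;
```

With these three rows the result should have seven rows: three of Type 1, two of Type 2 (the duplicate price 20 collapses), and two of Type 3, which matches the target dataset in the question.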
Note that you don't need a temporary table: you can do your 15 joins and then, in the CROSS APPLY, simply refer to the table that each column comes from.
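As a rough illustration of referring to the joined tables inside the CROSS APPLY, the sketch below uses three hypothetical temp tables (#Product, #Pricing, #Misc) and a made-up ProductId join key in place of the real 15-table query. None of these names come from the question.

```sql
-- Hypothetical source tables; the real query joins about 15 of them.
CREATE TABLE #Product (ProductId int, Name varchar(50), Description varchar(100));
CREATE TABLE #Pricing (ProductId int, Price int);
CREATE TABLE #Misc    (ProductId int, MiscColumn1 varchar(50), MiscColumn2 varchar(50));

-- Do the joins once, then pull each column from whichever alias owns it.
SELECT DISTINCT X.*
FROM #Product P
INNER JOIN #Pricing R ON R.ProductId = P.ProductId
INNER JOIN #Misc    M ON M.ProductId = P.ProductId
CROSS APPLY (VALUES
    (1, P.Name, P.Description, NULL, NULL, NULL),
    (2, NULL, NULL, R.Price, NULL, NULL),
    (3, NULL, NULL, NULL, M.MiscColumn1, M.MiscColumn2)
) X (Type, Name, Description, Price, MiscColumn1, MiscColumn2);

DROP TABLE #Product, #Pricing, #Misc;
```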
That brings up a point. Your data is coming from 15 tables! If any of the Type groupings of values come from a distinct subset of tables, then this is probably not the best way to do it. Let's say, for example, that MiscColumn1 and MiscColumn2 come from 2 tables that have no columns represented in another group. In that case, it will be much better to remove those 2 tables from the main query, and UNION ALL SELECT just the 2 columns from those tables separately.
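A sketch of that split might look like the following. The tables #Product, #Pricing, #MiscA and #MiscB and their key columns are hypothetical. The explicit CASTs on the NULL placeholders are a defensive choice so that both branches of the UNION ALL agree on column types; depending on your schema some of them may be unnecessary.

```sql
-- Hypothetical tables standing in for the real schema.
CREATE TABLE #Product (ProductId int, Name varchar(50), Description varchar(100));
CREATE TABLE #Pricing (ProductId int, Price int);
CREATE TABLE #MiscA   (MiscId int, MiscColumn1 varchar(50));
CREATE TABLE #MiscB   (MiscId int, MiscColumn2 varchar(50));

-- Types 1 and 2: the main query, with the two Misc tables removed.
SELECT DISTINCT X.*
FROM #Product P
INNER JOIN #Pricing R ON R.ProductId = P.ProductId
CROSS APPLY (VALUES
    (1, P.Name, P.Description, NULL,
        CAST(NULL AS varchar(50)), CAST(NULL AS varchar(50))),
    (2, NULL, NULL, R.Price, NULL, NULL)
) X (Type, Name, Description, Price, MiscColumn1, MiscColumn2)

UNION ALL

-- Type 3: only the two Misc tables are read here.
SELECT DISTINCT
    3,
    CAST(NULL AS varchar(50)),
    CAST(NULL AS varchar(100)),
    CAST(NULL AS int),
    A.MiscColumn1,
    B.MiscColumn2
FROM #MiscA A
INNER JOIN #MiscB B ON B.MiscId = A.MiscId;

DROP TABLE #Product, #Pricing, #MiscA, #MiscB;
```

UNION ALL is enough here because the Type value differs between the branches, so a row from one branch can never duplicate a row from the other.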
I'm saying this based on the possibly mistaken impression that your reporting platform is going to do its own joining of various related data. If so, then you shouldn't try to put together the unified view of all the data and then break it back down again; that puts extra work on the system for no reason. The need for the DISTINCT in the above query highlights the extra memory, I/O, and CPU that will be required to materialize the trimmed-down result set you need. If there's any way to get around that, I think you should do it.