Optimizing multiple selects from the same table

I want to optimize the query that is consumed by a report. Unfortunately, I cannot modify the report, so I have to provide a specifically formatted dataset.

So, let's say I have a table that looks like this (in practice, it has 25 columns and 20k rows):

Name      Description         Price    MiscColumn1    MiscColumn2
Tea       test description    10       misc1          misc2
Coffee    test desc           20       misc3          misc4
Water     test                20       misc1          misc2

So, I need to transform this dataset to look like this:

Type    Name      Description         Price    MiscColumn1    MiscColumn2
1       Tea       test description    NULL     NULL           NULL
1       Coffee    test desc           NULL     NULL           NULL
1       Water     test                NULL     NULL           NULL
2       NULL      NULL                10       NULL           NULL
2       NULL      NULL                20       NULL           NULL
3       NULL      NULL                NULL     misc1          misc2
3       NULL      NULL                NULL     misc3          misc4

So, basically what I need to do is to select 3 groups of distinct records back into the dataset.

What I currently do is:

Create #tempTable  

And then do 3 separate distinct selects like this:

insert into #tempTable (Name, Description)
select distinct Name, Description from myTable

insert into #tempTable (Price)
select distinct Price from myTable

But it is really slow and can take up to 5 seconds to complete with my data.

Also, I tried using UNION, but I didn't gain any performance improvement.

You can do this in a single statement, which should involve a single scan, like this:

SELECT DISTINCT
   X.*
FROM
   dbo.MyTable T
   CROSS APPLY (VALUES
      (1, T.Name, T.Description, NULL, NULL, NULL),
      (2, NULL, NULL, T.Price, NULL, NULL),
      (3, NULL, NULL, NULL, T.MiscColumn1, T.MiscColumn2)
   ) X (Type, Name, Description, Price, MiscColumn1, MiscColumn2)
;

See a Live Demo at SQL Fiddle
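If you want to try the query without the fiddle, here is a minimal sketch that loads the three sample rows from the question into a table variable and runs the same statement against it. The column types are my guesses, since the question doesn't give them, and the ORDER BY is only there to make the result easier to compare with the desired layout.

```sql
-- Sample data from the question; the column types are assumptions.
DECLARE @MyTable TABLE (
   Name        varchar(50),
   Description varchar(100),
   Price       int,
   MiscColumn1 varchar(50),
   MiscColumn2 varchar(50)
);

INSERT INTO @MyTable (Name, Description, Price, MiscColumn1, MiscColumn2)
VALUES
   ('Tea',    'test description', 10, 'misc1', 'misc2'),
   ('Coffee', 'test desc',        20, 'misc3', 'misc4'),
   ('Water',  'test',             20, 'misc1', 'misc2');

-- Each source row is expanded into three rows (one per Type),
-- and DISTINCT then collapses the duplicates within each Type.
SELECT DISTINCT
   X.*
FROM
   @MyTable T
   CROSS APPLY (VALUES
      (1, T.Name, T.Description, NULL, NULL, NULL),
      (2, NULL, NULL, T.Price, NULL, NULL),
      (3, NULL, NULL, NULL, T.MiscColumn1, T.MiscColumn2)
   ) X (Type, Name, Description, Price, MiscColumn1, MiscColumn2)
ORDER BY
   X.Type
;
```

With this sample data the result has 7 rows: three for Type 1, two for Type 2 (prices 10 and 20), and two for Type 3.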

Note that you don't need a temporary table: you can do your 15 joins and then, in the CROSS APPLY, simply refer to the table that each column comes from.
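As a rough sketch of what that could look like, here is the same pattern applied on top of joins. I don't know your real schema, so dbo.Product, dbo.ProductPrice, dbo.ProductMisc and the ProductID join column are made-up names, and only three tables are shown in place of your 15.

```sql
-- Hypothetical tables and join keys; substitute your own joins.
SELECT DISTINCT
   X.*
FROM
   dbo.Product P
   INNER JOIN dbo.ProductPrice PP ON PP.ProductID = P.ProductID
   INNER JOIN dbo.ProductMisc PM ON PM.ProductID = P.ProductID
   -- Each VALUES row pulls its columns from whichever joined table owns them.
   CROSS APPLY (VALUES
      (1, P.Name, P.Description, NULL, NULL, NULL),
      (2, NULL, NULL, PP.Price, NULL, NULL),
      (3, NULL, NULL, NULL, PM.MiscColumn1, PM.MiscColumn2)
   ) X (Type, Name, Description, Price, MiscColumn1, MiscColumn2)
;
```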

That brings up a point. Your data is coming from 15 tables! If any of the Type groupings of values come from a distinct subset of tables, then this is probably not the best way to do it! Let's say, for example, that MiscColumn1 and MiscColumn2 come from 2 tables that have no columns represented in another group. In that case, it will be much better to remove those 2 tables from the main query, and UNION ALL SELECT just the 2 columns from those tables separately.
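To make that concrete, here is a sketch of the split version. Again, every table name and join column here (dbo.Product, dbo.ProductPrice, dbo.MiscA, dbo.MiscB, ProductID, MiscAID) is invented for illustration, and the varchar(50) in the casts is an assumed type for the two misc columns.

```sql
-- Types 1 and 2 come from the main joins only (hypothetical tables).
SELECT DISTINCT
   X.Type, X.Name, X.Description, X.Price,
   CAST(NULL AS varchar(50)) AS MiscColumn1,   -- assumed type
   CAST(NULL AS varchar(50)) AS MiscColumn2    -- assumed type
FROM
   dbo.Product P
   INNER JOIN dbo.ProductPrice PP ON PP.ProductID = P.ProductID
   CROSS APPLY (VALUES
      (1, P.Name, P.Description, NULL),
      (2, NULL, NULL, PP.Price)
   ) X (Type, Name, Description, Price)

UNION ALL

-- Type 3 reads only the two tables that own the misc columns.
SELECT DISTINCT
   3, NULL, NULL, NULL, A.MiscColumn1, B.MiscColumn2
FROM
   dbo.MiscA A
   INNER JOIN dbo.MiscB B ON B.MiscAID = A.MiscAID
;
```

UNION ALL is enough between the two halves because the Type value already keeps their rows from colliding, so the only duplicate removal needed is the DISTINCT inside each half. The explicit casts give the all-NULL misc columns in the first half a character type to match the second half.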

I'm saying this based on the possibly mistaken impression I am getting that your reporting platform is going to do its own joining of various related data. If so, then you shouldn't try to put together the unified view of all the data, then break it back down again; that puts extra work on the system for no reason. The need for the DISTINCT in the above query highlights the extra memory, I/O, and CPU that will be required to materialize the trimmed-down result set you need. If there's any way to get around that, I think you should do it.
