
Conditional aggregate database queries and their performance implications

I think this question is best asked with an example: if you want two counts from a table, say one of all the rows with a bit flag set to false and another of all the rows with it set to true, is there a best practice for this kind of query, and what are the performance implications of the approaches that could be taken?

To expand a little, and based on the article below, how would separate queries compare, from a performance point of view, to the version with the CASE evaluation in the SELECT list? Are there other methods?

http://www.codeproject.com/Articles/310674/Conditional-Sums-in-SQL-Aggregate-Methods

SELECT [bitCol], count(*)
  FROM [table]
 GROUP BY [bitCol]

If that column is indexed, it is an index scan followed by a stream aggregate.
I doubt you can do better than that.
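For anyone who wants to try the grouped count without a full database server, here is a minimal sketch using Python's built-in sqlite3 module. The table name `t`, the index name `ix_t_bitcol`, and the sample data are made up for illustration, and SQLite is only a stand-in: its planner reports plans in its own wording, so the printed plan will not read "index scan" and "stream aggregate" the way the plan described above does.

```python
import sqlite3

# Hypothetical table, index, and data; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, bitcol INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO t (bitcol) VALUES (?)",
    [(int(i % 3 == 0),) for i in range(1000)],  # every third row is flagged true
)
conn.execute("CREATE INDEX ix_t_bitcol ON t (bitcol)")

query = "SELECT bitcol, COUNT(*) FROM t GROUP BY bitcol"

# One result row per distinct flag value, collected into {flag: count}.
group_counts = dict(conn.execute(query))
print(group_counts)

# Ask SQLite how it intends to run the query; the wording is engine-specific.
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row)
```

Note that the grouped form returns one row per value, so the caller has to pivot the rows if the counts are wanted side by side.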

Other than Blam's way, I think there are three basic ways to get the desired result. I tested the three options below, as well as Blam's, on my system, and the results were as follows. As a side note, we didn't have any bit data in our system, so I counted an indexed column with two values ("H" or "R").

The Conditional Aggregates method was the fastest. Blam's Grouping with an Aggregate method was second fastest, consistently taking about 33% longer than the Conditional Aggregates. The Two Separate Select Statements method was third, consistently taking close to 50% longer than the Conditional Aggregates. Finally, the Joins method took the longest, and was close to 1000% slower than the Conditional Aggregates. I expected the joins to take the longest, as you're joining to that table multiple times. I included this method because it was not discussed (possibly for obvious reasons) in the question; performance aside, it is a viable, if extremely slow, option. The result for the two separate select statements also makes sense, as you're running two separate aggregates and accessing the table two separate times.
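If you want to repeat this kind of comparison yourself, a rough harness might look like the sketch below. It is not the setup I tested on: it uses Python's built-in sqlite3 and timeit modules with an in-memory table, and the table name `t`, the row count, and the helper function names are all assumptions made for illustration. The join variant here also assumes a unique `id` column and puts the flag test in each ON clause so that each join matches only one flag value. Expect the ratios to differ by engine, data size, and indexing.

```python
import sqlite3
import timeit

def build(n=20000):
    """Create a hypothetical table t with n rows, half flagged 1 and half 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, bitcol INTEGER NOT NULL)")
    conn.executemany("INSERT INTO t (id, bitcol) VALUES (?, ?)",
                     ((i, i % 2) for i in range(1, n + 1)))
    conn.execute("CREATE INDEX ix_t_bitcol ON t (bitcol)")
    return conn

def conditional(conn):
    # One pass over the table, both counts in one row.
    return conn.execute(
        "SELECT SUM(CASE WHEN bitcol = 1 THEN 1 ELSE 0 END),"
        "       SUM(CASE WHEN bitcol = 0 THEN 1 ELSE 0 END) FROM t").fetchone()

def grouped(conn):
    # One row per flag value, pivoted in Python into (true, false).
    counts = dict(conn.execute("SELECT bitcol, COUNT(*) FROM t GROUP BY bitcol"))
    return (counts.get(1, 0), counts.get(0, 0))

def separate(conn):
    # Two independent statements, each touching the table once.
    true_count = conn.execute("SELECT COUNT(*) FROM t WHERE bitcol = 1").fetchone()[0]
    false_count = conn.execute("SELECT COUNT(*) FROM t WHERE bitcol = 0").fetchone()[0]
    return (true_count, false_count)

def joined(conn):
    # Self-joins on the unique id; COUNT skips the NULLs from unmatched rows.
    return conn.execute(
        "SELECT COUNT(t2.bitcol), COUNT(t3.bitcol) FROM t AS t1"
        " LEFT JOIN t AS t2 ON t1.id = t2.id AND t2.bitcol = 1"
        " LEFT JOIN t AS t3 ON t1.id = t3.id AND t3.bitcol = 0").fetchone()

conn = build()
methods = {"conditional": conditional, "grouped": grouped,
           "separate": separate, "joined": joined}

# Every method should agree on (true_count, false_count) before timing matters.
results = {name: fn(conn) for name, fn in methods.items()}

# Best of three runs of five executions each, in seconds.
timings = {name: min(timeit.repeat(lambda fn=fn: fn(conn), number=5, repeat=3))
           for name, fn in methods.items()}

for name in methods:
    print(f"{name:12s} {results[name]}  {timings[name]:.4f}s")
```

Checking that all four methods return the same pair before looking at the timings guards against comparing queries that do not actually compute the same thing.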

I'm not sure what accounts for the difference between the conditional aggregate method and Blam's method. I've always been pleasantly surprised by the speed and performance of CASE statements, and today was no different.

I think the CASE statement method, aside from the performance considerations, is possibly the most versatile method. It allows you to work with just about any type of field and makes it easy to select a subset of values, whereas Blam's Grouping with an Aggregate method would show all possible column values unless a WHERE clause were included.
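To make the subset point concrete, here is a small sketch using Python's built-in sqlite3 module. The `orders` table, the `status` column, and the extra "X" value are hypothetical; they only exist to show what happens when the column holds more values than the ones you want to count.

```python
import sqlite3

# Hypothetical table with a text column holding three distinct values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
rows = [("H",)] * 5 + [("R",)] * 3 + [("X",)] * 2
conn.executemany("INSERT INTO orders (status) VALUES (?)", rows)

# Conditional aggregates: only the values named in a CASE are counted,
# and the counts come back side by side in a single row.
subset = conn.execute(
    "SELECT SUM(CASE WHEN status = 'H' THEN 1 ELSE 0 END) AS h_count,"
    "       SUM(CASE WHEN status = 'R' THEN 1 ELSE 0 END) AS r_count"
    "  FROM orders").fetchone()

# Plain GROUP BY: every distinct value gets a row, including 'X'.
all_groups = conn.execute(
    "SELECT status, COUNT(*) FROM orders GROUP BY status ORDER BY status").fetchall()

# GROUP BY restricted with a WHERE clause to the same subset.
filtered = conn.execute(
    "SELECT status, COUNT(*) FROM orders WHERE status IN ('H', 'R')"
    " GROUP BY status ORDER BY status").fetchall()

print(subset)      # one row, one column per counted value
print(all_groups)  # one row per distinct value
print(filtered)    # one row per value allowed by the WHERE clause
```

The two forms also differ in shape: the conditional version always returns one row with a fixed set of columns, while the grouped version returns a variable number of rows.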

Conditional Aggregates

-- [Table] is a placeholder name; it is bracketed because TABLE is a reserved word.
Select SUM(Case When bitcol = 1 Then 1 Else 0 End) as True_Count
    , SUM(Case When bitcol = 0 Then 1 Else 0 End) as False_Count
From [Table];

Two separate select statements

-- [Table] is a placeholder name; it is bracketed because TABLE is a reserved word.
Select Count(1) as True_Count
From [Table]
Where bitcol = 1;

Select Count(1) as False_Count
From [Table]
Where bitcol = 0;

Using Joins

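-- Assumes ID uniquely identifies a row. Each join keeps only one flag value,
-- and Count() skips the NULLs produced for unmatched rows.
Select Count(T2.bitcol) as True_Count
    , Count(T3.bitcol) as False_Count
From [Table] T1
Left Outer Join [Table] T2
    on T1.ID = T2.ID and T2.bitcol = 1
Left Outer Join [Table] T3
    on T1.ID = T3.ID and T3.bitcol = 0;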
