简体   繁体   English

在 t-Sql 查询中使用计数和 SUM 透视/条件聚合数据

[英]Pivot/Conditional aggregation data with count and SUM In t-Sql Query

Let's say we have data as below假设我们有如下数据

EmpId   Stages  Amount
101     Stg1    10.2
101     Stg2    10.22
101     Stg1    10.11
101     Stg3    6.21
101     Stg2    3.22
102     Stg1    3.23
102     Stg2    2.22
102     Stg3    1.22
102     Stg3    3.22

And the result required: Each employee wise stages count and stage wise sum amount as below所需的结果:每个员工的明智阶段计数和阶段总和金额如下

EmpId   Stg1-Count  Amount-stg1 Stg2-Count  Amount-stg2 Stg3-Count  Amount-stg3
101         2       20.31           2           13.44       1           6.21
102         1       3.23            1           2.22        2           4.44

Here is an approach that uses the straightforward syntax of conditional aggregations.这是一种使用条件聚合的简单语法的方法。

/* Set up the sample data */
DROP TABLE IF EXISTS #RawData;
CREATE TABLE #RawData
 (
     EmpId INT,
     Stages NVARCHAR(5),
     Amount DECIMAL(9, 4)
 );
INSERT INTO #RawData
VALUES (101, N'Stg1', 10.2),
       (101, N'Stg2', 10.22),
       (101, N'Stg1', 10.11),
       (101, N'Stg3', 6.21),
       (101, N'Stg2', 3.22),
       (102, N'Stg1', 3.23),
       (102, N'Stg2', 2.22),
       (102, N'Stg3', 1.22),
       (102, N'Stg3', 3.22),
       (103, N'Stg1', 1.55);

I added a row to ensure we're handling incomplete and in-process stage progressions.我添加了一行以确保我们处理不完整和正在进行的阶段进展。

This assumes that there are a fixed number of "stages" or categories that will be counted and summed.这假设有固定数量的“阶段”或类别将被计算和汇总。 If there are a large or changing number of values, this can become tedious to set up, but the code is pretty straightforward:如果有大量或不断变化的值,设置起来可能会变得乏味,但代码非常简单:

SELECT d.EmpId,
       SUM(CASE WHEN d.Stages = 'Stg1' THEN 1 ELSE 0 END)        AS [Stg1-Count],
       SUM(CASE WHEN d.Stages = 'Stg1' THEN ISNULL(d.Amount, 0) ELSE 0 END) AS [Amount-Stg1],
       SUM(CASE WHEN d.Stages = 'Stg2' THEN 1 ELSE 0 END)        AS [Stg2-Count],
       SUM(CASE WHEN d.Stages = 'Stg2' THEN ISNULL(d.Amount, 0) ELSE 0 END) AS [Amount-Stg2],
       SUM(CASE WHEN d.Stages = 'Stg3' THEN 1 ELSE 0 END)        AS [Stg3-Count],
       SUM(CASE WHEN d.Stages = 'Stg3' THEN ISNULL(d.Amount, 0) ELSE 0 END) AS [Amount-Stg3]
  FROM #RawData AS d
 GROUP BY d.EmpId;
EmpId EmpId Stg1-Count Stg1-计数 Amount-Stg1金额-Stg1 Stg2-Count Stg2-计数 Amount-Stg2金额-Stg2 Stg3-Count Stg3-计数 Amount-Stg3金额-Stg3
101 101 2 2 20.3100 20.3100 2 2 13.4400 13.4400 1 1 6.2100 6.2100
102 102 1 1 3.2300 3.2300 1 1 2.2200 2.2200 2 2 4.4400 4.4400
103 103 1 1 1.5500 1.5500 0 0 0.0000 0.0000 0 0 0.0000 0.0000

It uses a GROUP BY on the EmpId to aggregate everything per employee, and CASE s are evaluated to construct the additional columns.它在 EmpId 上使用GROUP BY来聚合每个员工的所有内容,并评估CASE以构造其他列。 Using a SUM rather than a COUNT provides for a conditional counting, otherwise we'd just get the count of all rows for that employee regardless of the 'Stage' value.使用SUM而不是COUNT提供了条件计数,否则我们只会获取该员工的所有行的计数,而不管“阶段”值如何。

One option for this is to use dynamic SQL and PIVOT to get a more elastic result set.一种选择是使用动态 SQL 和PIVOT来获得更具弹性的结果集。

Using an adapted version of your result set that includes some new 'Stages' values使用包含一些新“阶段”值的结果集的改编版本

/* Set up the sample data */
DROP TABLE IF EXISTS #RawData;

CREATE TABLE #RawData
     (
         EmpId INT,
         Stages NVARCHAR(5),
         Amount DECIMAL(9, 4)
     );

DECLARE @SqlStatement NVARCHAR(MAX);

INSERT INTO #RawData
VALUES (101, N'Stg1', 10.2),
       (101, N'Stg2', 10.22),
       (101, N'Stg1', 10.11),
       (101, N'Stg3', 6.21),
       (101, N'Stg2', 3.22),
       (102, N'Stg1', 3.23),
       (102, N'Stg2', 2.22),
       (102, N'Stg3', 1.22),
       (102, N'Stg3', 3.22),
       (103, N'Stg0', 4.00),
       (103, N'Stg1', 1.55),
       (103, N'Stg4', 14.00);

Next, we'll need some constructs for working with those data in dynamic SQL:接下来,我们需要一些结构来处理动态 SQL 中的这些数据:

/* Set up some variables to store comma-delimited values and column names */
    DECLARE @stages      NVARCHAR(255),
            @amountAlias NVARCHAR(255),
            @countAlias  NVARCHAR(255),
            @resultCols  NVARCHAR(255);

/* Store comma-delimited column names and selection aliases in variables for use in the dynamic blocks */
SELECT @stages      = STRING_AGG(d.Stages, ','),
       @amountAlias = STRING_AGG(CONCAT(N'ISNULL(a.', d.Stages, N', 0) AS [Amount-', d.Stages, N']'), ', '),
       @countAlias  = STRING_AGG(CONCAT(N'ISNULL(c.', d.Stages, N', 0) AS [', d.Stages, N'-Count]'), ', '),
       @resultCols  = STRING_AGG(CONCAT(N'c.[', d.Stages, N'-Count], a.[Amount-', d.Stages, N']'), ', ')
  FROM (SELECT DISTINCT Stages FROM #RawData) AS d;

Then, it's a matter of constructing the dynamic code using those variables:然后,就是使用这些变量构建动态代码的问题:

SELECT @SqlStatement = N';WITH Amounts AS (SELECT a.EmpId,' + @amountAlias
                       + N'
  FROM (SELECT EmpId, Stages, ISNULL(Amount, 0) AS Amount FROM #RawData) AS src
  PIVOT (SUM(src.Amount)
FOR src.Stages IN ('   + @stages + N') ) AS a),
    Counts AS (SELECT c.EmpId,' + @countAlias
                       + N'
  FROM (SELECT EmpId, Stages, ISNULL(Amount, 0) AS Amount FROM #RawData) AS src
  PIVOT (COUNT(src.Amount)
FOR src.Stages IN ('   + @stages + N') ) AS c)
SELECT a.EmpId, '      + @resultCols + N'
  FROM Amounts AS a
  JOIN Counts AS c
    ON a.EmpId = c.EmpId;';

Finally, you can execute the dynamic SQL using EXEC (@SqlStatement);最后,您可以使用EXEC (@SqlStatement);

In this example, there are two CTEs defined using PIVOT operations: one for the amount aggregate, and the other for the counts.在此示例中,使用PIVOT操作定义了两个 CTE:一个用于金额聚合,另一个用于计数。 The column-name strings are constructed using the distinct 'Stages' values, and are aliased with the associated aggregation-type names.列名字符串是使用不同的“阶段”值构造的,并使用关联的聚合类型名称作为别名。

This solution has the advantage of adjusting to and returning new 'Stages' values, but is more complicated and generates a more complex execution plan, more table scans, more logical reads, and at a larger scale will run into cardinality estimation limitations.此解决方案具有调整并返回新的“阶段”值的优点,但更复杂并生成更复杂的执行计划、更多的表扫描、更多的逻辑读取,并且在更大规模时会遇到基数估计限制。

| EmpId | Stg0-Count | Amount-Stg0 | Stg1-Count | Amount-Stg1 | Stg2-Count | Amount-Stg2 | Stg3-Count | Amount-Stg3 | Stg4-Count | Amount-Stg4 |
|-------|------------|-------------|------------|-------------|------------|-------------|------------|-------------|------------|-------------|
|   101 |          0 |      0.0000 |          2 |     20.3100 |          2 |     13.4400 |          1 |      6.2100 |          0 |      0.0000 |
|   102 |          0 |      0.0000 |          1 |      3.2300 |          1 |      2.2200 |          2 |      4.4400 |          0 |      0.0000 |
|   103 |          1 |      4.0000 |          1 |      1.5500 |          0 |      0.0000 |          0 |      0.0000 |          1 |     14.0000 |

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM