GROUP BY statement with a PIVOT command in T-SQL

Question

I have a SQL query that involves a PIVOT command that creates the correct table but now I need to GROUP BY one of the columns.

When I try to add the GROUP BY statement, it is returning the following error: "Column 'PivotTable.1' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."

SELECT [Load ID],[1],[2],[3],[4]
FROM TMS_Load_Stops
PIVOT (
    MIN([Stop Zip])
    for [Sequence] IN ([1],[2],[3],[4])
    ) PivotTable
;

The original code yields the below results:

And I would like the results to be as follows (values are random numbers for explanation purposes):

Answer 1

If you omit the Group by clause it automatically takes [Load ID] as grouping column. We have three types of columns in pivot - Grouping column, aggregating column and spanning column. Here the grouping is [Load ID], spanning is [Sequence] and aggregating is [Stop Zip]. Use the below query.

SELECT [Load ID],[1],[2],[3],[4]
FROM TMS_Load_Stops
PIVOT (
    MIN([Stop Zip])
    for [Sequence] IN ([1],[2],[3],[4])
    ) PivotTable

Answer 2

Use conditional aggregation. It is much simpler:

SELECT [Load ID],
       MIN(CASE WHEN [Sequence] = 1 THEN [Stop Zip] END) as [1],
       MIN(CASE WHEN [Sequence] = 2 THEN [Stop Zip] END) as [2],
       MIN(CASE WHEN [Sequence] = 3 THEN [Stop Zip] END) as [3],
       MIN(CASE WHEN [Sequence] = 4 THEN [Stop Zip] END) as [4],
FROM TMS_Load_Stops
GROUP BY [Load ID];

The PIVOT doesn't work because you have additional columns in the table being pivoted. I just don't like the syntax or how it works, but you can also fix it by only selecting the columns you need:

SELECT [Load ID], [1], [2], [3], [4]
FROM (SELECT [Load ID], [Sequence], [Stop Zip]
      FROM TMS_Load_Stops
     ) ls
PIVOT (
    MIN([Stop Zip])
    for [Sequence] IN ([1],[2],[3],[4])
    ) PivotTable
;

Answer 3

You have to "project away" any additional column in TMS_Load_Stops before the PIVOT because it already performs grouping - using all columns not mentioned in the PIVOT :

SELECT [Load ID],[1],[2],[3],[4]
FROM (select [Load ID],[Sequence],[Stop Zip] from TMS_Load_Stops) t
PIVOT (
    MIN([Stop Zip])
    for [Sequence] IN ([1],[2],[3],[4])
    ) PivotTable
;

Answer 4

Here's an example of specifying the aggregation method for each field & realiasing:

SELECT 
    [Load ID]
    ,sum([1]) AS [1]
    ,sum([2]) AS [2]
    ,sum([3]) AS [3]
    ,sum([4]) AS [4]
FROM (select * from TMS_Load_Stops) src
PIVOT (
    MIN([Stop Zip])
    for [Sequence] IN ([1],[2],[3],[4])
    ) PivotTable
GROUP BY [Load ID]
;

In lines 3-6 we've added an aggregation method 'SUM' to sum the field records upon aggregation. This way, the interpreter knows how to aggregate when grouping with 'GROUP BY'.

^ We want to steer clear of conditional 'CASE WHEN' if at all possible since its computationally expensive. PIVOT works like expected; it unstacks the field-value relationship into their own columns.

You can also reference the pivot table as a 'common table expression' by calling it again. This way the 'GROUP BY' syntax is implicit. It would look like this:

Select * from (
    SELECT 
        [Load ID]
        ,[1]
        ,[2]
        ,[3]
        ,[4]
    FROM (select * from TMS_Load_Stops) src
    PIVOT (
        MIN([Stop Zip])
        for [Sequence] IN ([1],[2],[3],[4])
        ) PivotTable 
) AggregatedPivot
    ;

GROUP BY statement with a PIVOT command in T-SQL

Question

4 answers

solution1
2 2019-01-11 14:53:41

solution2
1 ACCPTED 2019-01-11 14:57:24

solution3
1 2019-01-11 14:58:18

solution4
0 2022-01-31 22:14:59

GROUP BY statement with a PIVOT command in T-SQL

Question

4 answers

solution1 2 2019-01-11 14:53:41

solution2 1 ACCPTED 2019-01-11 14:57:24

solution3 1 2019-01-11 14:58:18

solution4 0 2022-01-31 22:14:59

solution1
2 2019-01-11 14:53:41

solution2
1 ACCPTED 2019-01-11 14:57:24

solution3
1 2019-01-11 14:58:18

solution4
0 2022-01-31 22:14:59