Combine multiple rows with N-1 identical columns and 1 different column into one row, preserving the first N-1 columns and summing the last column

Question

I have a query that produces a table with 26 columns, A-Z. For some rows, columns A-Y are identical and column Z is the only one that differs. Is there an easy, clean way to combine these rows so that columns A-Y are kept as they are and column Z is summed? My current solution looks something like this:

SELECT A, B, C, ..., Y, SUM(Z)
-- lots of work
FROM [table produced by multiple joins]
GROUP BY A, B, C, ..., Y

The GROUP BY clause ends up being very long. It is also error-prone if columns are ever added to or removed from the SELECT list. Is this the only way to do what I want?

Answer 1

The query below is for BigQuery Standard SQL:

#standardSQL
SELECT 
  ANY_VALUE((SELECT AS STRUCT t.* EXCEPT(z))).*,
  SUM(z) AS z
FROM `project.dataset.table_produced_by_multiple_joins` t
GROUP BY FORMAT('%t', (SELECT AS STRUCT t.* EXCEPT(z)))
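
To see how this behaves, here is a minimal sketch of the same query run against inline data. The `sample` CTE, its three columns (`a`, `b`, `z`), and its values are made up for illustration and stand in for the 26-column join result. The idea is that `SELECT AS STRUCT t.* EXCEPT(z)` packs every column except `z` into one struct, `FORMAT('%t', ...)` turns that struct into a string used as the grouping key, and `ANY_VALUE(...).*` expands one representative struct per group back into columns.

```sql
#standardSQL
-- Hypothetical sample data standing in for the real join result.
WITH sample AS (
  SELECT 'x' AS a, 1 AS b, 10 AS z UNION ALL
  SELECT 'x', 1, 5 UNION ALL
  SELECT 'y', 2, 7
)
SELECT
  -- Pick one struct of the non-z columns per group and expand it to columns.
  ANY_VALUE((SELECT AS STRUCT t.* EXCEPT(z))).*,
  -- Sum the differing column within each group.
  SUM(z) AS z
FROM sample t
-- Group on the string form of all columns except z.
GROUP BY FORMAT('%t', (SELECT AS STRUCT t.* EXCEPT(z)))
```

With this data the two `'x'` rows should collapse into a single row with `z` = 15, while the `'y'` row is left alone. Because the grouping key is a formatted string, rows are merged when their non-`z` columns print identically, so it is worth checking that this matches ordinary GROUP BY equality for your column types before relying on it.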

solution1
2 ACCEPTED 2020-09-11 20:22:56