Can I include a non-aggregated column in an aggregate query in SQL without putting it into a GROUP BY clause?
BigQuery: GROUP BY include the non-aggregated column
Suppose I have a table like the following:
| sku | date       | time  | inc_col | latest_col1 | latest_col2 |
+-----+------------+-------+---------+-------------+-------------+
| 1   | 2020-10-26 | 08:00 | 100     | 10          | a           |
| 1   | 2020-10-26 | 10:00 | -10     | 11          | b           |
| 1   | 2020-10-26 | 06:00 | 5       | 7           | c           |
| 2   | 2020-10-26 | 08:00 | 300     | 4           | x           |
| 2   | 2020-10-26 | 10:00 | -100    | 4           | y           |
| 2   | 2020-10-26 | 03:00 | 10      | 8           | z           |
This query produces the following result:
SELECT sku, date, SUM(inc_col) AS inc_col
FROM tbl
GROUP BY sku, date;
| sku | date       | inc_col |
+-----+------------+---------+
| 1   | 2020-10-26 | 95      |
| 2   | 2020-10-26 | 210     |
Is it possible to also include the last value of 'latest_col1' and 'latest_col2', ordered by the "time" column, as shown below?
| sku | date       | inc_col | latest_col1 | latest_col2 |
+-----+------------+---------+-------------+-------------+
| 1   | 2020-10-26 | 95      | 11          | b           |
| 2   | 2020-10-26 | 210     | 4           | y           |
Can this be done with a window function? The table has hundreds of columns of both kinds, "inc_col" and "latest_col".
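You can use SUM as a window function to compute inc_col, and FIRST_VALUE over a window ordered by time descending to pick the most recent latest_col1 and latest_col2. DISTINCT then collapses the result to one row per (sku, date):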
SELECT DISTINCT
  sku,
  date,
  SUM(inc_col) OVER (PARTITION BY sku, date) AS inc_col,
  FIRST_VALUE(latest_col1) OVER (PARTITION BY sku, date ORDER BY time DESC) AS latest_col1,
  FIRST_VALUE(latest_col2) OVER (PARTITION BY sku, date ORDER BY time DESC) AS latest_col2
FROM tbl;
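If you want to sanity-check the window-function logic without a BigQuery project, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. This is an assumption-laden stand-in, not BigQuery itself: it assumes a SQLite build with window-function support (3.25 or newer), stores date and time as zero-padded text so they sort correctly, and the names `ROWS`, `QUERY` and `run_window_query` are made up for the example. An ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

# Sample rows copied from the question: (sku, date, time, inc_col, latest_col1, latest_col2)
ROWS = [
    (1, "2020-10-26", "08:00", 100, 10, "a"),
    (1, "2020-10-26", "10:00", -10, 11, "b"),
    (1, "2020-10-26", "06:00", 5, 7, "c"),
    (2, "2020-10-26", "08:00", 300, 4, "x"),
    (2, "2020-10-26", "10:00", -100, 4, "y"),
    (2, "2020-10-26", "03:00", 10, 8, "z"),
]

# Same shape as the BigQuery query above; ORDER BY sku only fixes the row order.
QUERY = """
SELECT DISTINCT
  sku,
  date,
  SUM(inc_col) OVER (PARTITION BY sku, date) AS inc_col,
  FIRST_VALUE(latest_col1) OVER (PARTITION BY sku, date ORDER BY time DESC) AS latest_col1,
  FIRST_VALUE(latest_col2) OVER (PARTITION BY sku, date ORDER BY time DESC) AS latest_col2
FROM tbl
ORDER BY sku
"""


def run_window_query(rows=ROWS):
    """Load the rows into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tbl (sku INTEGER, date TEXT, time TEXT, "
            "inc_col INTEGER, latest_col1 INTEGER, latest_col2 TEXT)"
        )
        conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_window_query():
        print(row)
    # (1, '2020-10-26', 95, 11, 'b')
    # (2, '2020-10-26', 210, 4, 'y')
```

Each group gets the sum over all of its rows plus the latest_col values from its 10:00 row, which is the most recent one in the sample data.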
Array functions are a simple way to do this:
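SELECT sku, date, SUM(inc_col) AS inc_col,
       -- ORDER BY time DESC LIMIT 1 keeps only the most recent value per group
       ARRAY_AGG(latest_col1 ORDER BY time DESC LIMIT 1)[ORDINAL(1)] AS latest_col1,
       ARRAY_AGG(latest_col2 ORDER BY time DESC LIMIT 1)[ORDINAL(1)] AS latest_col2
FROM tbl
GROUP BY sku, date;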
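Since the question mentions hundreds of columns, typing one ARRAY_AGG line per column gets tedious. One option is to generate the query text from lists of column names. The sketch below is only an illustration: `build_query` and its parameters are hypothetical, it picks the latest value by ordering descending and keeping one element, and it interpolates identifiers without any escaping, so it assumes the names come from a trusted source such as your own schema.

```python
def build_query(table, keys, inc_cols, latest_cols, order_col="time"):
    """Return BigQuery SQL text: SUM for inc_cols, latest value for latest_cols.

    Hypothetical helper. Identifiers are inserted as-is (no quoting or
    escaping), so only pass trusted column and table names.
    """
    select = list(keys)
    # Additive columns are summed per group.
    select += [f"SUM({c}) AS {c}" for c in inc_cols]
    # "Latest" columns take the value from the row with the greatest order_col.
    select += [
        f"ARRAY_AGG({c} ORDER BY {order_col} DESC LIMIT 1)[ORDINAL(1)] AS {c}"
        for c in latest_cols
    ]
    body = ",\n  ".join(select)
    return f"SELECT\n  {body}\nFROM {table}\nGROUP BY {', '.join(keys)}"


if __name__ == "__main__":
    print(build_query("tbl", ["sku", "date"], ["inc_col"],
                      ["latest_col1", "latest_col2"]))
```

The column lists themselves could come from wherever you keep the schema; how to classify each column as "inc" or "latest" is left to you.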
In fact, if you want, you can get the entire most recent row:
SELECT sku, date, SUM(inc_col) AS inc_col,
       -- latest_rec is a STRUCT holding the whole most recent row of the group
       ARRAY_AGG(t ORDER BY t.time DESC LIMIT 1)[ORDINAL(1)] AS latest_rec
FROM tbl t
GROUP BY sku, date;