[英]BigQuery : Scalar Subquery produced more than one - aggregating datetime to array in time interval
我正在尝试查找在特定时间间隔(每行不同的间隔)中发生的事件,并将其添加为一列。 最后附上的两个表:(1)时间间隔,(2)日期时间事件
首先,我为每一行添加了一个包含所有数据时间事件的列作为数组。 例子
其次,我使用此代码计算每个间隔中有多少个日期时间:
-- count how many urine-output chart-events for every hourly interval
SELECT T_PLUS, START_TIME_ROUNDED_UP, TIME_INTERVAL_STARTS, TIME_INTERVAL_FINISH,
sum((SELECT count(*) FROM UNNEST(twi.ca) as x WHERE x BETWEEN twi.TIME_INTERVAL_STARTS AND twi.TIME_INTERVAL_FINISH)) NUMBER_OF_OUTPUTS_IN_INTERVAL
FROM TIMES_WITH_INTERVALS twi
group by T_PLUS, START_TIME_ROUNDED_UP, TIME_INTERVAL_STARTS, TIME_INTERVAL_FINISH
当我尝试添加另一列与间隔中出现的日期时间(在数组中)时,我得到:
标量子查询产生了多个元素
这是我使用的代码:
SELECT T_PLUS, START_TIME_ROUNDED_UP, TIME_INTERVAL_STARTS, TIME_INTERVAL_FINISH,
sum((SELECT count(*) FROM UNNEST(twi.ca) as x WHERE x BETWEEN twi.TIME_INTERVAL_STARTS AND twi.TIME_INTERVAL_FINISH)) NUMBER_OF_OUTPUTS_IN_INTERVAL,
ARRAY_AGG(FORMAT("%T",(SELECT * FROM UNNEST(twi.ca) as x WHERE x BETWEEN twi.TIME_INTERVAL_STARTS AND twi.TIME_INTERVAL_FINISH))) AS ARRAY_OF_TIMES_IN_INTERVALL
FROM TIMES_WITH_INTERVALS twi
group by T_PLUS, START_TIME_ROUNDED_UP, TIME_INTERVAL_STARTS, TIME_INTERVAL_FINISH
更大的图景
我有一张带有日期时间戳+体积测量的表格(最后的图2)。 我想将测量的值聚合到舍入的时间间隔中,这意味着能够对每个事件的测量值进行算术运算。
我对不同的方法持开放态度。
我想做的计算——总和:
我的终极解决方案
感谢 Gordon Linoff 的大力帮助,我能够正确运行我的代码。 当我继续处理“更大的图片”时,我需要一个对未嵌套的 arra 具有更多可变性的解决方案——最终将 Gordon Linoff 解决方案与 CASE 结合起来:
SELECT *,
COUNT(case when x BETWEEN twi.TIME_INTERVAL_STARTS AND twi.TIME_INTERVAL_FINISH then uo.charttime end) AS NUMBER_OF_OUTPUTS_IN_INTERVAL,
ARRAY_AGG(case when x BETWEEN twi.TIME_INTERVAL_STARTS AND twi.TIME_INTERVAL_FINISH then uo.charttime end IGNORE NULLS) AS ARRAY_OF_TIMES_IN_INTERVALL,
ARRAY_AGG(case when x BETWEEN twi.TIME_INTERVAL_STARTS AND twi.TIME_INTERVAL_FINISH then uo.value end IGNORE NULLS) AS ARRAY_OF_UO,
ARRAY_REVERSE(ARRAY_AGG(case when x <= twi.TIME_INTERVAL_STARTS then x end IGNORE NULLS))[OFFSET(0)] AS TIME_BEFORE,
ARRAY_AGG(case when x > twi.TIME_INTERVAL_FINISH then x end IGNORE NULLS)[OFFSET(0)] AS TIME_AFTER,
ARRAY_AGG(case when x > twi.TIME_INTERVAL_FINISH then uo.value end IGNORE NULLS)[OFFSET(0)] AS UO_AFTER,
FROM TIMES_WITH_INTERVALS twi
LEFT JOIN UNNEST(twi.ca) x
ON true
LEFT JOIN uo
ON uo.charttime = x
GROUP BY T_PLUS, START_TIME_ROUNDED_UP, TIME_INTERVAL_STARTS, TIME_INTERVAL_FINISH
ORDER BY T_PLUS
原表截图:
将format()
和array_agg()
放在子查询中:
(SELECT ARRAY_AGG(FORMAT('%T', x))
FROM
WHERE x BETWEEN twi.TIME_INTERVAL_STARTS AND twi.TIME_INTERVAL_FINISH
) AS ARRAY_OF_TIMES_IN_INTERVAL
也就是说,我很不清楚为什么要将时间戳转换为字符串以存储在数组中。 您可以只拥有一个本机类型的数组。
编辑:
您的查询应该如下所示:
SELECT T_PLUS, START_TIME_ROUNDED_UP, TIME_INTERVAL_STARTS, TIME_INTERVAL_FINISH,
COUNT(x) as cnt,
ARRAY_AGG(x) as timestamps
FROM TIMES_WITH_INTERVALS twi LEFT JOIN
UNNEST(twi.ca) x
ON x BETWEEN twi.TIME_INTERVAL_STARTS AND twi.TIME_INTERVAL_FINISH
GROUP BY T_PLUS, START_TIME_ROUNDED_UP, TIME_INTERVAL_STARTS, TIME_INTERVAL_FINISH;
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.