First non-null value (ordered) aggregate function
Given the following table in GBQ (Google BigQuery):
Element, tmed, ingestion_time
Item1, 10.0, 2023-01-01
Item1, 11.0, 2023-01-02
Item2, null, 2023-01-02
Item2, 20.0, 2023-01-03
Item3, 21.0, 2023-01-03
Item3, null, 2023-01-04
Item4, null, 2023-01-04
Item4, null, 2023-01-05
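I want to retrieve the latest non-null value for each element (that is, the non-null tmed with the most recent ingestion_time). This should return the following result: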
Element, tmed, ingestion_time
Item1, 11.0, 2023-01-02
Item2, 20.0, 2023-01-03
Item3, 21.0, 2023-01-03
Item4, null, 2023-01-05
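To do this, I used the aggregate function ANY_VALUE which, even though the documentation does not state it very clearly, takes the first non-null value (see the discussion here). However, it just takes the first non-null value it finds, independently of the DATETIME field ingestion_time. I tried different ORDER BY options, but without success.
Try using the row_number function as follows: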
select element, tmed, ingestion_time
from
(
select *,
row_number() over (partition by element order by case when tmed is not null then 1 else 2 end, ingestion_time desc) rn
from table_name
) T
where rn = 1
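As a quick sanity check of the query above, the sketch below replays it on the sample data from the question. It uses Python's built-in sqlite3 module as a local stand-in for BigQuery, which is an assumption made only so the example runs anywhere (window functions need SQLite 3.25 or newer). The table name table_name is the placeholder from the answer, and the helper latest_non_null is a hypothetical name introduced for illustration.

```python
import sqlite3

# Sample rows from the question: (element, tmed, ingestion_time).
ROWS = [
    ("Item1", 10.0, "2023-01-01"),
    ("Item1", 11.0, "2023-01-02"),
    ("Item2", None, "2023-01-02"),
    ("Item2", 20.0, "2023-01-03"),
    ("Item3", 21.0, "2023-01-03"),
    ("Item3", None, "2023-01-04"),
    ("Item4", None, "2023-01-04"),
    ("Item4", None, "2023-01-05"),
]

# Same logic as the answer; the outer ORDER BY only makes the output stable.
QUERY = """
select element, tmed, ingestion_time
from (
    select *,
           row_number() over (
               partition by element
               order by case when tmed is not null then 1 else 2 end,
                        ingestion_time desc
           ) as rn
    from table_name
) T
where rn = 1
order by element
"""


def latest_non_null(rows):
    """Load rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table table_name (element text, tmed real, ingestion_time text)"
        )
        conn.executemany("insert into table_name values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_non_null(ROWS):
        print(row)
```

Dates are stored as ISO-8601 strings here, so sorting them as text gives the same order as sorting them as dates. In BigQuery the column would be a real DATE or DATETIME.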
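You can use the ROW_NUMBER window function inside a QUALIFY clause, as shown below, partitioning by Element and ordering by tmed IS NULL (which pushes your null values down) and then by ingestion_time DESC (which pulls your latest dates up):
SELECT *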
FROM tab
QUALIFY ROW_NUMBER() OVER(PARTITION BY Element ORDER BY tmed IS NULL, ingestion_time DESC) = 1
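To see why this two-part ordering picks the right row, here is a small plain-Python emulation of the sort for a single partition. It is only an illustration of the ordering idea, not BigQuery code, and the helper first_in_partition is a hypothetical name. The boolean key plays the role of tmed IS NULL: False (non-null) sorts before True (null).

```python
from operator import itemgetter

# (tmed, ingestion_time) rows for one partition: Item3 from the question.
rows = [(21.0, "2023-01-03"), (None, "2023-01-04")]


def first_in_partition(partition):
    """Return the row that ROW_NUMBER() = 1 would keep for this ordering."""
    # Sort by the secondary key first (ingestion_time DESC) ...
    by_time = sorted(partition, key=itemgetter(1), reverse=True)
    # ... then stably by the primary key: non-null rows (False) come
    # before null rows (True), keeping the time order within each group.
    ordered = sorted(by_time, key=lambda r: r[0] is None)
    return ordered[0]


if __name__ == "__main__":
    print(first_in_partition(rows))
```

When every tmed in a partition is null, as for Item4, the first key ties for all rows and the row with the latest ingestion_time is kept.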
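All of these solutions are simple and effective. Nevertheless, in order to generalize this to more fields than just tmed, I found the following solution: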
WITH overwritten_original_table AS (
SELECT * EXCEPT(tmed),
FIRST_VALUE(tmed IGNORE NULLS) OVER (PARTITION BY element ORDER BY ingestion_time DESC) AS tmed
-- Here, you can add more fields with the same FIRST_VALUE logic
FROM original_table
)
SELECT
element,
ANY_VALUE(tmed) AS tmed,
-- Here, you can add more fields with the ANY_VALUE logic
MAX(ingestion_time) AS ingestion_time
FROM overwritten_original_table
GROUP BY element
Since this is a solution for multiple fields, I only took the maximum ingestion_time, but you can modify it to get the ingestion_time of each field.
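The per-field idea can be sketched in plain Python to make the intended result concrete. This is an emulation of the logic, not the BigQuery query itself: the function latest_non_null_per_field and the second column tmax are hypothetical, added only to show two fields whose latest non-null values come from different rows.

```python
def latest_non_null_per_field(rows, key, order_by, fields):
    """For each key, return the latest non-null value of every field.

    rows is an iterable of dicts; order_by names the column used to rank
    rows, where the largest value counts as the latest.
    """
    result = {}
    # Visit rows from newest to oldest so the first non-null value seen wins.
    for row in sorted(rows, key=lambda r: r[order_by], reverse=True):
        if row[key] not in result:
            out = {f: None for f in fields}
            out[order_by] = row[order_by]  # newest row gives MAX(order_by)
            result[row[key]] = out
        out = result[row[key]]
        for f in fields:
            if out[f] is None and row[f] is not None:
                out[f] = row[f]
    return result


if __name__ == "__main__":
    # "tmax" is a made-up second measurement column for illustration.
    sample = [
        {"element": "Item3", "tmed": 21.0, "tmax": None, "ingestion_time": "2023-01-03"},
        {"element": "Item3", "tmed": None, "tmax": 30.0, "ingestion_time": "2023-01-04"},
    ]
    print(latest_non_null_per_field(sample, "element", "ingestion_time", ["tmed", "tmax"]))
```

To also record which ingestion_time each value came from, the inner loop could store row[order_by] next to each field at the moment the field is filled.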