
BigQuery: how to join 2 tables based on overlapping intervals?

I need to convert this join from Postgres to BigQuery:

-- postgres
SELECT *
FROM table_a
INNER JOIN table_b
ON tstzrange(table_a.start, table_a.end) && tstzrange(table_b.start, table_b.end)

I've tried spelling out the 4 overlapping cases, but in BigQuery it doesn't work like that. My result always ends up missing some windows.

INNER JOIN table_b
ON 
-- case 1
a.start <= b.start AND b.end <= a.end
OR
-- case 2
a.start <= b.start AND b.start <= a.end
OR 
-- case 3
b.start <= a.start AND a.end <= b.end
OR
-- case 4
b.start <= a.start AND a.start <= b.end

Consider the condition below to check whether two periods overlap.

  • a.start <= b.end AND b.start <= a.end

A sample query:

WITH table_a AS (
  SELECT TIMESTAMP '2023-01-01 09:00:00' start, TIMESTAMP '2023-01-10 09:00:00' `end` UNION ALL
  SELECT '2023-01-05 09:00:00' start, '2023-01-15 09:00:00' `end` 
),
table_b AS (
  SELECT TIMESTAMP '2022-12-15 09:00:00' start, TIMESTAMP '2023-01-04 09:00:00' `end` UNION ALL 
  SELECT '2023-01-12 09:00:00' start, '2023-01-20 09:00:00' `end`
)
SELECT * FROM table_a a JOIN table_b b ON a.start <= b.end AND b.start <= a.end;

Query results: (screenshot not available)
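As a cross-check, the sketch below compares the four-case predicate from the question with the two-comparison condition over every pair of small integer intervals. The CTE names `points` and `intervals` are hypothetical, integers stand in for timestamps, and every interval is assumed to be non-NULL with start <= end. Under those assumptions the two predicates should agree on every pair, which suggests that the missing rows come from something other than the overlap logic itself.

```sql
-- Hypothetical test data: all integer intervals [start, end] with 1 <= start <= end <= 4.
WITH points AS (
  SELECT p FROM UNNEST(GENERATE_ARRAY(1, 4)) AS p
),
intervals AS (
  SELECT s.p AS start, e.p AS `end`
  FROM points s
  JOIN points e ON s.p <= e.p
)
SELECT
  COUNT(*) AS pairs,
  -- Count the pairs where the four-case predicate and the compact predicate disagree.
  COUNTIF(
    (
      (a.start <= b.start AND b.`end` <= a.`end`)     -- case 1
      OR (a.start <= b.start AND b.start <= a.`end`)  -- case 2
      OR (b.start <= a.start AND a.`end` <= b.`end`)  -- case 3
      OR (b.start <= a.start AND a.start <= b.`end`)  -- case 4
    ) != (a.start <= b.`end` AND b.start <= a.`end`)
  ) AS mismatches
FROM intervals a
CROSS JOIN intervals b;
```

Cases 1 and 3 are already covered by cases 2 and 4 when start <= end, so `mismatches` should come back as 0 here.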

The issue was that Postgres tstzrange automatically treats a NULL bound as an unbounded lower or upper bound. Plain comparison operators do not do this, in Postgres or in BigQuery. Using COALESCE() solved the issue.
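A minimal sketch of what the COALESCE() approach might look like in BigQuery. It assumes that a NULL start or end means "unbounded", as it does in tstzrange. The sentinel values are an assumption of this sketch (the minimum and maximum of BigQuery's TIMESTAMP range), not something taken from the original post.

```sql
-- Sample rows: table_a has an open-ended interval (NULL end).
WITH table_a AS (
  SELECT TIMESTAMP '2023-01-01 09:00:00' AS start, CAST(NULL AS TIMESTAMP) AS `end`
),
table_b AS (
  SELECT TIMESTAMP '2023-01-12 09:00:00' AS start, TIMESTAMP '2023-01-20 09:00:00' AS `end`
)
SELECT *
FROM table_a a
JOIN table_b b
  -- Without COALESCE, "b.start <= a.end" evaluates to NULL and the row is dropped.
  -- Assumed sentinels: a NULL start becomes the earliest TIMESTAMP, a NULL end the latest.
  ON COALESCE(a.start, TIMESTAMP '0001-01-01 00:00:00')
       <= COALESCE(b.`end`, TIMESTAMP '9999-12-31 23:59:59.999999')
 AND COALESCE(b.start, TIMESTAMP '0001-01-01 00:00:00')
       <= COALESCE(a.`end`, TIMESTAMP '9999-12-31 23:59:59.999999');
```

With the sentinels in place, the open-ended row in table_a joins to the table_b row instead of disappearing. One further difference to keep in mind: tstzrange(lower, upper) defaults to half-open '[)' bounds, so if intervals that only touch at an endpoint should not match, `<` may be a closer equivalent than `<=`.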
