BigQuery Fuzzy Match Join Or Using A Range
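In BigQuery, is there a way to use a fuzzy match in a join, or perhaps use a regular expression to match a range of values?
For example, I have the query below, where the "duration" values can differ by +/- 30, so if callhistory.duration = 268 it should match calltracking.duration = 292, which falls within the allowed range of 238 to 298.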
select
  calltracking.date,
  calltracking.calling_phone_number,
  calltracking.duration,
  callhistory.row_date,
  callhistory.callid,
  callhistory.calling_pty,
  callhistory.duration,
  calltracking.start_time_utc,
  callhistory.segstart_utc
from
  (SELECT
    cast(date(start_time_local) as string) as date,
    calling_phone_number,
    start_time_utc,
    duration,
    utm_medium,
    utm_source
  FROM [xxx:calltracking.calls]) calltracking
left join
  (select
    *
  FROM [xxx:datamart.callhistory]) callhistory
on (callhistory.calling_pty = calltracking.calling_phone_number) and
  (callhistory.row_date = calltracking.date) and
  (callhistory.duration = calltracking.duration)
The simplified example below is for BigQuery Standard SQL:
#standardSQL
-- Dummy one-row tables standing in for the real ones
WITH `xxx.calltracking.calls` AS (
  SELECT 1 id, 292 duration
), `xxx.datamart.callhistory` AS (
  SELECT 2 id, 268 duration
)
SELECT
  t.id tid,
  t.duration tduration,
  h.id hid,
  h.duration hduration
FROM `xxx.calltracking.calls` t
LEFT JOIN `xxx.datamart.callhistory` h
-- Match when the two durations are within 30 of each other
ON t.duration BETWEEN h.duration - 30 AND h.duration + 30
Note: this will not work in BigQuery #legacySQL, which appears to be what you are using in your question.
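To apply the same range condition to the full query from the question, one possible Standard SQL version is sketched below. It is not tested against the real tables and makes several assumptions: the tables are reachable as `xxx.calltracking.calls` and `xxx.datamart.callhistory` (dot-separated paths instead of the legacy bracket syntax), start_time_local is a TIMESTAMP or DATETIME, and row_date is a STRING in YYYY-MM-DD form. The names call_date, tracking_duration, and history_duration are made up for illustration.

```sql
#standardSQL
-- Sketch only: table paths and column types are assumptions, see the note above.
SELECT
  t.call_date,
  t.calling_phone_number,
  t.duration AS tracking_duration,   -- aliased so the two duration columns do not collide
  h.row_date,
  h.callid,
  h.calling_pty,
  h.duration AS history_duration,
  t.start_time_utc,
  h.segstart_utc
FROM (
  SELECT
    CAST(DATE(start_time_local) AS STRING) AS call_date,
    calling_phone_number,
    start_time_utc,
    duration
  FROM `xxx.calltracking.calls`
) AS t
LEFT JOIN `xxx.datamart.callhistory` AS h
  ON h.calling_pty = t.calling_phone_number
  AND h.row_date = t.call_date
  -- Range match: durations within 30 of each other
  AND t.duration BETWEEN h.duration - 30 AND h.duration + 30
```

Because the duration condition is a range rather than an equality, one calltracking row can match more than one callhistory row when several calls from the same number on the same day have similar durations, so the result may need de-duplicating.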