How to join two tables by dependent match keys in BigQuery?
I have two tables in BigQuery. The first one is a list of rates. For each code - offer combination there is a default rate, stored with source equal to -1. Apart from the default, some code - offer combinations also have rates for a specific source.
The second table has the same columns as the first table except rate, plus any other data.
My goal is to join rates on matching code - offer - source, and otherwise fall back to the default rate on matching code - offer with source equal to -1.
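To make the intended fallback rule concrete, here is a minimal sketch in plain Python rather than BigQuery SQL. The `RATES` dictionary holds a few rows copied from the sample data in this question; the `lookup_rate` helper and the constant name are hypothetical and only illustrate the rule.

```python
# Rates keyed by (code, offer, source); source -1 marks the default rate.
# Values are a subset of the sample table t1 from the question.
RATES = {
    ("SA", "offer1", 21): 2.4,
    ("SA", "offer1", -1): 3.0,
    ("SA", "offer2", -1): 4.0,
    ("YN", "offer1", 47): 2.7,
    ("YN", "offer1", -1): 5.4,
}
DEFAULT_SOURCE = -1


def lookup_rate(code, offer, source, rates=RATES):
    """Return the source-specific rate if one exists, else the default rate.

    Returns None when the code/offer pair has no rate at all.
    """
    exact = rates.get((code, offer, source))
    if exact is not None:
        return exact
    return rates.get((code, offer, DEFAULT_SOURCE))


print(lookup_rate("SA", "offer1", 21))  # source-specific rate exists
print(lookup_rate("SA", "offer2", 21))  # falls back to the default rate
```

The join in BigQuery needs to express the same two-step rule: try the exact source first, then the row with source equal to -1.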
In the example below, the query returns default rates only:
WITH t1 AS (SELECT 21 as source, 'SA' as code, 'offer1' as offer, 2.4 as rate
UNION ALL
SELECT 33, 'SA', 'offer1', 2.5
UNION ALL
SELECT 39, 'SA', 'offer1', 2.1
UNION ALL
SELECT -1, 'SA', 'offer1', 3
UNION ALL
SELECT -1, 'SA', 'offer2', 4
UNION ALL
SELECT 47, 'YN', 'offer1', 2.7
UNION ALL
SELECT -1, 'YN', 'offer1', 5.4
UNION ALL
SELECT -1, 'YN', 'offer2', 0.9
UNION ALL
SELECT -1, 'RE', 'offer1', 5.7
UNION ALL
SELECT -1, 'RE', 'offer2', 3.4),
t2 as (SELECT 21 as source, 'SA' as code, 'offer1' as offer, "any data" as other_columns
UNION ALL SELECT 21, 'SA', 'offer1', "any data"
UNION ALL SELECT 21, 'SA', 'offer1', "any data"
UNION ALL SELECT 21, 'SA', 'offer2', "any data"
UNION ALL SELECT 47, 'YN', 'offer1', "any data"
UNION ALL SELECT 47, 'YN', 'offer2', "any data"
UNION ALL SELECT 50, 'YN', 'offer1', "any data"
UNION ALL SELECT 47, 'YN', 'offer2', "any data"
UNION ALL SELECT 78, 'RE', 'offer1', "any data"
UNION ALL SELECT 66, 'RE', 'offer2', "any data")
SELECT t2.*, rate FROM t2
LEFT JOIN t1 ON t1.offer = t2.offer AND t1.code = t2.code AND IF (t1.source = t1.source AND rate IS NULL, t1.source = t2.source, t1.source = - 1)
The next query returns rates for the specified source, and null when the source did not match:
SELECT t2.*, rate FROM t2
LEFT JOIN t1 ON t1.offer = t2.offer AND t1.code = t2.code AND IF (t1.source = t1.source AND rate IS NOT NULL, t1.source = t2.source, t1.source = - 1)
How can I join the rates correctly?
You can left join twice and use conditional logic:
select t2.*, coalesce(t11.rate, t12.rate) rate
from t2
left join t1 t11
on t11.code = t2.code
and t11.offer = t2.offer
and t11.source = t2.source
left join t1 t12
on t12.code = t2.code
and t12.offer = t2.offer
and t12.source = -1
and t11.code is null
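The double left join can be checked locally without BigQuery. The sketch below loads the sample data from the question into an in-memory SQLite database through Python's `sqlite3` module and runs an equivalent query. SQLite is only a stand-in here, the function name `run_double_join` is hypothetical, and `ORDER BY t2.rowid` is added just to keep the output in insertion order.

```python
import sqlite3

# Sample rows copied from the question.
T1_ROWS = [
    (21, "SA", "offer1", 2.4), (33, "SA", "offer1", 2.5),
    (39, "SA", "offer1", 2.1), (-1, "SA", "offer1", 3),
    (-1, "SA", "offer2", 4), (47, "YN", "offer1", 2.7),
    (-1, "YN", "offer1", 5.4), (-1, "YN", "offer2", 0.9),
    (-1, "RE", "offer1", 5.7), (-1, "RE", "offer2", 3.4),
]
T2_ROWS = [
    (21, "SA", "offer1"), (21, "SA", "offer1"), (21, "SA", "offer1"),
    (21, "SA", "offer2"), (47, "YN", "offer1"), (47, "YN", "offer2"),
    (50, "YN", "offer1"), (47, "YN", "offer2"), (78, "RE", "offer1"),
    (66, "RE", "offer2"),
]

QUERY = """
SELECT t2.source, t2.code, t2.offer,
       COALESCE(t11.rate, t12.rate) AS rate
FROM t2
LEFT JOIN t1 AS t11            -- exact match on source
  ON t11.code = t2.code AND t11.offer = t2.offer
 AND t11.source = t2.source
LEFT JOIN t1 AS t12            -- default rate, only if no exact match
  ON t12.code = t2.code AND t12.offer = t2.offer
 AND t12.source = -1 AND t11.code IS NULL
ORDER BY t2.rowid
"""


def run_double_join():
    """Build the two sample tables in memory and return the joined rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (source INTEGER, code TEXT, offer TEXT, rate REAL)")
    conn.execute("CREATE TABLE t2 (source INTEGER, code TEXT, offer TEXT, other_columns TEXT)")
    conn.executemany("INSERT INTO t1 VALUES (?, ?, ?, ?)", T1_ROWS)
    conn.executemany("INSERT INTO t2 VALUES (?, ?, ?, 'any data')", T2_ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in run_double_join():
    print(row)
```

All ten rows of t2 are kept, including the duplicates, because each t2 row matches at most one row in each of the two joins.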
Below is for BigQuery Standard SQL:
#standardSQL
select any_value(t2).*,
array_agg(rate order by t1.source = t2.source desc, t1.source = -1 desc limit 1)[offset(0)] rate
from t2
left join t1
on t1.code = t2.code
and t1.offer = t2.offer
group by format('%t', t2)
Applied to the sample data from your question, the query above avoids the double join. The only side effect is that the result is deduplicated: duplicate rows that are present in table 2 are eliminated.
I need the duplicate rows.
Sure, almost the same query with a small change gives you all rows:
#standardSQL
select any_value(t2).*,
array_agg(rate order by t1.source = t2.source desc, t1.source = -1 desc limit 1)[offset(0)] rate
from t2, unnest([rand()]) as r
left join t1
on t1.code = t2.code
and t1.offer = t2.offer
group by format('%t', t2), r
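The effect of the grouping key on duplicates can be reproduced locally. The sketch below uses Python's `sqlite3` module as a stand-in for BigQuery, with a few rows from the sample data. It is not the same query: SQLite has no `array_agg ... limit 1`, so the rate is picked with conditional aggregation instead, and SQLite's built-in `rowid` plays the role of the per-row unique value `r`. The function name `rates_per_row` is hypothetical.

```python
import sqlite3


def rates_per_row(keep_duplicates):
    """Single join plus GROUP BY; optionally add a per-row unique key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t1 (source INTEGER, code TEXT, offer TEXT, rate REAL);
        CREATE TABLE t2 (source INTEGER, code TEXT, offer TEXT);
        INSERT INTO t1 VALUES (21, 'SA', 'offer1', 2.4),
                              (-1, 'SA', 'offer1', 3),
                              (-1, 'SA', 'offer2', 4);
        INSERT INTO t2 VALUES (21, 'SA', 'offer1'),
                              (21, 'SA', 'offer1'),
                              (21, 'SA', 'offer2');
    """)
    # Grouping by the row content alone collapses identical t2 rows;
    # adding a unique per-row value keeps each of them as its own group.
    group_key = "t2.source, t2.code, t2.offer"
    if keep_duplicates:
        group_key += ", t2.rowid"
    query = f"""
        SELECT t2.source, t2.code, t2.offer,
               COALESCE(MAX(CASE WHEN t1.source = t2.source THEN t1.rate END),
                        MAX(CASE WHEN t1.source = -1 THEN t1.rate END)) AS rate
        FROM t2
        LEFT JOIN t1 ON t1.code = t2.code AND t1.offer = t2.offer
        GROUP BY {group_key}
        ORDER BY MIN(t2.rowid)
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


print(rates_per_row(keep_duplicates=False))  # identical t2 rows collapse
print(rates_per_row(keep_duplicates=True))   # one output row per t2 row
```

In a real table a stable unique column, if one exists, would probably be a more predictable grouping key than a random value.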