Unable to join using wildcards in BigQuery
I am trying to join two tables in BigQuery. Table1 contains an ID column, and Table2 contains a column that holds either a single ID or multiple IDs as one long comma-separated string, like "id123,id456,id678".
I can join the tables together if Table1.ID = Table2.ID, but this ignores all the rows where Table1.ID is one of the multiple IDs in Table2.ID. I have looked at similar posts that tell me to use wildcards like
on concat('%',Table1.ID,'%') = Table2.ID
but this does not work, because it seems to create a string that contains the '%' character and doesn't actually use it as a wildcard.
I'm using Standard SQL in BigQuery. Any help would be appreciated.
The example below is for BigQuery Standard SQL:
#standardSQL
WITH `project.dataset.table1` AS (
SELECT 123 id, 'a' test UNION ALL
SELECT 456, 'b' UNION ALL
SELECT 678, 'c'
), `project.dataset.table2` AS (
SELECT 'id123,id456' id UNION ALL
SELECT 'id678'
)
SELECT t2.id, test
FROM `project.dataset.table2` t2, UNNEST(SPLIT(id)) id2
JOIN `project.dataset.table1` t1
ON CONCAT('id', CAST(t1.id AS STRING)) = id2
The result is as follows:
Row id test
1 id123,id456 a
2 id123,id456 b
3 id678 c
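The sample data above stores Table1's id as a number, which is why the join rebuilds the 'id' prefix with CONCAT. If Table1.ID already holds strings such as 'id123', as the question seems to describe, the same SPLIT/UNNEST approach might look like the sketch below. The table contents are made-up sample values, and the TRIM call is an assumption that guards against spaces after the commas.

```sql
#standardSQL
-- Hypothetical sample data: Table1.ID is already a string like 'id123'.
WITH table1 AS (
  SELECT 'id123' AS id, 'a' AS test UNION ALL
  SELECT 'id456', 'b' UNION ALL
  SELECT 'id678', 'c'
), table2 AS (
  -- Note the space after the comma in the first row.
  SELECT 'id123, id456' AS id UNION ALL
  SELECT 'id678'
)
SELECT t2.id, t1.test
FROM table2 AS t2
-- Expand each comma-separated string into one row per element.
CROSS JOIN UNNEST(SPLIT(t2.id, ',')) AS id2
JOIN table1 AS t1
  -- Exact comparison per element; TRIM removes surrounding whitespace.
  ON t1.id = TRIM(id2)
```

Because each element is compared with plain equality, an ID such as 'id12' cannot accidentally match 'id123'.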
It is doubtful that you have values in the table that start and end with percent signs. The = operator does not recognize wildcards; like does:
on Table2.ID like concat('%', Table1.ID, '%')
A word of warning: such a construct is usually a performance killer. You would be better off trying to have columns in Table1 and Table2 that match exactly.
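One more thing to watch for with the like approach: a plain '%...%' pattern matches any substring, so a shorter ID can match inside a longer one. The query below is a small illustration using literal values (the IDs are made up for the example). It also shows one possible workaround, which is to wrap both sides in commas so that only whole list elements match.

```sql
#standardSQL
SELECT
  -- = compares the strings literally, so the '%' characters are not wildcards.
  CONCAT('%', 'id12', '%') = 'id123,id456' AS equals_match,               -- false
  -- LIKE treats '%' as a wildcard, but 'id12' is a substring of 'id123'.
  'id123,id456' LIKE CONCAT('%', 'id12', '%') AS naive_like,              -- true (false positive)
  -- Wrapping both sides in commas only matches complete elements.
  CONCAT(',', 'id123,id456', ',') LIKE CONCAT('%,', 'id12', ',%') AS delimited_short,  -- false
  CONCAT(',', 'id123,id456', ',') LIKE CONCAT('%,', 'id123', ',%') AS delimited_full   -- true
```

Applied to the join, the condition would read on concat(',', Table2.ID, ',') like concat('%,', Table1.ID, ',%'). This assumes the list has no spaces around the commas and that the IDs contain no '%' or '_' characters, since like treats both as wildcards. It remains a pattern match on every pair of rows, so the performance warning above still applies.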