Querying table views in BigQuery using wildcards with _TABLE_SUFFIX
I am trying to query a large number (~140) of different table views in Google BigQuery using _TABLE_SUFFIX, but this results in the following error message:
"Views cannot be queried through prefix."
Currently I am using this code:
SELECT
tableDate,
`TableA.20*`.ip AS IP,
`TableB.20*`.city AS city,
....
CAST(s.banner AS string) AS sourcecode,
FROM
`TableA.20*`
CROSS JOIN UNNEST(services) AS s
FULL OUTER JOIN `TableB.20*` USING(ip)
WHERE
_TABLE_SUFFIX IN (SELECT table_date FROM `datasetX.dates_table` AS tableDate)
AND
REGEXP_MATCH(cast(s.banner AS string), r'(?i) .....
Structure of "dates_table":
table_date
----------
190305
190312
190319
190326
...
[weekly dates]
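For illustration, suffixes in this format could be generated rather than typed by hand. This is only a sketch: it assumes the values are YYMMDD dates (so 190305 is read as 5 March 2019) and that the seven-day spacing of the four rows shown holds for the whole list. The function name `weekly_suffixes` is made up for this example.

```python
from datetime import date, timedelta


def weekly_suffixes(start, weeks):
    """Return `weeks` suffixes in YYMMDD form, spaced seven days apart."""
    return [(start + timedelta(weeks=i)).strftime("%y%m%d") for i in range(weeks)]


if __name__ == "__main__":
    # Assumed start date: 5 March 2019, the first row of dates_table.
    print(weekly_suffixes(date(2019, 3, 5), 4))
    # prints ['190305', '190312', '190319', '190326']
```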
The original data set looks like this:
As I read in the BigQuery documentation, wildcard tables are only available in standard SQL, and they cannot be used to query views. My simple question is: what would be an alternative way to query data from many different views? Is there another way to loop through views using wildcards?
Possible solutions that do not work for me:
The solutions suggested here are unfortunately not possible in my case. I cannot change the data set, as it comes from an external provider. Trying to expose the _TABLE_SUFFIX column, as suggested here, also does not work in my case. Using UNION ALL, as suggested here, is not practical with 140 tables.
I would also be very happy to have a solution that uses BigQuery standard SQL, so that I can use e.g. REGEXP_CONTAINS.
Any ideas? That would be great. Thanks a lot.
Frank
Only (suitable) solution
As there is no way to iterate over table views (for whatever technical reason), the view names have to be hard-coded. I therefore suggest using UNION ALL with hard-coded view names. This means much longer code and no automated/iterative process, but at least it works. ;-)
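-- Hard-coded view names: one SELECT per weekly view, combined with UNION ALL.
SELECT *
FROM (
  SELECT
    '190305' AS tableDate,  -- date literal, since no wildcard suffix is available
    a.ip AS IP,
    b.city AS city,
    CAST(s.banner AS STRING) AS sourcecode
  FROM
    `TableA.20190305` AS a
  CROSS JOIN UNNEST(a.services) AS s
  FULL OUTER JOIN `TableB.20190305` AS b USING (ip)
  UNION ALL
  SELECT
    '190312' AS tableDate,
    a.ip AS IP,
    b.city AS city,
    CAST(s.banner AS STRING) AS sourcecode
  FROM
    `TableA.20190312` AS a
  CROSS JOIN UNNEST(a.services) AS s
  FULL OUTER JOIN `TableB.20190312` AS b USING (ip)
  UNION ALL
  <all further tables>
)
-- The filter is applied once to the combined result.
-- REGEXP_CONTAINS is the standard SQL counterpart of legacy REGEXP_MATCH.
WHERE
  REGEXP_CONTAINS(sourcecode, r'(?i) .....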
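Typing out ~140 SELECT blocks by hand is error-prone, so the UNION ALL text could be generated by a small script and pasted into the query. The following is a minimal sketch, not a tested BigQuery integration. It assumes that every weekly view exists in both `TableA` and `TableB` under the name `20` plus the six-digit suffix, that each SELECT reads from the dated view itself, and that the date is emitted as a literal. `SELECT_TEMPLATE` and `build_union_query` are hypothetical names, and the filter with the (elided) regular expression still has to be added around the result.

```python
import re

# Hypothetical per-view SELECT; table names follow the `TableA.20<suffix>` pattern.
SELECT_TEMPLATE = """SELECT
  '{suffix}' AS tableDate,
  a.ip AS IP,
  b.city AS city,
  CAST(s.banner AS STRING) AS sourcecode
FROM `TableA.20{suffix}` AS a
CROSS JOIN UNNEST(a.services) AS s
FULL OUTER JOIN `TableB.20{suffix}` AS b USING (ip)"""


def build_union_query(suffixes):
    """Return one SELECT per suffix, joined with UNION ALL."""
    if not suffixes:
        raise ValueError("at least one suffix is required")
    parts = []
    for suffix in suffixes:
        # Accept only six-digit suffixes such as 190305, because the
        # value is pasted directly into the SQL text.
        if not re.fullmatch(r"[0-9]{6}", suffix):
            raise ValueError(f"unexpected suffix: {suffix!r}")
        parts.append(SELECT_TEMPLATE.format(suffix=suffix))
    return "\nUNION ALL\n".join(parts)


if __name__ == "__main__":
    print(build_union_query(["190305", "190312"]))
```

The suffix list can be copied from dates_table. The check on the suffix format is there because the values end up inside the query text rather than being passed as parameters.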