BigQuery loop to select values from dynamic table_names registered in another table

I'm looking for a way to extract data from multiple tables and insert it into another table automatically by running a single script. I need to query many tables, so I want a loop that selects from those tables dynamically by name.

I wonder if I could keep a table of table names and run a loop like this:

foreach(i in table_names)
    insert into aggregated_table select * from table_names[i]
end

Below is for BigQuery Standard SQL:

#standardSQL
SELECT * FROM `project.dataset1.*`
WHERE _TABLE_SUFFIX IN (SELECT table_name FROM `project.dataset2.list`)

This approach works if the following conditions are met:

  • all tables to be processed from the list have exactly the same schema
  • one of those tables is the most recent table; it defines the schema that is used for all the other tables in the list
  • to satisfy the previous point, the list table should ideally be hosted in another dataset, so that the wildcard does not match it

Obviously, you can add INSERT INTO ... in front of the query to insert the result into whatever destination table you need.

Please note: filters on _TABLE_SUFFIX that include subqueries cannot be used to limit the number of tables scanned for a wildcard table, so make sure you are using the longest possible prefix, for example:

#standardSQL
SELECT * FROM `project.dataset1.source_table_*`
WHERE _TABLE_SUFFIX IN (SELECT table_name FROM `project.dataset2.list`)   

So, again: even though you select data only from specific tables (those listed in project.dataset2.list), you are billed for scanning all tables that match the project.dataset1.source_table_* wildcard.

While the above is purely BigQuery SQL, you can also use any client of your choice to script exactly the logic you need: read the table names from the list table, then select and insert in a loop. I think this option is the simplest and most optimal.
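A minimal Python sketch of that client-side loop might look like the following. It is an illustration under stated assumptions, not a definitive implementation: the helper names (build_insert_statements, copy_tables), the run_query callable, and the destination table name are all hypothetical, and table names are deliberately restricted to letters, digits, and underscores so they can be placed safely inside backticks.

```python
import re
from typing import Callable, Iterable, List

# Assumption: table names in the list use only letters, digits, and underscores.
# This is stricter than what BigQuery allows; it is here to keep the generated
# SQL safe when names are interpolated into backtick-quoted identifiers.
_SAFE_TABLE_NAME = re.compile(r"[A-Za-z0-9_]+")


def build_insert_statements(
    table_names: Iterable[str], source_dataset: str, destination: str
) -> List[str]:
    """Return one INSERT ... SELECT statement per source table."""
    statements = []
    for name in table_names:
        if not _SAFE_TABLE_NAME.fullmatch(name):
            raise ValueError(f"unsafe table name: {name!r}")
        statements.append(
            f"INSERT INTO `{destination}` "
            f"SELECT * FROM `{source_dataset}.{name}`"
        )
    return statements


def copy_tables(
    run_query: Callable[[str], object],
    table_names: Iterable[str],
    source_dataset: str,
    destination: str,
) -> int:
    """Validate every name first, then run the statements one by one.

    run_query is any callable that executes a SQL string, so the loop
    itself does not depend on a particular client library.
    """
    statements = build_insert_statements(table_names, source_dataset, destination)
    for sql in statements:
        run_query(sql)
    return len(statements)
```

With the google-cloud-bigquery package, run_query could be something like `lambda sql: client.query(sql).result()`, and the table names could be read first with a query such as SELECT table_name FROM `project.dataset2.list`. Because each statement reads exactly one table, only the listed tables are scanned, which avoids the wildcard cost issue described above.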
