BigQuery create temp table

I am trying to create a temporary table with a specific name so I can use it in other queries. The problem is that when I create a temp table like this:

BEGIN

  BEGIN TRANSACTION;
   --delete test_temp where 1=1;
   create temp table test_temp as SELECT * FROM `analytics.reports.analysis` ;

  COMMIT TRANSACTION;

END;

The table name is indeed test_temp, but if I try to select * from test_temp I get nothing. The table ID always gets a new unique ID, something like analytics-prod:_script3991beb3e9868774854ac09d407f1c397428a4f6.test_temp

Is there a way to create this temp table with a pre-defined name?

What I am basically trying to achieve is to create a table from a select every 15 minutes. I thought I would create a scheduled query running this transaction that deletes and repopulates the temp table, so that I can always reference test_temp without the table ID changing.

According to the docs:

You can refer to a temporary table by name for the duration of the current multi-statement query.

So if you want to refer to the table in other queries, one option is to use a permanent table in a separate dataset:

...
CREATE OR REPLACE TABLE `analytics.my_temp_dataset.test_temp` AS SELECT * FROM `analytics.reports.analysis` ;
...

In that case, you don't need a temp table but a permanent table that you simply replace on the next run using the CREATE OR REPLACE TABLE statement. If you are using the temp table as part of a scripting stage, then I suggest running this instead:

BEGIN
      
  CREATE OR REPLACE TEMP TABLE _SESSION.test_temp AS
  SELECT * FROM `analytics.reports.analysis`;

  -- SELECT COUNT(*) CNT FROM _SESSION.test_temp;

END;

Since it is a temp table, you don't need a DROP statement at the end.

You can create a temporary table in a session in BigQuery.

You can create a session via the API and then create your temporary tables in the session. The tables will then be visible to all queries using that session.
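A minimal sketch of what creating a session via the API can look like, assuming the REST `jobs.query` method: the first request asks for a new session, and later requests attach to it by session ID. The field names (`createSession`, `connectionProperties`, `sessionInfo.sessionId`) follow the BigQuery REST reference as best understood here, so check them against the current documentation. The helper names and the sample session ID are hypothetical, and the snippet only builds request bodies; it does not send anything.

```python
import json


def open_session_body(sql):
    """Body for a jobs.query call that starts a new session with its first statement."""
    return {"query": sql, "useLegacySql": False, "createSession": True}


def in_session_body(sql, session_id):
    """Body for a jobs.query call that runs inside an existing session."""
    return {
        "query": sql,
        "useLegacySql": False,
        "connectionProperties": [{"key": "session_id", "value": session_id}],
    }


def session_id_from(response):
    """Read the session ID from a parsed jobs.query response."""
    return response["sessionInfo"]["sessionId"]


if __name__ == "__main__":
    first = open_session_body(
        "CREATE TEMP TABLE test_temp AS SELECT * FROM `analytics.reports.analysis`"
    )
    # Hypothetical response fragment; a real one comes back from the API.
    sample_response = {"sessionInfo": {"sessionId": "EXAMPLE_SESSION_ID"}}
    later = in_session_body(
        "SELECT COUNT(*) FROM test_temp", session_id_from(sample_response)
    )
    print(json.dumps(first, indent=2))
    print(json.dumps(later, indent=2))
```

Any query sent with the same `session_id` connection property can then refer to test_temp by its plain name.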

There are limitations on sessions you should be aware of:

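  • They do not last indefinitely: a session is terminated automatically after 24 hours of inactivity (and, per the documentation, after 7 days at most)
  • Jobs in a session cannot run concurrently; they run one at a time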

I use temporary tables with sessions to do complex transformations in an ETL pipeline. For example:

  1. Create session
  2. Create temporary tables
  3. Do joins and other transformations, write results to temporary tables
  4. Do final transformations and output results to permanent tables
  5. End session
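The steps above can be sketched roughly as follows. This is not the Python API mentioned in this answer, only an illustration under stated assumptions: `run` is a caller-supplied function that executes one statement and returns the ID of the session it ran in (a `session_id` of `None` meaning "start a new session"), and the table names `stage_raw`, `stage_counts` and `analysis_summary` are hypothetical.

```python
from typing import Callable, List, Optional

# A runner takes (sql, session_id) and returns the session ID the job ran in.
# Passing session_id=None is taken to mean "start a new session".
Runner = Callable[[str, Optional[str]], str]

# Table names other than `analytics.reports.analysis` are hypothetical.
STEPS: List[str] = [
    # 2. Create temporary tables
    "CREATE TEMP TABLE stage_raw AS SELECT * FROM `analytics.reports.analysis`",
    # 3. Transform, writing the intermediate result to another temp table
    "CREATE TEMP TABLE stage_counts AS SELECT COUNT(*) AS row_count FROM stage_raw",
    # 4. Write the final result to a permanent table
    "CREATE OR REPLACE TABLE `analytics.my_temp_dataset.analysis_summary` AS "
    "SELECT * FROM stage_counts",
]


def run_pipeline(run: Runner, steps: List[str] = STEPS) -> Optional[str]:
    """Run the steps one at a time in a single session, then end the session."""
    session_id: Optional[str] = None
    try:
        for sql in steps:
            # 1. The first call (session_id is None) creates the session;
            # later calls reuse it. Calls are sequential on purpose, since
            # a session does not run jobs concurrently.
            session_id = run(sql, session_id)
    finally:
        if session_id is not None:
            # 5. End the session so its temp tables are dropped right away.
            run("CALL BQ.ABORT_SESSION();", session_id)
    return session_id
```

Keeping the runner as a parameter makes the ordering easy to test with a stub. With the google-cloud-bigquery client, a real runner would presumably set `create_session=True` on the `QueryJobConfig` for the first call and pass a `session_id` connection property on the later ones; that wiring is left out here.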

I am writing a Python API for this and will put it on GitHub at some point. I could probably post a gist if that is helpful.

In the meantime, see Google's documentation for sessions in BigQuery:

https://cloud.google.com/bigquery/docs/sessions-intro


 