Airflow: BigQuery SQL inserts empty data to the table

Using Airflow, I am trying to read data from one BigQuery table and insert it into another. I have 5 origin tables and 5 destination tables. My SQL query and Python logic work for 4 of the tables, where the data is fetched and inserted into the respective destination tables, but they do not work for 1 table.

import logging
import time

from google.cloud import bigquery

bigquery_client = bigquery.Client()

query = '''SELECT * EXCEPT(eventdate) FROM `gcp_project.gcp_dataset.gcp_table_1`
    WHERE id = "1234"
    AND eventdate = "2023-01-18"
'''

# Delete the previous destination table if it exists
bigquery_client.delete_table("gcp_project.gcp_dataset.dest_gcp_table_1", not_found_ok=True)

job_config = bigquery.QueryJobConfig()

table_ref = bigquery_client.dataset(gcp_dataset).table(dest_gcp_table_1)
job_config.destination = table_ref
job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE

# Start the query, passing in the extra configuration.
query_job = bigquery_client.query(query=query,
      location='US',
      job_config=job_config
     )

# Check whether the table has been written successfully
while not query_job.done():
     time.sleep(1)
logging.info("Data is written into a destination table with {} number of rows for id {}."
             .format(query_job.result().total_rows, id))

I have even tried the SQL query with CREATE OR REPLACE, but the result was the same: table_1 comes out empty. I have also tried BigQueryInsertJobOperator, but table_1 still comes out empty.
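For reference, this is roughly how the same copy could be expressed with `BigQueryInsertJobOperator`; this is a sketch, not the exact DAG I used, and the task id is hypothetical:

```python
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

# Hypothetical task; project/dataset/table ids mirror the question's examples.
copy_table_1 = BigQueryInsertJobOperator(
    task_id="copy_table_1",
    configuration={
        "query": {
            "query": (
                'SELECT * EXCEPT(eventdate) '
                'FROM `gcp_project.gcp_dataset.gcp_table_1` '
                'WHERE id = "1234" AND eventdate = "2023-01-18"'
            ),
            "useLegacySql": False,
            "destinationTable": {
                "projectId": "gcp_project",
                "datasetId": "gcp_dataset",
                "tableId": "dest_gcp_table_1",
            },
            "writeDisposition": "WRITE_TRUNCATE",
        }
    },
    location="US",
)
```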

  • Note: table_1 holds about 270 MB across 1,463,306 rows; it is also the largest of all the tables being copied.

I tried executing the same logic from my local machine and it works fine for table_1 as well; I can see the data in GCP BigQuery.

I am not sure what is happening behind the scenes. Does anyone have an idea why this happens or what could cause it?

Found the root cause: the previous query, which populates the origin table, was still running in the GCP BigQuery backend. Because of that, the query above did not get any data.

Solution: introduced query_job.result() on the populating query. This blocks until that job completes, and only then does the next query execute.
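The fix is a synchronization pattern rather than anything BigQuery-specific: the job that fills the origin table must finish before the dependent SELECT runs. A minimal self-contained sketch of the pattern, where `FakeJob` is a stand-in for a `QueryJob` (not the real client API):

```python
import time


class FakeJob:
    """Stand-in for a BigQuery QueryJob: done() polls, result() blocks."""

    def __init__(self, duration):
        # Pretend the job needs `duration` seconds to finish.
        self._finish_at = time.monotonic() + duration

    def done(self):
        return time.monotonic() >= self._finish_at

    def result(self):
        # Like QueryJob.result(): block until the job has completed.
        while not self.done():
            time.sleep(0.01)
        return "rows"


populate_job = FakeJob(duration=0.05)  # the query filling the origin table

# Without this call, a dependent query could start while populate_job is
# still running and read an empty origin table.
populate_job.result()  # wait for the populating job to finish
assert populate_job.done()  # now it is safe to run the dependent query
```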

