
Combining an external temp table from Cloud Storage with a pre-existing BigQuery table - append from Python

I have a permanent table in BigQuery that I want to append to with data coming from a CSV in Google Cloud Storage. I first read the CSV file into a BigQuery temp table:

# Assumes: from google.cloud import bigquery; bq_client = bigquery.Client();
# and `schema` is a list of bigquery.SchemaField objects defined earlier.
table_id = "incremental_custs"
external_config = bigquery.ExternalConfig("CSV")
external_config.source_uris = [
    "gs://location/to/csv/customers_5083983446185_test.csv"
]
external_config.schema = schema
external_config.options.skip_leading_rows = 1
job_config = bigquery.QueryJobConfig(table_definitions={table_id: external_config})
sql_test = "SELECT * FROM `{table_id}`;".format(table_id=table_id)
query_job = bq_client.query(sql_test, job_config=job_config)
customer_updates = query_job.result()
print(customer_updates.total_rows)

Up until here everything works and I retrieve the records from the temp table. The issue arises when I then try to combine it with the permanent table:

sql = """
create table `{project_id}.{dataset}.{table_new}` as (
      select customer_id, email, accepts_marketing, first_name, last_name,phone,updated_at,orders_count,state,
              total_spent,last_order_name,tags,ll_email,points_approved,points_spent,guest,enrolled_at,ll_updated_at,referral_id,
              referred_by,referral_url,loyalty_tier_membership,insights_segment,rewards_claimed
              from (
                select * from `{project_id}.{dataset}.{old_table}`
                union all 
                select * from `{table_id}`
                ORDER BY customer_id, orders_count  DESC
                ))
                order by orders_count desc 
""".format(project_id=project_id, dataset=dataset_id, table_new=table_new, old_table=old_table, table_id=table_id)
query_job = bq_client.query(sql)
query_result = query_job.result()

I get the following error:

BadRequest: 400 Table name "incremental_custs" missing dataset while no default dataset is set in the request.

Am I missing something here? Thanks!

Arf, you forgot the external config! You don't pass it in your second script:

query_job = bq_client.query(sql)

Simply pass it like in the first one (with the second query string, `sql`):

query_job = bq_client.query(sql, job_config=job_config)

A fresh look is always easier!
