Combining an external temp table from Cloud Storage with a pre-existing BigQuery table: append from Python
I have a permanent table in BigQuery that I want to append to, with data coming from a CSV in Google Cloud Storage. I first read the CSV file into a BigQuery temp table:
from google.cloud import bigquery

bq_client = bigquery.Client()

table_id = "incremental_custs"
external_config = bigquery.ExternalConfig("CSV")
external_config.source_uris = [
    "gs://location/to/csv/customers_5083983446185_test.csv"
]
external_config.schema = schema  # schema defined earlier
external_config.options.skip_leading_rows = 1
job_config = bigquery.QueryJobConfig(table_definitions={table_id: external_config})
sql_test = "SELECT * FROM `{table_id}`;".format(table_id=table_id)
query_job = bq_client.query(sql_test,job_config=job_config)
customer_updates = query_job.result()
print(customer_updates.total_rows)
Up until here all works and I retrieve the records from the temp table. The issue arises when I try to then combine it with a permanent table:
sql = """
create table `{project_id}.{dataset}.{table_new}` as (
select customer_id, email, accepts_marketing, first_name, last_name,phone,updated_at,orders_count,state,
total_spent,last_order_name,tags,ll_email,points_approved,points_spent,guest,enrolled_at,ll_updated_at,referral_id,
referred_by,referral_url,loyalty_tier_membership,insights_segment,rewards_claimed
from (
select * from `{project_id}.{dataset}.{old_table}`
union all
select * from `{table_id}`
ORDER BY customer_id, orders_count DESC
))
order by orders_count desc
""".format(project_id=project_id, dataset=dataset_id, table_new=table_new, old_table=old_table, table_id=table_id)
query_job = bq_client.query(sql)
query_result = query_job.result()
I get the following error:
BadRequest: 400 Table name "incremental_custs" missing dataset while no default dataset is set in the request.
Am I missing something here? Thanks!
Arf, you forgot the external config! You don't pass it in your second script:
query_job = bq_client.query(sql)
Simply update it like in the first one, passing the job config along with the second query:

query_job = bq_client.query(sql, job_config=job_config)
A fresh look is always easier!
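The error message itself explains the mechanism: `incremental_custs` carries no project or dataset qualifier, so BigQuery can only resolve it through the `table_definitions` mapping supplied in the job config; without that config, the bare name needs a default dataset, which the request does not set. A minimal sketch of the statement construction showing the difference (the project, dataset, and table names below are hypothetical placeholders, not the asker's real ones):

```python
# Hypothetical names standing in for the question's variables.
project_id, dataset_id = "my-project", "my_dataset"
table_new, old_table = "custs_merged", "custs"
table_id = "incremental_custs"  # external temp table: deliberately unqualified

sql = """
create table `{project_id}.{dataset}.{table_new}` as (
  select * from `{project_id}.{dataset}.{old_table}`
  union all
  select * from `{table_id}`
)
""".format(project_id=project_id, dataset=dataset_id,
           table_new=table_new, old_table=old_table, table_id=table_id)

# The permanent tables are fully qualified, but the external table is bare,
# so BigQuery must resolve it via job_config's table_definitions mapping.
print("`incremental_custs`" in sql)  # → True
```

Because `table_definitions` only lives for the duration of a single query job, it has to be passed again with every query that references the external table.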