
Combining an external temp table from Cloud Storage with a pre-existing BigQuery table - append from Python

I have a permanent table in BigQuery that I want to append to with data coming from a CSV in Google Cloud Storage. I first read the CSV file into a BigQuery temp table:

# Assumes: from google.cloud import bigquery; bq_client = bigquery.Client();
# and `schema` is a list of bigquery.SchemaField objects defined earlier.
table_id = "incremental_custs"
external_config = bigquery.ExternalConfig("CSV")
external_config.source_uris = [
    "gs://location/to/csv/customers_5083983446185_test.csv"
]
external_config.schema = schema
external_config.options.skip_leading_rows = 1
job_config = bigquery.QueryJobConfig(table_definitions={table_id: external_config})
sql_test = "SELECT * FROM `{table_id}`;".format(table_id=table_id)
query_job = bq_client.query(sql_test, job_config=job_config)
customer_updates = query_job.result()
print(customer_updates.total_rows)

Up until here everything works and I retrieve the records from the temp table. The issue arises when I then try to combine it with the permanent table:

sql = """
create table `{project_id}.{dataset}.{table_new}` as (
      select customer_id, email, accepts_marketing, first_name, last_name,phone,updated_at,orders_count,state,
              total_spent,last_order_name,tags,ll_email,points_approved,points_spent,guest,enrolled_at,ll_updated_at,referral_id,
              referred_by,referral_url,loyalty_tier_membership,insights_segment,rewards_claimed
              from (
                select * from `{project_id}.{dataset}.{old_table}`
                union all 
                select * from `{table_id}`
                ORDER BY customer_id, orders_count  DESC
                ))
                order by orders_count desc 
""".format(project_id=project_id, dataset=dataset_id, table_new=table_new, old_table=old_table, table_id=table_id)
query_job = bq_client.query(sql)
query_result = query_job.result()

I get the following error:

BadRequest: 400 Table name "incremental_custs" missing dataset while no default dataset is set in the request.

Am I missing something here? Thanks!

Arf, you forgot the external config! You don't pass it in your second script:

query_job = bq_client.query(sql)

Simply pass it like in the first one (with the second query string, `sql`):

query_job = bq_client.query(sql, job_config=job_config)

A fresh look is always easier!
