Replace BigQuery partitions with data staged in a BigQuery table using Python

I have 30 days of data staged in a daily-partitioned table in BigQuery. I have a larger table with 5 years of data, also partitioned daily. I need to select from the staging table and replace the entire contents of the existing partitions in the larger table for the 30 days that are in my staging table. I would prefer to do this in Python, without first extracting the data to CSV and loading it back into BigQuery, if I can avoid that. Any suggestions? Thanks in advance.

All you need to do is query what you need and set a destination table for your query, with a write disposition that overwrites the destination.

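# Written against an older google-cloud-bigquery release; run_async_query
# is not available in current versions of the client library.
from google.cloud import bigquery

client = bigquery.Client()
query = """\
SELECT firstname + ' ' + last_name AS full_name,
       FLOOR(DATEDIFF(CURRENT_DATE(), birth_date) / 365) AS age
  FROM dataset_name.persons
"""
dataset = client.dataset('dataset_name')
table = dataset.table(name='person_ages')
job = client.run_async_query('fullname-age-query-job', query)
job.destination = table  # write the query results into this table
job.write_disposition = 'WRITE_TRUNCATE'  # overwrite the destination's existing contents
job.begin()  # start the job; it runs asynchronously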
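
For readers on a newer google-cloud-bigquery release, where `run_async_query` no longer exists, the same idea (query the staged rows, write them to a destination, truncate what was there) might look roughly like the sketch below. It is not the answerer's code. The function names and the `project.dataset.table` strings are hypothetical, and the `_PARTITIONTIME` filter assumes an ingestion-time partitioned staging table. The destination uses the `table$YYYYMMDD` partition decorator so that only that one day is overwritten.

```python
from datetime import date


def partition_decorator(table_id: str, day: date) -> str:
    """Return 'table$YYYYMMDD', the decorator addressing one daily partition."""
    return f"{table_id}${day:%Y%m%d}"


def replace_partition_from_stage(client, stage_table: str, target_table: str, day: date):
    """Overwrite one daily partition of target_table with that day's staged rows.

    Both table arguments are hypothetical 'project.dataset.table' strings;
    client is expected to be a google.cloud.bigquery.Client.
    """
    # Imported here so partition_decorator stays usable without the library.
    from google.cloud import bigquery

    job_config = bigquery.QueryJobConfig(
        # Writing to 'table$YYYYMMDD' limits the truncate to that partition.
        destination=partition_decorator(target_table, day),
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
        query_parameters=[bigquery.ScalarQueryParameter("day", "DATE", day)],
    )
    # Assumption: the staging table is ingestion-time partitioned. A table
    # partitioned on a column would filter on that column instead.
    sql = f"SELECT * FROM `{stage_table}` WHERE DATE(_PARTITIONTIME) = @day"
    job = client.query(sql, job_config=job_config)
    return job.result()  # blocks until the query job finishes
```

Calling `replace_partition_from_stage` once per staged day would cover the 30-day window from the question; each call runs a separate query job.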

The snippet using `run_async_query` did not actually work for me, but I do think it is correct, albeit for an older version of the BigQuery client library. Your answer did help tremendously and I will accept it. I am using the most up-to-date library. The following worked for me:

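# Assumes gbq is a bigquery.Client, stage_dataset and target_dataset are
# dataset references, and table_name is the table name used in both datasets.
for partition in gbq.list_partitions(stage_table_ref):
    # 'table$YYYYMMDD' addresses a single daily partition
    table_partition = table_name + '$' + partition
    stage_partition = stage_dataset.table(table_partition)
    target_partition = target_dataset.table(table_partition)
    job_config = bigquery.CopyJobConfig()
    # WRITE_TRUNCATE replaces the target partition's existing contents
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE
    gbq.copy_table(stage_partition, target_partition, job_config=job_config)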
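
The loop above starts the copy jobs but does not wait for them. A self-contained variant that also waits for every job to finish could be sketched as follows. The function names and the `project.dataset.table` strings are hypothetical, and skipping non-numeric partition ids (such as `__NULL__` or `__UNPARTITIONED__`) is an assumption about what should be copied, not something stated in the thread.

```python
def plan_partition_copies(partition_ids, stage_table: str, target_table: str):
    """Pair each staged daily partition with the same partition of the target.

    Non-numeric ids are skipped; this is an assumption that only dated
    partitions should be replaced.
    """
    return [
        (f"{stage_table}${pid}", f"{target_table}${pid}")
        for pid in partition_ids
        if pid.isdigit()
    ]


def copy_staged_partitions(client, stage_table: str, target_table: str) -> int:
    """Replace target partitions with their staged counterparts; return the count.

    client is expected to be a google.cloud.bigquery.Client, and both table
    arguments are hypothetical 'project.dataset.table' strings.
    """
    # Imported here so plan_partition_copies works without the library.
    from google.cloud import bigquery

    job_config = bigquery.CopyJobConfig(
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE
    )
    partition_ids = client.list_partitions(stage_table)
    jobs = [
        client.copy_table(source, destination, job_config=job_config)
        for source, destination in plan_partition_copies(
            partition_ids, stage_table, target_table
        )
    ]
    for job in jobs:
        job.result()  # wait for completion; raises if the copy job failed
    return len(jobs)
```

Submitting all copy jobs before waiting lets them run concurrently on the BigQuery side, which matters little for 30 partitions but avoids serializing the round trips.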

