
Add new rows automatically in Partitioned Table using BigQuery

I have a table in BigQuery that gets updated with new rows every day. I have created a new partitioned table, partitioned by date on a date column, to reduce execution time and cost. However, I also need the partitioned table to be updated automatically each day with the new data. How should this be implemented? I am new to BigQuery, so I need some help.

You can use the code below to load data into a column-based time-partitioned table.

from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of the table to create.
# table_id = "your-project.your_dataset.your_table_name"

job_config = bigquery.LoadJobConfig(
    schema=[
        bigquery.SchemaField("name", "STRING"),
        bigquery.SchemaField("post_abbr", "STRING"),
        bigquery.SchemaField("date", "DATE"),
    ],
    skip_leading_rows=1,
    time_partitioning=bigquery.TimePartitioning(
        type_=bigquery.TimePartitioningType.DAY,
        field="date",  # Name of the column to use for partitioning.
        expiration_ms=7776000000,  # 90 days.
    ),
)
uri = "gs://cloud-samples-data/bigquery/us-states/us-states-by-date.csv"

load_job = client.load_table_from_uri(
    uri, table_id, job_config=job_config
)  # Make an API request.

load_job.result()  # Wait for the job to complete.

table = client.get_table(table_id)
print("Loaded {} rows to table {}".format(table.num_rows, table_id))

For more information about partitioned tables, you can refer to this document.
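
If the new rows first land in another BigQuery table, another way to automate the daily refresh is a scheduled query that inserts them into the partitioned table. This is only a rough sketch using the BigQuery Data Transfer Service client (which requires the Data Transfer API to be enabled); the project, dataset, table names, and the staging-table layout are assumptions you would adapt.

from google.cloud import bigquery_datatransfer

transfer_client = bigquery_datatransfer.DataTransferServiceClient()

# TODO(developer): replace with your project ID.
project_id = "your-project"

# Hypothetical query that copies yesterday's rows from a staging table
# into the date-partitioned table.
query_string = """
INSERT INTO `your-project.your_dataset.your_table_name` (name, post_abbr, date)
SELECT name, post_abbr, date
FROM `your-project.your_dataset.staging_table`
WHERE date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
"""

transfer_config = bigquery_datatransfer.TransferConfig(
    display_name="Daily partitioned-table refresh",
    data_source_id="scheduled_query",
    params={"query": query_string},
    schedule="every 24 hours",
)

transfer_config = transfer_client.create_transfer_config(
    parent=transfer_client.common_project_path(project_id),
    transfer_config=transfer_config,
)

print("Created scheduled query: {}".format(transfer_config.name))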
