Google BigQuery data transfer - dataset copy
I have a BigQuery dataset, project-everest:evr_dataset, in my Google project, and I want to copy its table data into another BigQuery dataset that sits in a different project, project-alps:alp_dataset.
I attempted to use DTS (Data Transfer Service) to schedule the ingest job on a daily basis, but I don't see any option to choose a destination dataset in another project. Can anyone enlighten me on how to enable inter-project DTS?
You can use the Python function below to create a BigQuery Data Transfer client and copy datasets from one project to another by specifying the source and target project IDs. You can also schedule the data transfer; in the function below the schedule is set to every 24 hours (daily).
def copy_dataset(override_values={}):
    # [START bigquerydatatransfer_copy_dataset]
    from google.cloud import bigquery_datatransfer

    transfer_client = bigquery_datatransfer.DataTransferServiceClient()

    destination_project_id = "my-destination-project"
    destination_dataset_id = "my_destination_dataset"
    source_project_id = "my-source-project"
    source_dataset_id = "my_source_dataset"
    # [END bigquerydatatransfer_copy_dataset]

    # To facilitate testing, we replace values with alternatives
    # provided by the testing harness.
    destination_project_id = override_values.get(
        "destination_project_id", destination_project_id
    )
    destination_dataset_id = override_values.get(
        "destination_dataset_id", destination_dataset_id
    )
    source_project_id = override_values.get("source_project_id", source_project_id)
    source_dataset_id = override_values.get("source_dataset_id", source_dataset_id)

    # [START bigquerydatatransfer_copy_dataset]
    transfer_config = bigquery_datatransfer.TransferConfig(
        destination_dataset_id=destination_dataset_id,
        display_name="Your Dataset Copy Name",
        data_source_id="cross_region_copy",
        params={
            "source_project_id": source_project_id,
            "source_dataset_id": source_dataset_id,
        },
        schedule="every 24 hours",
    )
    transfer_config = transfer_client.create_transfer_config(
        parent=transfer_client.common_project_path(destination_project_id),
        transfer_config=transfer_config,
    )
    print(f"Created transfer config: {transfer_config.name}")
    # [END bigquerydatatransfer_copy_dataset]
    return transfer_config
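As a sketch for your specific setup, the override values would carry the project and dataset names from the question (these IDs come from your post; swap in your own if they differ):

```python
# Values taken from the question; replace with your own project/dataset IDs.
override_values = {
    "destination_project_id": "project-alps",
    "destination_dataset_id": "alp_dataset",
    "source_project_id": "project-everest",
    "source_dataset_id": "evr_dataset",
}

# Calling copy_dataset(override_values) would then create a daily transfer
# config in project-alps that copies evr_dataset from project-everest.
```

Note that the config is created in the destination project (that is what `common_project_path(destination_project_id)` sets as the parent), so the account running this needs permission to create transfer configs there, plus read access to the source dataset in the other project.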