Data Transfer between two BigQuery Projects using Google Cloud Functions

I have created two projects on Google Cloud Platform, say project1 and project2. Project1 has a BigQuery dataset named dataset1, which contains a table named table1 with some contents. Project2 has a BigQuery dataset named dataset2, which contains a table named table2 that is initially empty. I need Python code, running as a Google Cloud Function, that copies table1 into the empty table2.

  1. Understand how to use Python to send a query to BigQuery, following the documentation (a minimal sketch is shown after the query below).

  2. Assuming table2 has exactly the same schema as table1, the query to copy table1 into table2 is:

INSERT INTO project2.dataset2.table2 
SELECT * FROM project1.dataset1.table1;
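
As a minimal sketch of steps 1 and 2 combined (assuming the client's credentials have read access to project1 and write access to project2; the project, dataset, and table names come from the question):

from google.cloud import bigquery

# Run the job from project2; the caller needs permission to create jobs there,
# plus read access to project1.dataset1.table1.
client = bigquery.Client(project="project2")

query = """
    INSERT INTO `project2.dataset2.table2`
    SELECT * FROM `project1.dataset1.table1`
"""

job = client.query(query)  # API request: starts the query job
job.result()               # waits for the DML statement to finish
print(f"Inserted {job.num_dml_affected_rows} rows")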

You can find Python code to copy a table in the documentation. The code is:

from google.cloud import bigquery

client = bigquery.Client()

# Source: the public Shakespeare sample table.
source_dataset = client.dataset("samples", project="bigquery-public-data")
source_table_ref = source_dataset.table("shakespeare")

dataset_id = "my_dataset"  # replace with your destination dataset ID
dest_table_ref = client.dataset(dataset_id).table("destination_table")

job = client.copy_table(
    source_table_ref,
    dest_table_ref,
    # Location must match that of the source and destination tables.
    location="US",
)  # API request

job.result()  # Waits for job to complete.

assert job.state == "DONE"
dest_table = client.get_table(dest_table_ref)  # API request
assert dest_table.num_rows > 0

There's another answer to that question showing that you can do it with INSERT INTO ... SELECT *, but that operation incurs the cost of a full table scan, whereas a table copy is free.

(I normally use CREATE TABLE or INSERT INTO anyway, because they are more convenient, though.)
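
Since the question asks to run this in Google Cloud Functions, below is a minimal sketch of wrapping the table copy in an HTTP-triggered function. The entry point name copy_table, the WRITE_TRUNCATE disposition, and the hard-coded table names are assumptions, not part of the original answer; google-cloud-bigquery must be listed in the function's requirements.txt.

from google.cloud import bigquery


def copy_table(request):
    """HTTP-triggered Cloud Function: copies project1.dataset1.table1
    into project2.dataset2.table2."""
    client = bigquery.Client(project="project2")

    source_table_ref = client.dataset("dataset1", project="project1").table("table1")
    dest_table_ref = client.dataset("dataset2").table("table2")

    job = client.copy_table(
        source_table_ref,
        dest_table_ref,
        # Assumption: overwrite table2; both tables must share a location.
        job_config=bigquery.CopyJobConfig(write_disposition="WRITE_TRUNCATE"),
    )
    job.result()  # waits for the copy job to complete

    return "Copied table1 to table2"

Deploying with something like gcloud functions deploy copy_table --runtime python39 --trigger-http then gives you an HTTPS endpoint that performs the copy on demand.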
