
Is there any way we can load Bigtable data into BigQuery?

I want to load Bigtable data into BigQuery directly.

Until now I have been exporting the Bigtable data to a CSV file using Python and then loading that CSV file into BigQuery.

But I don't want to use a CSV file between Bigtable and BigQuery. Is there a direct way?

To add to Mikhail's recommendation, I'd suggest creating a permanent table in BigQuery using the external table. You define the schema for the columns you want and then query the rows you're interested in. Once that data is saved into BigQuery, querying it won't have any impact on your Bigtable performance. If you want to get the latest data, you can create a new permanent table with the same query.
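As a rough sketch of what that could look like: the Python below builds a table definition for a Bigtable-backed external table (the JSON shape that `bq mk --external_table_definition=...` accepts) and the SQL that copies the selected columns into a native BigQuery table. The project, instance, dataset, table, column family, and qualifier names are all made-up placeholders, and the `family.qualifier.cell.value` column path assumes `onlyReadLatest` is enabled, so check it against your own schema before relying on it.

```python
import json


def bigtable_external_table_definition(project, instance, table, family, qualifiers):
    """Build an external table definition that points BigQuery at a Bigtable table."""
    return {
        "sourceFormat": "BIGTABLE",
        "sourceUris": [
            "https://googleapis.com/bigtable/projects/"
            f"{project}/instances/{instance}/tables/{table}"
        ],
        "bigtableOptions": {
            # Expose the row key as a STRING column named "rowkey".
            "readRowkeyAsString": True,
            "columnFamilies": [
                {
                    "familyId": family,
                    # Only the newest cell version of each column is exposed.
                    "onlyReadLatest": True,
                    "columns": [
                        {"qualifierString": q, "type": "STRING"} for q in qualifiers
                    ],
                }
            ],
        },
    }


def snapshot_query(dataset, external_table, snapshot_table, family, qualifiers):
    """Build SQL that materializes the external table into a native BigQuery table."""
    columns = ",\n  ".join(f"{family}.{q}.cell.value AS {q}" for q in qualifiers)
    return (
        f"CREATE OR REPLACE TABLE `{dataset}.{snapshot_table}` AS\n"
        f"SELECT\n  rowkey,\n  {columns}\n"
        f"FROM `{dataset}.{external_table}`"
    )


if __name__ == "__main__":
    # All identifiers below are hypothetical placeholders.
    definition = bigtable_external_table_definition(
        "my-project", "my-instance", "events", "stats", ["clicks", "views"]
    )
    print(json.dumps(definition, indent=2))
    print(snapshot_query("my_dataset", "events_external", "events_snapshot",
                         "stats", ["clicks", "views"]))
```

Re-running the generated `CREATE OR REPLACE TABLE` statement refreshes the snapshot. Only that statement reads from Bigtable; later queries against the snapshot do not.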

If you're looking to have the data copied over and stored in BigQuery, "Querying Cloud Bigtable data using permanent external tables" is not what you're looking for. The documentation explicitly mentions that "The data is not stored in the BigQuery table". My understanding is that the permanent external table is more for persistent access controls, but it still queries Bigtable directly.

This may be overkill, but you could set up an Apache Beam pipeline that runs in Dataflow, with a BigtableIO source and a BigQueryIO sink. You'd have to write a little bit of transformation logic, but overall it should be a pretty simple pipeline. The only catch here is that the BigtableIO connector is only for the Beam Java SDK, so you'd have to write this pipeline in Java.
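To give an idea of what the transformation step involves, here is a minimal, language-neutral sketch in plain Python (no Beam imports). It flattens one Bigtable row into a flat record of the kind a BigQuery sink expects. The input shape (`{family: {qualifier: [(timestamp_micros, value_bytes), ...]}}`), the `family_qualifier` field naming, and the UTF-8 decoding are all assumptions made for illustration. In a real pipeline the same logic would live inside a Java `DoFn`.

```python
def bigtable_row_to_bq_row(row_key, cells):
    """Flatten one Bigtable row into a flat dict, keeping the newest cell per column.

    row_key: bytes
    cells:   {family: {qualifier: [(timestamp_micros, value_bytes), ...]}}
             (a hypothetical in-memory shape, not a client-library type)
    """
    record = {"rowkey": row_key.decode("utf-8")}
    for family, columns in cells.items():
        for qualifier, versions in columns.items():
            if not versions:
                continue  # No cells stored for this column; skip the field.
            # Pick the cell version with the highest timestamp.
            _, latest_value = max(versions, key=lambda version: version[0])
            record[f"{family}_{qualifier}"] = latest_value.decode("utf-8")
    return record


if __name__ == "__main__":
    example = bigtable_row_to_bq_row(
        b"user#1",
        {"stats": {"clicks": [(1, b"3"), (2, b"7")], "views": [(5, b"40")]}},
    )
    print(example)
```

Values arrive as raw bytes, so any numeric columns would still need to be parsed into the types declared in the BigQuery schema.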

