Read data from BigQuery and write it to an Avro file on Cloud Storage

My objective is to read data from a BigQuery table and write it to an Avro file on Cloud Storage using Java. It would be helpful if someone could provide a code snippet or ideas for reading BigQuery table data and writing it out in Avro format using Cloud Dataflow.

BigQuery can export a table to GCS in Avro format as a one-time export job. This can be done through the client libraries, including the one for Java. Here is a snippet (the full example can be found on GitHub); in Java you can write:

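import com.google.cloud.RetryOption;
import com.google.cloud.bigquery.Job;
import org.threeten.bp.Duration; // assumption: the client version in use takes threeten Durations

// table is an existing com.google.cloud.bigquery.Table;
// format and gcsUrl are Strings (described below the snippet).
// Start an extract job that writes the table to Cloud Storage.
Job job = table.extract(format, gcsUrl);
// Wait for the job to complete
try {
  Job completedJob =
      job.waitFor(
          RetryOption.initialRetryDelay(Duration.ofSeconds(1)),
          RetryOption.totalTimeout(Duration.ofMinutes(3)));
  // waitFor returns null if the job no longer exists
  if (completedJob != null && completedJob.getStatus().getError() == null) {
    // Job completed successfully
  } else {
    // Handle error case
  }
} catch (InterruptedException e) {
  // Handle interrupted wait
}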

The format variable can be CSV, JSON or AVRO (JSON exports use the name NEWLINE_DELIMITED_JSON in the BigQuery job configuration); for this question it would be AVRO. The gcsUrl variable should contain the bucket and the path to the file, e.g. gs://my_bucket/filename
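
As a rough illustration of the URL shape described above, the sketch below builds a gs:// URL from a bucket and an object path, and checks that a given URL contains both parts. The GcsUrl class and its method names are hypothetical helpers made up for this example; they are not part of the BigQuery or Cloud Storage client libraries, and the checks are deliberately minimal (they do not enforce the full bucket naming rules).

```java
// Hypothetical helper for illustration only; not part of any Google client library.
final class GcsUrl {
    private static final String SCHEME = "gs://";

    private GcsUrl() {}

    /** Builds gs://bucket/path, dropping any leading slashes from the object path. */
    static String of(String bucket, String objectPath) {
        if (bucket == null || bucket.isBlank() || bucket.contains("/")) {
            throw new IllegalArgumentException("Invalid bucket name: " + bucket);
        }
        String path = objectPath == null ? "" : objectPath.replaceAll("^/+", "");
        if (path.isEmpty()) {
            throw new IllegalArgumentException("Object path must not be empty");
        }
        return SCHEME + bucket + "/" + path;
    }

    /** Returns true if the URL has the gs:// scheme, a bucket, and a non-empty path. */
    static boolean hasBucketAndPath(String url) {
        if (url == null || !url.startsWith(SCHEME)) {
            return false;
        }
        String rest = url.substring(SCHEME.length());
        int slash = rest.indexOf('/');
        // The bucket is the text before the first slash; the path is the text after it.
        return slash > 0 && slash < rest.length() - 1;
    }

    public static void main(String[] args) {
        String gcsUrl = GcsUrl.of("my_bucket", "filename");
        System.out.println(gcsUrl);                    // prints gs://my_bucket/filename
        System.out.println(hasBucketAndPath(gcsUrl));  // prints true
    }
}
```

The resulting string could then be passed as the gcsUrl argument to table.extract in the snippet above.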
