Read data from BigQuery and write it into Avro file format on Cloud Storage
My objective is to read data from a BigQuery table and write it to an Avro file on Cloud Storage using Java. It would be great if someone could provide a code snippet/ideas to read BigQuery table data and write it as Avro-format data using Cloud Dataflow.
It is possible to export data from BigQuery to GCS in Avro format as a one-time export, and this can be done through the client libraries, including Java. Here are some snippets (the full example can be found on GitHub); for Java you can write:
import com.google.cloud.RetryOption;
import com.google.cloud.bigquery.Job;
import org.threeten.bp.Duration;

// "table" is a com.google.cloud.bigquery.Table obtained from the BigQuery client
Job job = table.extract(format, gcsUrl);
// Wait for the job to complete
try {
  Job completedJob =
      job.waitFor(
          RetryOption.initialRetryDelay(Duration.ofSeconds(1)),
          RetryOption.totalTimeout(Duration.ofMinutes(3)));
  if (completedJob != null && completedJob.getStatus().getError() == null) {
    // Job completed successfully
  } else {
    // Handle error case
  }
} catch (InterruptedException e) {
  // Handle interrupted wait
}
The format variable can be CSV, JSON or AVRO, and the gcsUrl variable should contain the bucket and path to the file, e.g. gs://my_bucket/filename
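As a small sanity check before launching the job, the format and destination URI described above can be validated with plain Java. This is just an illustrative helper (the class and method names are not part of the client library); it builds the gs:// URI that table.extract(format, gcsUrl) expects:

```java
import java.util.Set;

public class ExtractDestination {
    // The one-time export formats mentioned above.
    private static final Set<String> FORMATS = Set.of("CSV", "JSON", "AVRO");

    // Validates the format and builds the gs://bucket/path destination URI.
    public static String gcsUrl(String format, String bucket, String path) {
        if (!FORMATS.contains(format)) {
            throw new IllegalArgumentException("Unsupported extract format: " + format);
        }
        return "gs://" + bucket + "/" + path;
    }

    public static void main(String[] args) {
        // Prints gs://my_bucket/filename
        System.out.println(gcsUrl("AVRO", "my_bucket", "filename"));
    }
}
```

Catching a bad format string locally is cheaper than waiting for the extract job to fail server-side.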