
Java Apache Beam - save file "locally" by using DataflowRunner

I can send the Java code, but it isn't necessary at the moment.

I have an issue: when I run the job with the DirectRunner (on a Google VM instance) it works fine, as it saves the information to a local file and carries on...

The problem appears when trying to use the DataflowRunner; the error I receive is:

java.nio.file.NoSuchFileException: XXXX.csv
.....
.....
XXXX.csv could not be deleted.

It could not be deleted because it was never even created.

Problem: how do I write the file locally when running with the DataflowRunner?

P.S. Using Apache Beam.

Pipeline (part of the code) - reading from BigQuery and storing the data to Google Storage (special-character issue)

AFAIK, when the job is run as a Dataflow instance, you have to write the file to the GCS service (i.e. a storage bucket) rather than to local disk, since Dataflow workers are ephemeral VMs with no shared local filesystem.

Have you tried that already? To create a storage bucket: https://cloud.google.com/storage/docs/creating-buckets
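A minimal sketch of what that looks like in Beam's Java SDK, assuming `TextIO` is used for the output; the bucket name, output prefix, and the sample input lines are placeholders, not taken from the asker's code:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;

public class WriteCsvToGcs {
    public static void main(String[] args) {
        // Standard Beam options; pass e.g. --runner=DataflowRunner --project=...
        // --region=... on the command line when submitting to Dataflow.
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        // Placeholder input; in the real job this would come from the
        // BigQuery read + formatting transforms.
        PCollection<String> csvLines = p.apply(Create.of("col1,col2", "a,1", "b,2"));

        // Write to a gs:// path instead of a local file. Dataflow workers
        // cannot see your machine's disk, so local paths fail there.
        csvLines.apply("WriteCsv",
            TextIO.write()
                  .to("gs://my-bucket/output/XXXX") // placeholder bucket/prefix
                  .withSuffix(".csv")
                  .withoutSharding());              // single output file

        p.run().waitUntilFinish();
    }
}
```

With the DirectRunner the same `gs://` path also works (via the GCS filesystem connector), so one path works for both runners.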
