Apache Beam Not Saving Unbounded Data To Text File
I've created a pipeline that saves Google Cloud Pub/Sub messages to text files using Apache Beam and Java. Whenever I run the pipeline in Google Dataflow with --runner=DataflowRunner, the messages are saved correctly.
However, when I run the same pipeline with --runner=DirectRunner, the messages are not saved.
I can watch the events coming through the pipeline, but nothing happens.
The pipeline code is below:
    public static void main(String[] args) {
        ExerciseOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().as(ExerciseOptions.class);
        Pipeline pipeline = Pipeline.create(options);
        pipeline
            .apply("Read Messages from Pubsub",
                PubsubIO
                    .readStrings()
                    .fromTopic(options.getTopicName()))
            .apply("Set event timestamp", ParDo.of(new DoFn<String, String>() {
                @ProcessElement
                public void processElement(ProcessContext context) {
                    context.outputWithTimestamp(context.element(), Instant.now());
                }
            }))
            .apply("Windowing", Window.into(FixedWindows.of(Duration.standardMinutes(5))))
            .apply("Write to File",
                TextIO
                    .write()
                    .withWindowedWrites()
                    .withNumShards(1)
                    .to(options.getOutputPrefix()));
        pipeline.run();
    }
What am I doing wrong? Is it possible to run this pipeline locally?
I was facing the same problem as you while testing a pipeline. PubsubIO does not work correctly with DirectRunner and TextIO.
I found a workaround of sorts for this issue using triggering:
    .apply(
        "2 minutes window",
        Window
            .configure()
            .triggering(
                // Keep firing panes for as long as the pipeline runs
                Repeatedly.forever(
                    // Fire as soon as either condition below is met
                    AfterFirst.of(
                        AfterPane.elementCountAtLeast(10),
                        AfterProcessingTime
                            .pastFirstElementInPane()
                            .plusDelayOf(Duration.standardMinutes(2))
                    )
                )
            )
            .into(
                FixedWindows.of(
                    Duration.standardMinutes(2)
                )
            )
    )
This way the files are written as they should be. Hope this helps someone.
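To make the firing rule of that trigger concrete, here is a minimal plain-Java sketch of it. This is not Beam code and does not use any Beam API: the class PaneTriggerSketch and its methods are hypothetical names invented for illustration, the start instant is arbitrary, and clearing the pane after each firing models discarding mode, which is an assumption because the snippet above does not set an accumulation mode. It only models the rule "fire once the pane holds at least 10 elements, or 2 minutes of processing time after the pane's first element, whichever comes first, then start over".

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model (not Beam) of the composite trigger used in the workaround. */
final class PaneTriggerSketch {
    // Thresholds copied from the workaround snippet.
    static final int ELEMENT_THRESHOLD = 10;
    static final Duration DELAY = Duration.ofMinutes(2);

    private final List<String> pane = new ArrayList<>();
    private Instant firstElementAt; // processing time of the first element; null while the pane is empty

    /** Buffers an element that arrived at processing time {@code now}. */
    void add(String element, Instant now) {
        if (pane.isEmpty()) {
            firstElementAt = now;
        }
        pane.add(element);
    }

    /** Models AfterFirst.of(...): true as soon as either condition holds. */
    boolean shouldFire(Instant now) {
        if (pane.isEmpty()) {
            return false; // nothing buffered, nothing to emit
        }
        boolean enoughElements = pane.size() >= ELEMENT_THRESHOLD;
        boolean delayElapsed = !now.isBefore(firstElementAt.plus(DELAY));
        return enoughElements || delayElapsed;
    }

    /** Models Repeatedly.forever(...): emit the pane, then start a fresh one. */
    List<String> fire() {
        List<String> emitted = new ArrayList<>(pane);
        pane.clear(); // assumption: discarding mode, fired elements are not re-emitted
        firstElementAt = null;
        return emitted;
    }

    public static void main(String[] args) {
        PaneTriggerSketch trigger = new PaneTriggerSketch();
        Instant start = Instant.parse("2024-01-01T00:00:00Z"); // arbitrary start time
        trigger.add("msg-1", start);
        // One element buffered, one minute after it arrived
        System.out.println(trigger.shouldFire(start.plusSeconds(60)));
        // Same pane, two minutes after the first element arrived
        System.out.println(trigger.shouldFire(start.plusSeconds(120)));
        System.out.println(trigger.fire());
    }
}
```

The point of the sketch is that neither condition depends on the end of the window being reached: a pane is emitted based on element count or elapsed processing time alone.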