
Java Apache Beam testing pipeline replaces test data with null values

I have created a test pipeline like this:

        PipelineOptions pipelineOptions = TestPipeline.testingPipelineOptions();
        Pipeline pipeline = Pipeline.create(pipelineOptions);

        FlattenLight flattenLight = new FlattenLight();
        DataflowMessage dataflowMessage = getTestDataflowMessage();

        PCollection<TableRow> flattened = pipeline
                .apply("Create Input", Create.of(dataflowMessage))
                .apply(ParDo.of(flattenLight));

I want to test the FlattenLight class. It is a DoFn subclass with a processElement(ProcessContext c) method.

The problem is that the test data generated with getTestDataflowMessage() does not make it through the pipeline intact. The FlattenLight object receives an object whose fields are null.

getTestDataflowMessage() creates the fields as expected. You can see that many different test values are present:

[Image: debugger step at test data creation]

But the FlattenLight class receives an object that is mostly empty:

[Image: debugger step entering the FlattenLight object]

As you can see, there is no step between the data creation and the FlattenLight processing. Why does this happen, and how can I fix it?

I had the same issue. The solution was to add implements Serializable to every model in the model hierarchy. Take a closer look at your DataflowMessage; maybe you missed it somewhere.
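The answer does not say why a missing Serializable produces null fields, so here is a minimal sketch of one likely mechanism. It assumes Beam ends up encoding DataflowMessage with Java serialization (for example through SerializableCoder) when the element passes from Create to the ParDo. In plain Java serialization, fields declared in a superclass that is not Serializable are not written out; on read they are reset by that superclass's no-arg constructor. All class names below (BaseMessage, BrokenMessage, FixedBase, FixedMessage) are hypothetical stand-ins, not the asker's real models, and the example uses only the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationRoundTrip {

    // Hypothetical parent model that does NOT implement Serializable.
    static class BaseMessage {
        String id;
        BaseMessage() {} // invoked on deserialization, leaving id as null
    }

    // Hypothetical child model: Serializable, but its parent is not.
    static class BrokenMessage extends BaseMessage implements Serializable {
        String payload;
    }

    // Same hierarchy with Serializable declared at the top, as the answer suggests.
    static class FixedBase implements Serializable {
        String id;
    }

    static class FixedMessage extends FixedBase {
        String payload;
    }

    // Serializes and deserializes a value with standard Java serialization.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        BrokenMessage broken = new BrokenMessage();
        broken.id = "42";
        broken.payload = "data";
        BrokenMessage brokenCopy = roundTrip(broken);
        // The inherited field is lost, the subclass field survives.
        System.out.println(brokenCopy.id + " / " + brokenCopy.payload); // null / data

        FixedMessage fixed = new FixedMessage();
        fixed.id = "42";
        fixed.payload = "data";
        FixedMessage fixedCopy = roundTrip(fixed);
        System.out.println(fixedCopy.id + " / " + fixedCopy.payload); // 42 / data
    }
}
```

A round-trip check like roundTrip(getTestDataflowMessage()) in a unit test, run before the pipeline is built, may help pinpoint which class in the hierarchy is missing implements Serializable, provided DataflowMessage itself is Serializable.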
