How to write to BigTable using Apache Beam direct-runner in java?
I have been trying to get the Apache Beam direct runner to write to BigTable, but it seems like there is a problem. There are no failure or confirmation errors on the terminal when I run gradle run.
My pipeline is as follows:
Pub/Sub stream of messages -> direct-runner -> BigTable
Currently I am using the org.apache.beam.sdk.io.gcp.bigtable.BigtableIO adapter, which is either not working or I am doing something wrong.
There is also another I/O adapter, com.google.cloud.bigtable.beam.CloudBigtableIO, and I am not sure which one to choose.
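For context, here is a minimal sketch of what I am attempting with BigtableIO. The project, instance, subscription, table, and column-family names are placeholders, and the row-key/mutation logic is only illustrative:

```java
import java.util.Collections;

import com.google.bigtable.v2.Mutation;
import com.google.protobuf.ByteString;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.KV;

public class PubSubToBigtable {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    p.apply("ReadFromPubSub",
            PubsubIO.readStrings()
                .fromSubscription("projects/MY_PROJECT/subscriptions/MY_SUB")) // placeholder
        .apply("ToBigtableMutations",
            ParDo.of(new DoFn<String, KV<ByteString, Iterable<Mutation>>>() {
              @ProcessElement
              public void processElement(
                  @Element String msg,
                  OutputReceiver<KV<ByteString, Iterable<Mutation>>> out) {
                // Illustrative only: use the message itself as the row key and
                // store the payload in one cell of a "cf" column family.
                Mutation setCell = Mutation.newBuilder()
                    .setSetCell(Mutation.SetCell.newBuilder()
                        .setFamilyName("cf")
                        .setColumnQualifier(ByteString.copyFromUtf8("payload"))
                        .setValue(ByteString.copyFromUtf8(msg)))
                    .build();
                out.output(KV.of(
                    ByteString.copyFromUtf8(msg),
                    Collections.singletonList(setCell)));
              }
            }))
        .apply("WriteToBigtable",
            BigtableIO.write()
                .withProjectId("MY_PROJECT")   // placeholders
                .withInstanceId("MY_INSTANCE")
                .withTableId("MY_TABLE"));

    p.run().waitUntilFinish();
  }
}
```

BigtableIO.write() consumes a PCollection&lt;KV&lt;ByteString, Iterable&lt;Mutation&gt;&gt;&gt;, i.e. each element pairs a row key with the mutations to apply to that row.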
Some questions:一些问题:
System.out.println
statements.System.out.println
语句,很难查看管道。$GOOGLE_APPLICATION_CREDENTIALS
env variable & use those credentials? $GOOGLE_APPLICATION_CREDENTIALS
环境变量并使用这些凭据? Will be happy to give more details.很乐意提供更多详细信息。
EDIT:
To verify what is going on:
1. add in main()
BasicConfigurator.configure();
2. add in pom.xml
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-log4j12</artifactId>
  <version>1.7.32</version>
</dependency>
3. add this log4j.properties
# Set root logger level to DEBUG and its only appender to A1.
log4j.rootLogger=DEBUG, A1
# A1 is set to be a ConsoleAppender.
log4j.appender.A1=org.apache.log4j.ConsoleAppender
# A1 uses PatternLayout.
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
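Putting steps 1–3 together, the logging side of main() ends up looking like this (a sketch; the class name is an assumption):

```java
import org.apache.log4j.BasicConfigurator;

public class App {
  public static void main(String[] args) {
    // Routes Beam's slf4j logging through log4j so the DEBUG output
    // configured in log4j.properties actually reaches the console.
    BasicConfigurator.configure();

    // ... build and run the pipeline here ...
  }
}
```

With this in place the direct runner prints its step-by-step DEBUG logs, which is far more useful than System.out.println for seeing what the pipeline is doing.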
To write on BigTable with the direct runner:
1. in pom.xml
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-runners-direct-java</artifactId>
  <version>${beam.version}</version>
  <scope>runtime</scope>
</dependency>
2. the interface for the pipeline options, to configure with run config commands:
public interface RequestsOptions extends PipelineOptions {
    @Description("File path")
    @Validation.Required
    String getInput();
    void setInput(String value);

    @Description("Output")
    @Validation.Required
    String getOutput();
    void setOutput(String value);
}
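The options interface above is typically wired into the pipeline like this (a sketch; the class name is an assumption):

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class Main {
  public static void main(String[] args) {
    // Register the custom interface and parse the command-line flags;
    // withValidation() enforces the @Validation.Required annotations.
    PipelineOptionsFactory.register(RequestsOptions.class);
    RequestsOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(RequestsOptions.class);

    Pipeline p = Pipeline.create(options);
    // options.getInput() / options.getOutput() are now available to the transforms.
    p.run().waitUntilFinish();
  }
}
```

Any flag passed as --runner=... or --project=... is picked up by the built-in PipelineOptions, while --input/--output map to the getters declared in RequestsOptions.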
3. in run config commands:
--project=PROJECT_ID
--dataset=DATASET_NAME
--inputFile=INPUT_FILE_NAME
--region=REGION_TO_RUN //if dataflow runner
--runner=YOUR_SELECTED_RUNNER
--tempLocation=GOOGLE_STORAGE_LOCATION(to save temp files)