Consume using KafkaIO in batch processing mode using Google Dataflow
Connect to Kafka with SSL using KafkaIO on Google Dataflow
From a server, I am able to connect to and fetch data from a topic on a remote Kafka server configured with SSL.
From GCP, how do I connect to that remote Kafka server through a Google Dataflow pipeline, passing the SSL truststore and keystore certificate locations and the Google service account JSON?
I am using the Eclipse plugin for the Dataflow runner options.
If I point to the certificates on GCS, an error is thrown when the certificate paths reference a Google storage bucket.
Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
Caused by: org.apache.kafka.common.KafkaException:
java.io.FileNotFoundException:
gs:/bucket/folder/truststore-client.jks (No such file or directory)
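The FileNotFoundException above follows from how the Kafka client loads the truststore: it opens the configured location with plain `java.io` streams, which treat `gs://bucket/...` as an ordinary local path rather than a GCS object. A minimal stdlib-only sketch of that behavior (the class name `GsPathDemo` is illustrative, not from the original post):

```java
import java.io.FileInputStream;
import java.io.IOException;

public class GsPathDemo {
    // Kafka's SSL setup opens ssl.truststore.location with plain java.io
    // file access; a "gs://..." string is just a nonexistent local path.
    static boolean canOpen(String path) {
        try (FileInputStream in = new FileInputStream(path)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canOpen("gs://bucket/folder/truststore-client.jks")); // prints "false"
    }
}
```

This is why the certificates have to be materialized on the worker's local filesystem (e.g. under `/tmp`) before the consumer is constructed, as the answer below does.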
Followed: Truststore and Google Cloud Dataflow
Updated the code to point the SSL truststore and keystore locations to the local machine's /tmp directory, in case KafkaIO needs to read them from a file path. It did not throw FileNotFoundError.
Tried running the server's Java client code from the GCP account, and also with the Dataflow Beam Java pipeline; I get the following error.
ssl.truststore.location = <LOCAL MACHINE CERTIFICATE FILE PATH>
ssl.truststore.password = [hidden]
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
org.apache.kafka.common.utils.AppInfoParser$AppInfo <init>
INFO: Kafka version : 1.0.0
org.apache.kafka.common.utils.AppInfoParser$AppInfo <init>
INFO: Kafka commitId : aaa7af6d4a11b29d
org.apache.kafka.common.network.SslTransportLayer close
WARNING: Failed to send SSL Close message
java.io.IOException: Broken pipe
org.apache.beam.runners.direct.RootProviderRegistry.getInitialInputs(RootProviderRegistry.java:81)
at org.apache.beam.runners.direct.ExecutorServiceParallelExecutor.start(ExecutorServiceParallelExecutor.java:153)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:205)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:66)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
org.apache.kafka.common.utils.LogContext$KafkaLogger warn
WARNING: [Consumer clientId=consumer-1, groupId=test-group] Connection to node -2 terminated during authentication. This may indicate that authentication failed due to invalid credentials.
Any suggestions or examples are appreciated.
Git clone the Java Maven project from the local machine, or upload it to the GCP Cloud Shell home directory. Compile the project with the DataflowRunner command from the Cloud Shell terminal.
mvn -Pdataflow-runner compile exec:java \
-Dexec.mainClass=com.packagename.JavaClass \
-Dexec.args="--project=PROJECT_ID \
--stagingLocation=gs://BUCKET/PATH/ \
--tempLocation=gs://BUCKET/temp/ \
--output=gs://BUCKET/PATH/output \
--runner=DataflowRunner"
Make sure the runner is set to DataflowRunner.class; you will see the job on the Dataflow console when it runs on the cloud. DirectRunner executions will not show up on the Cloud Dataflow console.
Place the certificates in the resources folder inside the Maven project and read the files using a ClassLoader.
ClassLoader classLoader = getClass().getClassLoader();
File file = new File(classLoader.getResource("keystore.jks").getFile());
resourcePath.put("keystore.jks", file.getAbsoluteFile().getPath());
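Note that on a worker the ClassLoader lookup above resolves to a path inside the packaged JAR, which Kafka cannot open directly; the usual remedy is to copy the resource stream to a real local file first. A minimal stdlib-only sketch of that copy step (the class name `CertStager` and the dummy bytes are illustrative, not from the original answer):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class CertStager {
    // Copy a certificate read from the classpath (or any stream) to a local
    // file path that Kafka's ssl.*.location properties can point at.
    public static String stage(InputStream in, Path target) throws IOException {
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target.toAbsolutePath().toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical demo: stage dummy keystore bytes into the temp directory.
        Path target = Paths.get(System.getProperty("java.io.tmpdir"), "keystore.jks");
        InputStream in = new ByteArrayInputStream(new byte[]{0x01});
        System.out.println(stage(in, target));
    }
}
```

In the real pipeline the input stream would come from `classLoader.getResourceAsStream("keystore.jks")`, and the returned path is what goes into the consumer properties.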
Write a ConsumerFactoryFn() to copy the certificates into Dataflow's "/tmp/" directory, as described in https://stackoverflow.com/a/53549757/4250322.
Use KafkaIO with the resource path properties.
Properties props = new Properties();
props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/tmp/truststore.jks");
props.put(SslConfigs.SSL_KEYSTORE_LOCATION_CONFIG, "/tmp/keystore.jks");
props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, PASSWORD);
props.put(SslConfigs.SSL_KEYSTORE_PASSWORD_CONFIG, PASSWORD);
props.put(SslConfigs.SSL_KEY_PASSWORD_CONFIG, PASSWORD);
//other properties
...
PCollection<String> collection = p.apply(KafkaIO.<String, String>read()
.withBootstrapServers(BOOTSTRAP_SERVERS)
.withTopic(TOPIC)
.withKeyDeserializer(StringDeserializer.class)
.withValueDeserializer(StringDeserializer.class)
.updateConsumerProperties(props)
.withConsumerFactoryFn(new ConsumerFactoryFn())
.withMaxNumRecords(50)
.withoutMetadata()
).apply(Values.<String>create());
// Apply Beam transformations and write to output.