Apache Beam Streaming from Pub/Sub to ElasticSearch
I'm writing a Java streaming pipeline with Apache Beam that reads messages from Google Cloud PubSub and should write them into an ElasticSearch instance. Currently, I'm using the direct runner, but the plan is to deploy the solution on Google Cloud Dataflow.
First of all, I wrote a pipeline that reads from PubSub and writes to text files, and it works. Then I set up the ElasticSearch instance, and that works too: I wrote some documents to it with curl without any trouble.
Then, when I tried to perform the write with Beam's ElasticSearch connector, I started to get errors. Specifically, I get java.lang.NoSuchMethodError: org.elasticsearch.client.RestClient.performRequest, even though I added the dependency to my pom.xml file.
What I'm doing is essentially this:
messages.apply(
        "TwoMinWindow",
        Window.into(FixedWindows.of(new Duration(120*1000)))
).apply(
        "ElasticWrite",
        ElasticsearchIO.write()
                .withConnectionConfiguration(
                        ElasticsearchIO.ConnectionConfiguration
                                .create(new String[]{"http://xxx.xxx.xxx.xxx:9200"}, "streaming_data", "string")
                                .withUsername("xxxx")
                                .withPassword("xxxxxxxx")
                )
);
Using the DirectRunner, I'm able to connect to PubSub, but I get an exception when the pipeline tries to connect to the ElasticSearch instance:
java.lang.NoSuchMethodError: org.elasticsearch.client.RestClient.performRequest(Ljava/lang/String;Ljava/lang/String;[Lorg/apache/http/Header;)Lorg/elasticsearch/client/Response;
at org.apache.beam.sdk.util.UserCodeException.wrap (UserCodeException.java:34)
at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$Write$WriteFn$DoFnInvoker.invokeSetup (Unknown Source)
at org.apache.beam.sdk.transforms.reflect.DoFnInvokers.tryInvokeSetupFor (DoFnInvokers.java:50)
at org.apache.beam.runners.direct.DoFnLifecycleManager$DeserializingCacheLoader.load (DoFnLifecycleManager.java:104)
at org.apache.beam.runners.direct.DoFnLifecycleManager$DeserializingCacheLoader.load (DoFnLifecycleManager.java:91)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture (LocalCache.java:3528)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync (LocalCache.java:2277)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad (LocalCache.java:2154)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get (LocalCache.java:2044)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get (LocalCache.java:3952)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad (LocalCache.java:3974)
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get (LocalCache.java:4958)
at org.apache.beam.runners.direct.DoFnLifecycleManager.get (DoFnLifecycleManager.java:61)
at org.apache.beam.runners.direct.ParDoEvaluatorFactory.createEvaluator (ParDoEvaluatorFactory.java:129)
at org.apache.beam.runners.direct.ParDoEvaluatorFactory.forApplication (ParDoEvaluatorFactory.java:79)
at org.apache.beam.runners.direct.TransformEvaluatorRegistry.forApplication (TransformEvaluatorRegistry.java:169)
at org.apache.beam.runners.direct.DirectTransformExecutor.run (DirectTransformExecutor.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:511)
at java.util.concurrent.FutureTask.run (FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:624)
at java.lang.Thread.run (Thread.java:748)
Caused by: java.lang.NoSuchMethodError: org.elasticsearch.client.RestClient.performRequest(Ljava/lang/String;Ljava/lang/String;[Lorg/apache/http/Header;)Lorg/elasticsearch/client/Response;
at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO.getBackendVersion (ElasticsearchIO.java:1348)
at org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$Write$WriteFn.setup (ElasticsearchIO.java:1200)
What I added in the pom.xml is:
<dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
    <version>${beam.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.elasticsearch.client/elasticsearch-rest-client -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-client</artifactId>
    <version>${elastic.version}</version>
</dependency>
I'm stuck on this problem and I don't know how to solve it. If I use a JestClient, I'm able to connect to ElasticSearch without any issue.
Do you have any suggestions?
You are using a newer version of RestClient that does not have the method performRequest(String, String, Header...), which is the signature named in your stack trace.
If you look at the latest source code, you can see that the method now takes a Request, whereas older versions had methods that took Strings and Headers. These methods were deprecated and then removed from the code on September 1, 2018.
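A NoSuchMethodError of this kind usually means the class loaded at runtime comes from a different library version than the one the calling code was compiled against. As a minimal diagnostic sketch (the class ClasspathCheck and its helper methods are hypothetical names made up for this example, not part of Beam or Elasticsearch), plain Java reflection can print where RestClient was loaded from and which performRequest overloads it actually exposes:

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; uses nothing beyond the JDK.
public class ClasspathCheck {

    // Loads a class by name without running its static initializers.
    private static Class<?> load(String className) throws ClassNotFoundException {
        return Class.forName(className, false, ClasspathCheck.class.getClassLoader());
    }

    /** Signatures of all public methods with the given name; empty if the class is not on the classpath. */
    public static List<String> overloads(String className, String methodName) {
        List<String> found = new ArrayList<>();
        try {
            for (Method m : load(className).getMethods()) {
                if (m.getName().equals(methodName)) {
                    found.add(m.toGenericString());
                }
            }
        } catch (ClassNotFoundException e) {
            // The class is absent from the classpath, so there is nothing to report.
        }
        return found;
    }

    /** Location (jar or directory) the class was loaded from, if it can be determined. */
    public static String origin(String className) {
        try {
            CodeSource src = load(className).getProtectionDomain().getCodeSource();
            return src == null ? "unknown (no code source)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        String cls = "org.elasticsearch.client.RestClient";
        System.out.println("Loaded from: " + origin(cls));
        for (String signature : overloads(cls, "performRequest")) {
            System.out.println(signature);
        }
    }
}
```

Run it with the same classpath as the pipeline. If none of the printed signatures takes (String, String, Header...), the client on the classpath is newer than the one the connector expects.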
Either change your code to use the newer Elasticsearch library, or specify an older version of the library (it needs to be before 7.0.x, e.g. 6.8.4) that is compatible with your code.
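For the second option, a sketch of the pin in pom.xml could look like the following. It assumes the client version is controlled by the elastic.version property referenced in the question's dependency block, and that this property is defined in the same pom.xml:

```xml
<properties>
    <!-- Assumption: ${elastic.version} from the question is defined here. -->
    <!-- 6.8.4 is the pre-7.0.x example version suggested above. -->
    <elastic.version>6.8.4</elastic.version>
</properties>
```

Running mvn dependency:tree -Dincludes=org.elasticsearch.client afterwards shows which version Maven actually resolves, which helps if another dependency pulls in a different one.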