
Lagom service consuming input from Kafka

I am trying to figure out how Lagom can be used to consume data from external systems communicating over Kafka.

I've run into this section of the Lagom documentation, which describes how a Lagom service can communicate with another Lagom service by subscribing to its topic.

helloService
  .greetingsTopic()
  .subscribe // <-- you get back a Subscriber instance
  .atLeastOnce(
    Flow.fromFunction(doSomethingWithTheMessage)
  )

However, what is the appropriate configuration when you want to subscribe to a Kafka topic that contains events produced by some random, external system?

Is some sort of adapter needed for this functionality? To clarify, I have this at the moment:

object Aggregator {
  val TOPIC_NAME = "my-aggregation"
}

trait Aggregator extends Service {
  def aggregate(correlationId: String): ServiceCall[Data, Done]

  def aggregationTopic(): Topic[DataRecorded]

  override final def descriptor: Descriptor = {
    import Service._

    named("aggregator")
      .withCalls(
        pathCall("/api/aggregate/:correlationId", aggregate _)
      )
      .withTopics(
        topic(Aggregator.TOPIC_NAME, aggregationTopic())
          .addProperty(
            KafkaProperties.partitionKeyStrategy,
            PartitionKeyStrategy[DataRecorded](_.sessionData.correlationId)
          )
      )
      .withAutoAcl(true)
  }
}

And I can invoke it via a simple POST request. However, I would like it to be invoked by consuming Data messages from some (external) Kafka topic.

I was wondering whether there is a way to configure the descriptor in a fashion similar to this mockup:

override final def descriptor: Descriptor = {
  ...
  kafkaTopic("my-input-topic")
    .subscribe(serviceCall(aggregate _))
    .withAtMostOnceDelivery
}

I've run into this discussion on Google Groups, but in the OP's question I do not see him actually doing anything with the EventMessages coming from some-topic, other than routing them to the topic defined by his service.

EDIT #1: Progress update

Looking at the documentation, I decided to try the following approach. I added 2 more modules, aggregator-kafka-proxy-api and aggregator-kafka-proxy-impl.

In the new api module, I defined a new service with no methods, but with one topic that represents my Kafka topic:

object DataKafkaPublisher {
  val TOPIC_NAME = "data-in"
}

trait DataKafkaPublisher extends Service {
  def dataInTopic: Topic[DataPublished]

  override final def descriptor: Descriptor = {
    import Service._
    import DataKafkaPublisher._

    named("data-kafka-in")
      .withTopics(
        topic(TOPIC_NAME, dataInTopic)
          .addProperty(
            KafkaProperties.partitionKeyStrategy,
            PartitionKeyStrategy[DataPublished](_.data.correlationId)
          )
      )
      .withAutoAcl(true)
  }
}

In the impl module, I simply did the standard implementation:

class DataKafkaPublisherImpl(persistentEntityRegistry: PersistentEntityRegistry) extends DataKafkaPublisher {
  override def dataInTopic: Topic[api.DataPublished] =
    TopicProducer.singleStreamWithOffset {
      fromOffset =>
        persistentEntityRegistry.eventStream(KafkaDataEvent.Tag, fromOffset)
          .map(ev => (convertEvent(ev), ev.offset))
    }

  private def convertEvent(evt: EventStreamElement[KafkaDataEvent]): api.DataPublished = {
    evt.event match {
      case DataPublished(data) => api.DataPublished(data)
    }
  }
}

Now, to actually consume these events, in my aggregator-impl module I added a "subscriber" service which takes these events and invokes the appropriate commands on the entity.

class DataKafkaSubscriber(persistentEntityRegistry: PersistentEntityRegistry, kafkaPublisher: DataKafkaPublisher) {

  kafkaPublisher.dataInTopic.subscribe.atLeastOnce(
    Flow[DataPublished].mapAsync(1) { sd =>
      sessionRef(sd.data.correlationId).ask(RecordData(sd.data))
    }
  )

  private def sessionRef(correlationId: String) =
    persistentEntityRegistry.refFor[Entity](correlationId)
}

This effectively allowed me to publish a message on the Kafka topic "data-in", which was then proxied and converted to a RecordData command before being issued to the entity to consume.

However, it seems somewhat hacky to me. I am coupled to Kafka by Lagom internals, and I cannot swap the source of my data easily. For example, how would I consume external messages from RabbitMQ if I wanted to? What if I want to consume from another Kafka cluster (a different one than the one used by Lagom)?

Edit #2: More docs

I've found a few articles in the Lagom docs, notably this one:

Consuming Topics from 3rd parties

You may want your Lagom service to consume data produced on services not implemented in Lagom. In that case, as described in the Service Clients section, you can create a third-party-service-api module in your Lagom project. That module will contain a Service Descriptor declaring the topic you will consume from. Once you have your ThirdPartyService interface and related classes implemented, you should add third-party-service-api as a dependency on your fancy-service-impl. Finally, you can consume from the topic described in ThirdPartyService as documented in the Subscribe to a topic section.
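
For reference, a minimal sketch of what such an API-only module's descriptor could look like. All names here (ExternalDataService, ExternalDataEvent, "external-data-in") are placeholders of mine, not from the docs or my own project:

import com.lightbend.lagom.scaladsl.api.broker.Topic
import com.lightbend.lagom.scaladsl.api.{Descriptor, Service}
import play.api.libs.json.{Format, Json}

// Placeholder message type for whatever the external producer writes to the topic.
case class ExternalDataEvent(correlationId: String, payload: String)

object ExternalDataEvent {
  implicit val format: Format[ExternalDataEvent] = Json.format[ExternalDataEvent]
}

// API-only descriptor: it is never implemented, it only declares the topic to consume from.
trait ExternalDataService extends Service {
  def externalDataTopic(): Topic[ExternalDataEvent]

  override final def descriptor: Descriptor = {
    import Service._

    named("external-data-service")
      .withTopics(
        topic("external-data-in", externalDataTopic())
      )
  }
}

The business service then only depends on this module; since the topic lives on the external cluster, there is nothing to implement.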

I don't use Lagom, so this is maybe just an idea. But since akka-streams is part of Lagom (at least I assume that), getting from this solution to what you need should be easy.

I used akka-stream-kafka and this went really nicely (I only did a prototype).

As you consume messages, you would do something like this:

import akka.kafka.Subscriptions
import akka.kafka.scaladsl.{Committer, Consumer}

Consumer
  .committableSource(
    consumerSettings,                            // config of Kafka (brokers, group id, deserializers)
    Subscriptions.topics("kafkaWsPathMsgTopic")) // topic to subscribe to
  .mapAsync(10) { msg =>
    business(msg.record).map(_ => msg.committableOffset) // do something, then emit the offset
  }
  .runWith(Committer.sink(committerSettings))    // commit offsets once processing is done

Check the well-written documentation.

You can find my whole example here: PathMsgConsumer

An answer was provided by Alan Klikic on the Lightbend discussion forums here.

Part 1:

If you are only using an external Kafka cluster in your business service, then you can implement this using only the Lagom Broker API. So you need to:

  1. create an API with a service descriptor containing only the topic definition (this API is not being implemented)
  2. in your business service, configure kafka_native depending on your deployment (as I mentioned in the previous post)
  3. in your business service, inject the service from the API created in #1 and subscribe to it using the Lagom Broker API subscriber

Offset committing in the Lagom Broker API subscriber is handled out-of-the-box.
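
To illustrate step 3, a hedged sketch, assuming the ExternalDataService descriptor sketched earlier, that its client is provided by the business service's wiring, and that the broker configuration from step 2 already points at the external cluster; handleData is a hypothetical handler:

import akka.Done
import akka.stream.scaladsl.Flow

import scala.concurrent.Future

// Subscribes to the externally produced topic via the Lagom Broker API;
// offsets are committed by Lagom after each element has been processed.
class ExternalDataSubscriber(externalDataService: ExternalDataService) {

  externalDataService.externalDataTopic().subscribe
    .atLeastOnce(
      Flow[ExternalDataEvent].mapAsync(1) { event =>
        handleData(event) // e.g. ask a persistent entity to record the data
      }
    )

  // Hypothetical handler; in the question's code this would be something like
  // sessionRef(event.correlationId).ask(RecordData(...)).
  private def handleData(event: ExternalDataEvent): Future[Done] =
    Future.successful(Done)
}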

Part 2:

Kafka and AMQP consumer implementations require a persistent Akka stream, so you need to handle disconnects. This can be done in two ways:

  1. control the persistent Akka stream by wrapping it in an actor. You initialize your stream Flow on the actor's preStart and pipe the stream's completion to the actor, which will then stop itself. If the stream completes or fails, the actor stops. Then wrap the actor in a backoff supervisor with a restart strategy, which will restart the actor on completion or failure and reinitialize the Flow.
  2. use Akka Streams' delayed restarts with a backoff stage (see the sketch after this list)
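
A minimal sketch of option #2, assuming akka-stream-kafka and an Akka 2.5-style RestartSource.withBackoff signature; consumerSettings, handle, and the implicit materializer/execution context are placeholders you would provide yourself:

import akka.kafka.Subscriptions
import akka.kafka.scaladsl.Consumer
import akka.stream.scaladsl.{RestartSource, Sink}

import scala.concurrent.duration._

// Wraps the Kafka source in a restart stage so the stream is re-created
// with exponential backoff whenever it fails or completes (e.g. on disconnect).
RestartSource
  .withBackoff(
    minBackoff = 3.seconds,  // initial delay before the first restart
    maxBackoff = 30.seconds, // cap for the exponential backoff
    randomFactor = 0.2       // jitter, so restarts do not synchronize
  ) { () =>
    Consumer
      .plainSource(consumerSettings, Subscriptions.topics("data-in"))
      .mapAsync(1)(record => handle(record)) // messages may be re-processed after a restart, so keep handle idempotent
  }
  .runWith(Sink.ignore)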

Personally, I use #1 and have not tried #2 yet.

Initializing the backoff actor for #1 or the Flow for #2 can be done in your Lagom components trait (basically in the same place where you currently do your subscribe using the Lagom Broker API).

Be sure to set a consumer group when configuring the consumer, to avoid duplicate consumption. You can, like Lagom does, use the service name from the descriptor as the consumer group name.
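
If you configure the consumer yourself (rather than through the Lagom Broker API), the group id goes on the ConsumerSettings. A small sketch, with the broker address as an assumption and the group id mirroring the service name from the descriptor:

import akka.actor.ActorSystem
import akka.kafka.ConsumerSettings
import org.apache.kafka.common.serialization.StringDeserializer

// Consumer group = service name ("aggregator"), so scaled-out instances of the
// service share the partitions instead of each consuming every message.
def consumerSettings(system: ActorSystem): ConsumerSettings[String, String] =
  ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
    .withBootstrapServers("localhost:9092") // assumption: address of the external cluster
    .withGroupId("aggregator")              // service name from the descriptor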
