
Argo Events Kafka triggers cannot parse message headers to enable distributed tracing

TL;DR - Argo Events Kafka eventsource triggers do not currently parse the headers of consumed Kafka messages, which is needed to enable distributed tracing. I submitted a feature request (here) - if you face the same problem please upvote, and I'm curious whether anyone has figured out a workaround.

====================================

Context

A common pattern among the Argo Workflows we deploy is Kafka event-driven, asynchronous distributed workloads, e.g.:

  • Service "A" Kafka producer that emits message to topic向主题发送消息的服务“A”Kafka 生产者
  • Argo Events eventsource Kafka trigger listening to that topic Argo Events eventsource Kafka 触发器监听该主题
  • Argo Workflow gets triggered, and post-processing... Argo Workflow 被触发,并进行后处理......
  • ... service "B" Kafka producer at end of workflow emits that work is done. ...服务“B”Kafka 生产者在工作流结束时发出工作已完成的消息。

To monitor the entire system for the user-centric metrics "how long did it take & where are the bottlenecks", I'm looking to instrument distributed tracing from service "A" to service "B". We use Datadog as the aggregator, with dd-trace.

The pattern I've seen is manual propagation of trace context via Kafka headers: inject headers into the Kafka message before emitting (similar to HTTP headers, carrying parent trace metadata), and once the receiving consumer is done processing the message, it adds a child_span to the parent_span received from upstream.

Example of the above: https://newrelic.com/blog/how-to-relic/distributed-tracing-with-kafka
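
A minimal producer-side sketch of that pattern, assuming confluent-kafka and ddtrace (the topic name, bootstrap servers, and span name here are placeholders, not from the original post):

# producer_a.py - service "A": emit a Kafka message carrying trace context in headers
import json

from confluent_kafka import Producer
from ddtrace import tracer
from ddtrace.propagation.http import HTTPPropagator

producer = Producer({"bootstrap.servers": "localhost:9092"})

with tracer.trace("service-a.emit") as span:
    # Serialize the active trace context into a dict, the same way it
    # would be injected into outgoing HTTP headers.
    carrier = {}
    HTTPPropagator.inject(span.context, carrier)
    producer.produce(
        "<my-topic>",
        value=json.dumps({"work": "payload"}).encode(),
        # Kafka message headers are (key, bytes) pairs.
        headers=[(k, v.encode()) for k, v in carrier.items()],
    )
producer.flush()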


Issue

The Argo Events Kafka event source trigger does not parse any headers, only passing the body JSON for the downstream Workflow to use at eventData.Body.
[source code]

Simplified views of my Argo Eventsource -> Trigger -> Workflow:

# eventsource/my-kafka-eventsource.yaml
apiVersion: argoproj.io/v1alpha1
kind: EventSource
spec:
  kafka:
    my-kafka-eventsource:
      topic: <my-topic>
      version: "2.5.0"
# sensors/trigger-my-workflow.yaml
apiVersion: argoproj.io/v1alpha1
kind: Sensor
spec:
  dependencies:
    - name: my-kafka-eventsource-dep
      eventSourceName: my-kafka-eventsource
      eventName: my-kafka-eventsource
  triggers:
    - template:
        name: start-my-workflow
        k8s:
          operation: create
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              spec:
                entrypoint: my-sick-workflow
                arguments:
                  parameters:
                    - name: proto_message
                      value: needs to be overridden
                    # I would like to be able to add this
                    - name: msg_headers
                      value: needs to be overridden
                templates:
                  - name: my-sick-workflow
                    dag:
                      tasks:
                        - name: my-sick-workflow
                          templateRef:
                            name: my-sick-workflow
                            template: my-sick-workflow
          parameters:
            # content/body of consumed message
            - src: 
                dependencyName: my-kafka-eventsource-dep
                dataKey: body  
              dest: spec.arguments.parameters.0.value
            # I would like to do this - get msg.headers() if they exist.
            - src: 
                dependencyName: my-kafka-eventsource-dep
                dataKey: headers
              dest: spec.arguments.parameters.1.value
# templates/my-sick-workflow.yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
spec:
  templates:
    - name: my-sick-workflow
      container:
        image: <image>
        command: [ "python", "/main.py" ] 
        # I want to add the 2nd arg - msg_headers - here
        args: [ "{{workflow.parameters.proto_message}}", "{{workflow.parameters.msg_headers}}" ]

# so that in my Workflow DAG step source code,
# I can access the headers of the upstream Kafka msg via
# body=sys.argv[1], headers=sys.argv[2]
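
If the headers parameter were exposed, the workflow step could rehydrate the parent trace roughly like this (a hedged sketch: the JSON shape of sys.argv[2] and the span name are assumptions, since Argo Events does not pass headers today):

# main.py - hypothetical workflow step entrypoint
import json
import sys

from ddtrace import tracer
from ddtrace.propagation.http import HTTPPropagator

body = json.loads(sys.argv[1])
headers = json.loads(sys.argv[2])  # assumed shape: {"x-datadog-trace-id": "...", ...}

# Rebuild the upstream context and attach this step's span as its child.
parent_ctx = HTTPPropagator.extract(headers)
tracer.context_provider.activate(parent_ctx)
with tracer.trace("my-sick-workflow.step"):
    ...  # do the actual work on `body`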

Confluent-Kafka API docs on accessing message headers: [doc]


Q's

  1. Has anyone found a workaround for passing tracing context from an upstream to a downstream service when it travels between a Kafka producer <> Argo Events?

  2. I considered changing my Argo Workflows Sensor trigger to an HTTP trigger that accepts payloads: a new Kafka consumer would listen for the message that currently triggers my Argo Workflow, then forward an HTTP payload with the parent trace metadata in its headers.

    • it's an anti-pattern compared to the rest of my workflows, so I would like to avoid it if there's a simpler solution.

As you pointed out, the only real workaround, without forking some part of Argo Events or implementing your own Source/Sensor yourself, would be to use a Kafka consumer (or Kafka Connect) and call a Webhook EventSource (or another one that can extract the information you need).
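
For concreteness, a minimal sketch of that forwarder, assuming a Webhook EventSource reachable at a hypothetical in-cluster URL; it copies the consumed message's Kafka headers into HTTP headers so the trace context survives the hop (what the Sensor can then extract depends on what the Webhook EventSource exposes):

# forwarder.py - consume from Kafka, re-emit to a Webhook EventSource over HTTP
import requests
from confluent_kafka import Consumer

WEBHOOK_URL = "http://webhook-eventsource-svc:12000/kafka"  # hypothetical endpoint

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "argo-forwarder",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["<my-topic>"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None or msg.error():
            continue
        # msg.headers() returns a list of (key, bytes) tuples, or None.
        http_headers = {k: v.decode() for k, v in (msg.headers() or [])}
        requests.post(WEBHOOK_URL, data=msg.value(), headers=http_headers)
finally:
    consumer.close()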
