
How to handle schemas when using CloudEvents with the Confluent Schema Registry?

I have a small Spring-Boot-based prototype to publish messages to a Kafka cluster using Protobuf. I'm using the Confluent serializer and deserializer:

  • io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer
  • io.confluent.kafka.serializers.protobuf.KafkaProtobufDeserializer

I'm also running the Schema Registry from Confluent (latest version) to handle the Protobuf schemas. Everything works as expected.
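For reference, the relevant client configuration is along these lines (the Schema Registry URL is a local placeholder; the consumer side mirrors it with the KafkaProtobufDeserializer):

  import java.util.Properties;

  // Producer-side configuration for the Confluent Protobuf serde (sketch)
  Properties props = new Properties();
  props.put("bootstrap.servers", "localhost:9092");
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
  props.put("value.serializer", "io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer");
  props.put("schema.registry.url", "http://localhost:8081");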

Now, I would like to introduce the CloudEvents spec (https://github.com/cloudevents/spec), but I'm struggling to understand how it can work with the Confluent Schema Registry.

CloudEvents has an SDK module to serialize a message directly to Protobuf. The data section of the message is where my versioned payload should go, but there is no way to define a schema for only one section of the message. To be clear:

 CloudEvent event = CloudEventBuilder.v1()
                .withId(UUID.randomUUID().toString())
                .withType("example.vertx")
                .withSource(URI.create("http://localhost"))
                .withData(???) // <-- HERE IS WHERE MY PAYLOAD SHOULD BE VERSIONED
                .build();

One solution is to replicate the CloudEvents Protobuf schema and define the message specification in each Protobuf schema file, as sketched below. This has the disadvantage that I have to copy/paste the CloudEvents Protobuf schema for each new message, but it would let me use the standard Protobuf Kafka serde without any CloudEvents library. Is there a better solution?
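For illustration, one such replicated schema file could look something like this (message and field names are made up; the point is typing the data field with the concrete payload instead of generic bytes):

  // Hypothetical sketch: the CloudEvents context attributes copied into the
  // message's own schema file, with data typed as the versioned payload.
  syntax = "proto3";

  message OrderCreatedPayload {
    string order_id = 1;
    int64 amount_cents = 2;
  }

  message OrderCreatedEvent {
    // Copied from the CloudEvents Protobuf schema
    string id = 1;
    string source = 2;
    string spec_version = 3;
    string type = 4;
    // The versioned payload, instead of a generic bytes field
    OrderCreatedPayload data = 5;
  }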

If you're using Kafka, you should be looking at the CloudEvents Kafka Protocol Binding spec, which has its own Kafka serializer classes.

If you read through that, it'll refer to the binary content mode for the datacontenttype, and to content types like application/cloudevents+avro, where the suffix could just as well be +protobuf.
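For reference, my reading of the Kafka binding is that the two content modes look roughly like this on the wire (header names per the binding; values illustrative):

  Binary content mode (only the payload bytes go in the Kafka value):
    content-type:   application/x-protobuf   <- the data's own content type
    ce_specversion: 1.0
    ce_id:          <event id>
    ce_source:      http://localhost
    ce_type:        example.vertx

  Structured content mode (the whole event is the Kafka value):
    content-type:   application/cloudevents+json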

If I read the spec correctly, in structured mode the Kafka value itself "MUST" be in JSON format, and the data of your actual payload event can be binary encoded (as a base64 string, I guess, since JSON doesn't have a binary type).
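So a structured-mode value could look something like this (values made up; data_base64 is the JSON event format's member for binary data):

  {
    "specversion": "1.0",
    "id": "6e8bc430-9c3a-11d9-9669-0800200c9a66",
    "source": "http://localhost",
    "type": "example.vertx",
    "datacontenttype": "application/x-protobuf",
    "data_base64": "<base64 of the Schema-Registry-framed Protobuf bytes>"
  }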

Basically, you'd need to manually serialize the Protobuf events using the classes you mentioned (that's the step that talks to the Schema Registry), stick the resulting bytes into the CloudEvent record, and finally produce with some "CloudEventSerializer".
Then do the opposite on the consumer side: extract the data payload from the value and pass it to the KafkaProtobufDeserializer.deserialize method. A sketch follows below.
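Here's a minimal sketch of that round trip, assuming the cloudevents-kafka module (SDK 2.x) and a hypothetical generated Protobuf class OrderCreated; the topic name and registry URL are placeholders:

  import java.net.URI;
  import java.util.Map;
  import java.util.UUID;

  import io.cloudevents.CloudEvent;
  import io.cloudevents.core.builder.CloudEventBuilder;
  import io.confluent.kafka.serializers.protobuf.KafkaProtobufDeserializer;
  import io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer;

  public class CloudEventProtobufRoundTrip {

      // Producer side: Schema Registry first, CloudEvent envelope second.
      static CloudEvent wrap(OrderCreated order) { // OrderCreated is hypothetical
          KafkaProtobufSerializer<OrderCreated> payloadSerializer = new KafkaProtobufSerializer<>();
          payloadSerializer.configure(
                  Map.of("schema.registry.url", "http://localhost:8081"),
                  false); // false = value serializer

          // Registers/validates the schema and frames the bytes with the schema id
          byte[] payloadBytes = payloadSerializer.serialize("orders", order);

          // The framed bytes become the CloudEvent data; produce the resulting
          // event with io.cloudevents.kafka.CloudEventSerializer configured as
          // the producer's value.serializer.
          return CloudEventBuilder.v1()
                  .withId(UUID.randomUUID().toString())
                  .withType("example.order.created")
                  .withSource(URI.create("http://localhost"))
                  .withData("application/x-protobuf", payloadBytes)
                  .build();
      }

      // Consumer side: the value arrives as a CloudEvent (via
      // io.cloudevents.kafka.CloudEventDeserializer); unwrap the data bytes and
      // hand them back to the Confluent deserializer.
      static OrderCreated unwrap(CloudEvent event) {
          KafkaProtobufDeserializer<OrderCreated> payloadDeserializer = new KafkaProtobufDeserializer<>();
          payloadDeserializer.configure(
                  Map.of("schema.registry.url", "http://localhost:8081",
                         "specific.protobuf.value.type", OrderCreated.class.getName()),
                  false);

          byte[] payloadBytes = event.getData().toBytes(); // CloudEventData#toBytes(), SDK 2.x
          return payloadDeserializer.deserialize("orders", payloadBytes);
      }
  }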
