
How do I decode the schema id from an Avro event in Kafka wire format?

Hope you're all doing well.

I'm ingesting data from a Kafka topic that carries multiple event types, each with its own schema. The messages use the Schema Registry wire format: byte 0 is the magic byte, bytes 1 to 4 hold the schema-id, and the data itself starts at byte 5.

I want to decode the schema-id so I'll be able to get the schema from schema-registry. How can I do that with Python?

For example, if I have b'\x00\x00\x00\x04' as the schema-id binary, how can I decode this binary so I can get the actual value of the schema-id?

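Given the full five-byte header, you can unpack the magic byte and the schema-id (a 4-byte big-endian unsigned integer) in one call:

from struct import unpack

# header_bytes: the first five bytes of the message value (magic byte + schema id)
magic, schema_id = unpack('>bI', header_bytes)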

This is based on how the confluent-kafka-python library reads the schema_id: https://github.com/confluentinc/confluent-kafka-python/blob/e671bccb8a4f98302748ccf60d5d579f68c6613d/src/confluent_kafka/schema_registry/avro.py#L315
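
To tie this back to the example in the question, here is a minimal sketch that handles both cases: the four schema-id bytes on their own, and a full message value with the five-byte header. The helper name split_wire_format and the sample payload are made up for illustration; they are not part of any library.

```python
import struct


def split_wire_format(value: bytes):
    """Split a wire-format message value into (schema_id, payload).

    Hypothetical helper: assumes byte 0 is the magic byte (0) and
    bytes 1 to 4 are the schema id as a big-endian unsigned int.
    """
    if len(value) < 5:
        raise ValueError("message too short to contain the 5-byte header")
    magic, schema_id = struct.unpack(">bI", value[:5])
    if magic != 0:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, value[5:]


# The four schema-id bytes from the question, decoded on their own.
schema_id_only = int.from_bytes(b"\x00\x00\x00\x04", byteorder="big")

# A made-up message value: magic byte, schema id 4, placeholder payload.
sample = b"\x00" + b"\x00\x00\x00\x04" + b"avro-payload"
schema_id, payload = split_wire_format(sample)

print(schema_id_only, schema_id, payload)
```

Both paths give a schema-id of 4 for the bytes in the question. The remaining payload is what you would pass to an Avro decoder once you have fetched the schema for that id from schema-registry.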
