
Confluent Schema Registry Avro Schema

Hey, I would like to use the Confluent Schema Registry with the Avro serializers. The documentation basically says: do not use the same schema for multiple different topics.

Can anyone explain why? I read through the source code, and it basically stores the schema in a Kafka topic as follows: (topicname, magicbytes, version) -> key, (schema) -> value.

Therefore I don't see the problem with using the same schema multiple times, except for redundancy?

I think you are referring to this comment in the documentation:

We recommend users use the new producer in org.apache.kafka.clients.producer.KafkaProducer. If you are using a version of Kafka older than 0.8.2.0, you can plug KafkaAvroEncoder into the old producer in kafka.javaapi.producer. However, there will be some limitations. You can only use KafkaAvroEncoder for serializing the value of the message and only send value of type Avro record. The Avro schema for the value will be registered under the subject recordName-value, where recordName is the name of the Avro record. Because of this, the same Avro record type shouldn't be used in more than one topic.
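To make the naming rule in that quote concrete, here is a small sketch in plain Python (no Kafka client involved). The recordName-value rule comes from the quoted documentation. The topic-based rule, the helper function names, and the topic names "orders" and "payments" are my own assumptions for illustration, not something the quote states.

```python
def old_encoder_subject(record_name: str) -> str:
    """Subject per the quoted docs for KafkaAvroEncoder: <recordName>-value."""
    return f"{record_name}-value"


def topic_based_subject(topic: str, is_key: bool = False) -> str:
    """Assumed topic-based naming for comparison: <topic>-key or <topic>-value."""
    return f"{topic}-{'key' if is_key else 'value'}"


# Hypothetical topics that both carry an Avro record named "record".
topics = ["orders", "payments"]
record_name = "record"

# With record-based naming, both topics collapse onto one subject.
old_subjects = {topic: old_encoder_subject(record_name) for topic in topics}

# With topic-based naming, each topic gets its own subject.
new_subjects = {topic: topic_based_subject(topic) for topic in topics}

print(old_subjects)
print(new_subjects)
```

With record-based naming the two topics end up sharing a single subject, which is where the restriction in the quote comes from.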

First, the commenter above is correct -- this only refers to the old producer API pre-0.8.2. It's highly recommended that you use the new producer anyway as it is a much better implementation, doesn't depend on the whole core jar, and is the client which will be maintained going forward (there isn't a specific timeline yet, but the old producer will eventually be deprecated and then removed).

However, if you are using the old producer, this restriction only matters if the schemas for the two topics might evolve separately. Suppose you wrote two applications that write to different topics but use the same Avro record type; let's call it record. Both applications will register it (or look it up) under the subject record-value and get assigned version=1. This is all fine as long as the schema doesn't change. But let's say application A now needs to add a field. When it does so, the new schema is registered under subject record-value and gets assigned version=2. This is fine for application A, but application B has either not been upgraded to handle this schema or, worse, the schema isn't even valid for application B. Either way, you lose the protection the schema registry normally gives you: some other application could now publish data in that format into the topic used by application B (it looks OK because record-value has that schema registered). Application B could then see data it doesn't know how to handle, since it's not a schema it supports.
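The scenario above can be walked through with a toy in-memory registry. This is only a sketch: ToyRegistry and its methods are hypothetical and are not the Schema Registry API, schemas are reduced to tuples of field names, the "discount" field is made up, and the compatibility checks a real registry performs are left out.

```python
class ToyRegistry:
    """In-memory stand-in for a schema registry: subject -> ordered list of schemas."""

    def __init__(self):
        self._subjects = {}

    def register(self, subject, schema):
        """Return the 1-based version of schema under subject, adding it if new."""
        versions = self._subjects.setdefault(subject, [])
        if schema in versions:
            return versions.index(schema) + 1
        versions.append(schema)
        return len(versions)

    def is_registered(self, subject, schema):
        """True if schema already exists under subject."""
        return schema in self._subjects.get(subject, [])


registry = ToyRegistry()
SUBJECT = "record-value"  # shared by both applications under the old producer

schema_v1 = ("id",)             # original record layout
schema_v2 = ("id", "discount")  # application A adds a field

version_a = registry.register(SUBJECT, schema_v1)   # application A registers
version_b = registry.register(SUBJECT, schema_v1)   # application B looks up the same schema
version_a2 = registry.register(SUBJECT, schema_v2)  # application A evolves the schema

# Application B never agreed to schema_v2, yet the shared subject now lists it,
# so a producer writing schema_v2 into B's topic passes the registry lookup.
b_topic_accepts_v2 = registry.is_registered(SUBJECT, schema_v2)

print(version_a, version_b, version_a2, b_topic_accepts_v2)
```

Both applications get version 1 for the original schema, application A's change becomes version 2, and the last check shows the new schema is accepted for the shared subject even though application B was never updated.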

So the short version is that because with the old producer the subject has to be shared if you also use the same schema, you end up coupling the two applications and the schemas they must support. You can use the same schema across topics, but we suggest not doing so since it couples your applications (and their development, the teams developing them, etc).
