
Kafka Avro Schema evolution

I am trying to learn more about the Avro schemas we use for our Kafka topics, and I am relatively new to this.

I was wondering whether there is a way to evolve schemas in one particular situation. We are updating our schema with new fields that can't be null and can't have a default value, because these new fields are identifiers. Our current workaround is to create new topics, but is there a better way to evolve the existing schemas?

There are four possible compatibility settings for a topic:

  • Forward: a client that expects the old version of the schema can read data written with the new version
  • Backward: a client that expects the new version of the schema can read data written with the old version
  • Both: both of the above
  • None: neither of the above

Keep in mind that during a rollout some producers will write old data while others write new data, and consumers will expect either the new or the old schema.

How would clients behave in your case?

  • adding a field is always forward compatible (old clients just drop the new field)
  • it is backward compatible only if you specify a default value
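To make these two rules concrete, here is a minimal sketch in plain Python. It is a simplified model of how a reader schema is applied to a decoded record, not the Avro library itself, and the `User` schemas and the `user_id` field are made up for illustration.

```python
# Simplified model of Avro schema resolution for flat records.
# Schemas are plain dicts shaped like Avro record schemas (hypothetical example data).

def resolve(record, reader_schema):
    """Project a decoded record (a dict) onto the fields the reader expects."""
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            result[name] = record[name]
        elif "default" in field:
            # The default is applied on the reader side; it was never written.
            result[name] = field["default"]
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    # Fields present in the record but unknown to the reader are dropped.
    return result


V1 = {"type": "record", "name": "User",
      "fields": [{"name": "name", "type": "string"}]}
V2_NO_DEFAULT = {"type": "record", "name": "User",
                 "fields": [{"name": "name", "type": "string"},
                            {"name": "user_id", "type": "string"}]}
V2_WITH_DEFAULT = {"type": "record", "name": "User",
                   "fields": [{"name": "name", "type": "string"},
                              {"name": "user_id", "type": "string", "default": ""}]}

old_record = {"name": "Ada"}
new_record = {"name": "Ada", "user_id": "u-42"}

# Forward compatible: an old reader drops the new field.
print(resolve(new_record, V1))
# Backward compatible only with a default: a new reader fills the gap.
print(resolve(old_record, V2_WITH_DEFAULT))
# Without a default, a new reader cannot read old data.
try:
    resolve(old_record, V2_NO_DEFAULT)
except ValueError as err:
    print("cannot read old data:", err)
```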

Also, these compatibility rules only matter if you are planning to deserialize data into a specific schema (with the corresponding POCO, for example). If you just convert it to JSON and handle it with custom code, a new client can process both schemas.
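As a rough illustration of that generic approach, assuming records have already been decoded into plain dicts (the `name` and `user_id` field names are hypothetical), a single client could branch on whether the new field is present:

```python
def describe_user(record):
    """Handle a decoded record regardless of which schema version wrote it."""
    user_id = record.get("user_id")  # absent in records written with the old schema
    if user_id is None:
        return f"{record['name']} (legacy record, no identifier)"
    return f"{record['name']} [{user_id}]"


print(describe_user({"name": "Ada"}))
print(describe_user({"name": "Ada", "user_id": "u-42"}))
```

The cost of this approach is that the "what if the field is missing" decision moves out of the schema and into every consumer.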

So I see two possible ways to keep writing to the same topic:

  • You set a default value. You may be misunderstanding default values: it doesn't mean the default will be written into new records, but (quoting the Avro specification):

    A default value for this field, used when reading instances that lack this field (optional)

For example, if you previously had a "name" field and want to add "surname", you can give "surname" a default of "NC" (or an empty string), as you might have done in a database.

  • You set your compatibility to none (or forward) so that you can register the updated schema (by default, compatibility is backward). In this case, clients expecting the new schema won't be able to process old data. That can still fit your use case if you only process incoming data: change the compatibility, update all your producers (so that only new data arrives), then update the clients that expect the new schema. Remember to set compatibility back to backward, or to whatever compatibility you really want, afterwards.
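The compatibility switch in the second option could be sketched like this, assuming you use Confluent Schema Registry and its `PUT /config/{subject}` endpoint. The registry URL and the subject name `users-value` are placeholders, and the requests are only built here, not sent.

```python
import json
import urllib.request


def build_compatibility_request(registry_url, subject, level):
    """Build (but do not send) a request that sets a subject's compatibility level."""
    body = json.dumps({"compatibility": level}).encode("utf-8")
    return urllib.request.Request(
        url=f"{registry_url}/config/{subject}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )


# Placeholder values: adjust to your own registry and subject.
REGISTRY = "http://localhost:8081"
SUBJECT = "users-value"

relax = build_compatibility_request(REGISTRY, SUBJECT, "NONE")
restore = build_compatibility_request(REGISTRY, SUBJECT, "BACKWARD")

print(relax.get_method(), relax.full_url, relax.data.decode("utf-8"))
print(restore.get_method(), restore.full_url, restore.data.decode("utf-8"))

# To actually apply a change, send the request, e.g.:
#   urllib.request.urlopen(relax)
```

You would send `relax` before registering the new schema and `restore` once all producers and consumers have been updated.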

I would go with option 1.
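For the default-value option, the schema change from the name/surname example might look roughly like the sketch below. The record name `Person` and the helper `new_fields_missing_default` are made up for illustration; the helper only mimics the part of a backward-compatibility check that concerns added fields.

```python
import json

PERSON_V1 = {
    "type": "record",
    "name": "Person",
    "fields": [{"name": "name", "type": "string"}],
}

# Same record with "surname" added; the default is used only when reading old data.
PERSON_V2 = {
    "type": "record",
    "name": "Person",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "surname", "type": "string", "default": "NC"},
    ],
}


def new_fields_missing_default(old_schema, new_schema):
    """List fields added in new_schema that would break readers of old data."""
    old_names = {field["name"] for field in old_schema["fields"]}
    return [
        field["name"]
        for field in new_schema["fields"]
        if field["name"] not in old_names and "default" not in field
    ]


print(json.dumps(PERSON_V2, indent=2))
print(new_fields_missing_default(PERSON_V1, PERSON_V2))  # → []
```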
