Avro and Kafka by making use of SchemaBuilder

I went through the tutorial from Baeldung. They mention there are two ways to create a schema:

  • By writing the JSON representation and adding the Maven plugin to produce the class
  • By using the SchemaBuilder, which they also mention is a better choice.

Unfortunately, in the git example I only see the JSON way.

Let's say I have this Avro schema:

{
  "type":"record",
  "name":"TestFile",
  "namespace":"com.example.kafka.data.ingestion.model",
  "fields":[
    {
      "name":"date",
      "type":"long"
    },
    {
      "name":"counter",
      "type":"int"
    },
    {
      "name":"mc",
      "type":"string"
    }
  ]
}

By adding this plugin in my pom file:

<plugin>
   <groupId>org.apache.avro</groupId>
   <artifactId>avro-maven-plugin</artifactId>
   <version>1.8.0</version>
   <executions>
      <execution>
         <id>schemas</id>
         <phase>generate-sources</phase>
         <goals>
            <goal>schema</goal>
            <goal>protocol</goal>
            <goal>idl-protocol</goal>
         </goals>
         <configuration>
            <sourceDirectory>${project.basedir}/src/main/resources/</sourceDirectory>
            <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
         </configuration>
      </execution>
   </executions>
</plugin>

and building with generate-sources, a TestFile.java is created at the destination I specified. Then, to send to a Kafka topic, I can do the following:

TestFile test = TestFile.newBuilder()
        .setDate(102928374747L)
        .setCounter(2)
        .setMc("Some string")
        .build();
kafkaTemplate.send(topicName, test);
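
For the kafkaTemplate.send(topicName, test) call to work, a KafkaTemplate<String, TestFile> has to be wired up with an Avro-capable value serializer. A minimal sketch of that plumbing, assuming Spring Kafka plus Confluent's KafkaAvroSerializer and a local Schema Registry (none of which the question spells out):

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;

public class KafkaProducerConfigSketch {

    public KafkaTemplate<String, TestFile> kafkaTemplate() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // Confluent's serializer registers/looks up the schema in the Schema Registry
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                io.confluent.kafka.serializers.KafkaAvroSerializer.class);
        props.put("schema.registry.url", "http://localhost:8081");              // assumed registry URL
        return new KafkaTemplate<>(new DefaultKafkaProducerFactory<>(props));
    }
}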

The equivalent of creating the schema with SchemaBuilder would be:

Schema testFileSchema = SchemaBuilder.record("TestFile")
        .namespace("com.example.kafka.data.ingestion.model")
        .fields()
        .requiredLong("date")
        .requiredInt("counter")
        .requiredString("mc")
        .endRecord();

But how can I now generate the POJO and send my TestFile data to my Kafka topic?

You won't have access to a TestFile object since the Schema is made at runtime, not pre-compiled. If you want to keep that POJO, then you would need a constructor such as public TestFile(GenericRecord avroRecord).
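
A hypothetical sketch of what such a hand-written POJO with a converting constructor could look like (the class and field names just mirror the schema from the question; this is not generated code):

import org.apache.avro.generic.GenericRecord;

public class TestFile {
    private final long date;
    private final int counter;
    private final String mc;

    // Hypothetical converting constructor: pull each field out of the runtime record
    public TestFile(GenericRecord avroRecord) {
        this.date = (Long) avroRecord.get("date");
        this.counter = (Integer) avroRecord.get("counter");
        // Avro may hand back a Utf8 instance rather than a String, so convert explicitly
        this.mc = avroRecord.get("mc").toString();
    }
}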

You'll need to create a GenericRecord using that Schema object, same as if you were parsing it from a String or a file.

For example,

Schema schema = SchemaBuilder.record("TestFile")
            .namespace("com.example.kafka.data.ingestion.model")
            .fields()
            .requiredLong("date")
            .requiredInt("counter")
            .requiredString("mc")
            .endRecord();

GenericRecord entry1 = new GenericData.Record(schema);
entry1.put("date", 1L);
entry1.put("counter", 2);
entry1.put("mc", "3");

// producer.send(new ProducerRecord<>(topic, entry1));
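
For comparison, the "parsing it from a String" route mentioned above would produce the same Schema object from the JSON definition in the question (a minimal sketch):

import org.apache.avro.Schema;

// Same record definition, parsed from its JSON form at runtime
String json = "{\"type\":\"record\",\"name\":\"TestFile\","
        + "\"namespace\":\"com.example.kafka.data.ingestion.model\","
        + "\"fields\":[{\"name\":\"date\",\"type\":\"long\"},"
        + "{\"name\":\"counter\",\"type\":\"int\"},"
        + "{\"name\":\"mc\",\"type\":\"string\"}]}";
Schema parsedSchema = new Schema.Parser().parse(json);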

A full Kafka example is available from Confluent.

If you don't include a required field, it'll throw an error, and the values of the types are not checked (I could put "counter", "2", and it would send a string value; this seems like a bug to me). Basically, GenericRecord == HashMap<String, Object> with the added benefit of required/nullable fields.

And you will need to configure an Avro serializer, such as Confluent's, which requires running their Schema Registry, or a version like Cloudera shows.
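
A minimal sketch of a producer configured that way, assuming Confluent's serializer, a registry running locally, and a placeholder topic name (all assumptions, not requirements from the question):

import java.util.Properties;

import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");        // assumed broker
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
        "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("schema.registry.url", "http://localhost:8081");                   // assumed registry URL

try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
    producer.send(new ProducerRecord<>("test-topic", entry1));                // entry1 from the example above
}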

Otherwise, you need to convert the Avro object into a byte[] (as shown in your link) and just use the ByteArraySerializer.
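
A minimal sketch of that conversion using Avro's GenericDatumWriter and binary encoder (standard Avro API; the helper name is just illustrative):

import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

// Encode a GenericRecord as Avro binary so it can be sent with the ByteArraySerializer
static byte[] toBytes(GenericRecord record) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
    new GenericDatumWriter<GenericRecord>(record.getSchema()).write(record, encoder);
    encoder.flush();
    return out.toByteArray();
}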

As stated in the Baeldung tutorial:

Later we can apply the toString method to get the JSON structure of Schema.

So, for example, using this code inside a main class you can print the two schema definitions to the console output.

You can then save the resulting JSON representations to .avsc files and generate POJOs as before (see the sketch after the code below).

    Schema clientIdentifier = SchemaBuilder.record("ClientIdentifier")
            .namespace("com.baeldung.avro")
            .fields().requiredString("hostName").requiredString("ipAddress")
            .endRecord();
    System.out.println(clientIdentifier.toString());

    Schema avroHttpRequest = SchemaBuilder.record("AvroHttpRequest")
            .namespace("com.baeldung.avro")
            .fields().requiredLong("requestTime")
            .name("clientIdentifier")
            .type(clientIdentifier)
            .noDefault()
            .name("employeeNames")
            .type()
            .array()
            .items()
            .stringType()
            .arrayDefault(new ArrayList<>())
            .name("active")
            .type()
            .enumeration("Active")
            .symbols("YES","NO")
            .noDefault()
            .endRecord();
    System.out.println(avroHttpRequest.toString());
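
As mentioned above, instead of only printing the JSON you can write it straight to an .avsc file under the plugin's sourceDirectory; a minimal sketch (the target path is an assumption matching the pom configuration earlier):

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// toString(true) pretty-prints the schema JSON; write it where the avro-maven-plugin looks for .avsc files
Files.write(Paths.get("src/main/resources/ClientIdentifier.avsc"),
        clientIdentifier.toString(true).getBytes(StandardCharsets.UTF_8));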

There is a third way to generate Avro schemas, which is to use Avro IDL.
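
For reference, the record from the question expressed in Avro IDL might look roughly like this (a sketch; the protocol name is arbitrary, and the idl-protocol goal already configured in the pom above compiles .avdl files):

// TestFile.avdl -- sketch of the same record in Avro IDL
@namespace("com.example.kafka.data.ingestion.model")
protocol TestFileProtocol {
  record TestFile {
    long date;
    int counter;
    string mc;
  }
}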
