
Avro and Kafka by making use of SchemaBuilder

I went through the tutorial from Baeldung. They mention there are two ways to create a schema:

  • By writing the JSON representation and adding the Maven plugin to produce the class
  • By using the SchemaBuilder, which they also mention is a better choice.

Unfortunately, in the Git example I only see the JSON way.

Let's say I have this Avro schema:

{
  "type":"record",
  "name":"TestFile",
  "namespace":"com.example.kafka.data.ingestion.model",
  "fields":[
    {
      "name":"date",
      "type":"long"
    },
    {
      "name":"counter",
      "type":"int"
    },
    {
      "name":"mc",
      "type":"string"
    }
  ]
}

By adding this plugin to my pom file:

<plugin>
   <groupId>org.apache.avro</groupId>
   <artifactId>avro-maven-plugin</artifactId>
   <version>1.8.0</version>
   <executions>
      <execution>
         <id>schemas</id>
         <phase>generate-sources</phase>
         <goals>
            <goal>schema</goal>
            <goal>protocol</goal>
            <goal>idl-protocol</goal>
         </goals>
         <configuration>
            <sourceDirectory>${project.basedir}/src/main/resources/</sourceDirectory>
            <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
         </configuration>
      </execution>
   </executions>
</plugin>

and building with generate-sources, a TestFile.java is created in the destination I specified. Then, to send to a Kafka topic, I can do the following:

TestFile test = TestFile.newBuilder()
        .setDate(102928374747L)
        .setCounter(2)
        .setMc("Some string")
        .build();
kafkaTemplate.send(topicName, test);

The equivalent of creating the schema with SchemaBuilder would be:

Schema testFileSchema = SchemaBuilder.record("TestFile")
        .namespace("com.example.kafka.data.ingestion.model")
        .fields()
        .requiredLong("date")
        .requiredInt("counter")
        .requiredString("mc")
        .endRecord();

But how can I now generate the POJO and send my TestFile data to my kafka topic?

You won't have access to a TestFile object, since the Schema is created at runtime rather than pre-compiled. If you want to keep that POJO, you would need a constructor such as public TestFile(GenericRecord avroRecord) that copies the fields out of the record.
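
For illustration, here is a minimal sketch of such a converter, assuming the generated TestFile class from the Maven plugin above is still on the classpath; the helper name fromGenericRecord is hypothetical:

// Hypothetical helper: copy fields from a runtime GenericRecord into the
// compiled TestFile POJO. Field names must match the schema exactly.
public static TestFile fromGenericRecord(GenericRecord avroRecord) {
    return TestFile.newBuilder()
            .setDate((Long) avroRecord.get("date"))
            .setCounter((Integer) avroRecord.get("counter"))
            // Avro may return a Utf8 object rather than a java.lang.String
            .setMc(avroRecord.get("mc").toString())
            .build();
}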

You'll need to create a GenericRecord using that Schema object, same as if you were parsing it from a String or a file.

For example,

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

Schema schema = SchemaBuilder.record("TestFile")
        .namespace("com.example.kafka.data.ingestion.model")
        .fields()
        .requiredLong("date")
        .requiredInt("counter")
        .requiredString("mc")
        .endRecord();

GenericRecord entry1 = new GenericData.Record(schema);
entry1.put("date", 1L);
entry1.put("counter", 2);
entry1.put("mc", "3");

// producer.send(new ProducerRecord<>(topic, entry1));
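
Equivalently, if the schema lived in an .avsc file instead of being built in code, the same Schema object could be parsed from it; the file path here is just an assumption:

import java.io.File;
import org.apache.avro.Schema;

// Parse the schema from a file (throws IOException); the resulting
// Schema object is used exactly the same way as the one built above
Schema schema = new Schema.Parser().parse(
        new File("src/main/resources/testfile.avsc"));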

A full Kafka example is available from Confluent.

If you don't include a required field, it'll throw an error, and the values of the types are not checked (I could put "counter", "2", and it would send a string value; this seems like a bug to me). Basically, GenericRecord == HashMap<String, Object> with the added benefit of required/nullable fields.

And you will need to configure an Avro serializer, such as Confluent's, which requires running their Schema Registry, or a version like the one Cloudera shows.
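
As a rough sketch of that first option, the producer configuration could look something like this; Confluent's KafkaAvroSerializer on the classpath and a Schema Registry at localhost:8081 are both assumptions here:

import java.util.Properties;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// Confluent's serializer registers/looks up the schema in the Schema Registry
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("schema.registry.url", "http://localhost:8081");

Producer<String, GenericRecord> producer = new KafkaProducer<>(props);
producer.send(new ProducerRecord<>("test-topic", entry1)); // entry1 as above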

Otherwise, you need to convert the Avro object into a byte[] (as shown in your link) and just use the ByteArraySerializer.
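
A minimal sketch of that conversion using Avro's own encoder API, reusing the schema and entry1 from the example above:

import java.io.ByteArrayOutputStream;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.EncoderFactory;

// Serialize the GenericRecord to raw Avro bytes (throws IOException)
ByteArrayOutputStream out = new ByteArrayOutputStream();
BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
DatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
writer.write(entry1, encoder);
encoder.flush();
byte[] payload = out.toByteArray(); // send this with ByteArraySerializer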

As stated in the Baeldung tutorial:

Later we can apply the toString method to get the JSON structure of Schema.

So, for example, using this code inside a main class, you can print the two schema definitions to the console output.

You can then save the resulting JSON representations to .avsc files and generate POJOs as before.

    import java.util.ArrayList;

    import org.apache.avro.Schema;
    import org.apache.avro.SchemaBuilder;

    Schema clientIdentifier = SchemaBuilder.record("ClientIdentifier")
            .namespace("com.baeldung.avro")
            .fields().requiredString("hostName").requiredString("ipAddress")
            .endRecord();
    System.out.println(clientIdentifier.toString());

    Schema avroHttpRequest = SchemaBuilder.record("AvroHttpRequest")
            .namespace("com.baeldung.avro")
            .fields().requiredLong("requestTime")
            .name("clientIdentifier")
            .type(clientIdentifier)
            .noDefault()
            .name("employeeNames")
            .type()
            .array()
            .items()
            .stringType()
            .arrayDefault(new ArrayList<>())
            .name("active")
            .type()
            .enumeration("Active")
            .symbols("YES","NO")
            .noDefault()
            .endRecord();
    System.out.println(avroHttpRequest.toString());
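
For reference, the first println should produce JSON along these lines; toString() emits it on a single line, and the whitespace here is added for readability:

    {
      "type": "record",
      "name": "ClientIdentifier",
      "namespace": "com.baeldung.avro",
      "fields": [
        {"name": "hostName", "type": "string"},
        {"name": "ipAddress", "type": "string"}
      ]
    }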

There is a third way to generate Avro schemas, which is using Avro IDL.
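
For completeness, the TestFile record from the question would look like this in Avro IDL (a .avdl file, which the idl-protocol goal already present in the pom above can compile); the protocol name here is arbitrary:

// The same TestFile record expressed in Avro IDL
@namespace("com.example.kafka.data.ingestion.model")
protocol TestFileProtocol {
  record TestFile {
    long date;
    int counter;
    string mc;
  }
}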
