Can KSQL populate streams with default values provided in the schema?

I have a KSQL stream that I'd like to populate with the default values specified in the schema. Other than manually specifying them again with coalesce statements, is there a way to do this?

The previously answered question is similar to the issue I am facing, but it doesn't address the main point of using the default values already specified in the schema:

Create KSQL stream with default values for a column?

I did the following (based on the documentation provided by Confluent: https://docs.confluent.io/platform/current/schema-registry/serdes-develop/serdes-avro.html#schema-references-in-avro ):

  1. Created a topic t1-a with a schema
kafka-avro-console-producer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic t1-a \
--property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"name","type":"string","default":"no-name"}]}'
  2. Set the compatibility to FULL on the subject (using the Schema Registry REST API)
  3. Generated records to the topic using the CLI tool
{"name":"john"}
{"name":"doe"}
  4. Updated the schema using the Schema Registry
kafka-avro-console-producer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic t1-a \
--property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"name","type":"string","default":"no-name"},{"name":"age","type":"string","default":"ageless-wonder"}]}'
  5. Generated records to the topic using the CLI tool:
{"name":"jack", "age":"100"}
{"name":"jill", "age":"101"}
  6. Started the ksql CLI and created a stream
CREATE STREAM t1_a WITH (KAFKA_TOPIC='t1-a',VALUE_FORMAT='AVRO');
  7. Queried records:
SELECT * FROM t1_a;

Now I get the records, but the AGE values for john and doe are null (instead of the default value "ageless-wonder" specified in the schema):

NAME    AGE
john    null
doe     null
jack    100
jill    101

I understand that I can coalesce the values to default in the stream definition, but is there a way to have that field populate based on the schema already provided?

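KSQL deserializes each record with the schema ID that was originally sent with the data, rather than the latest registered schema, and builds each row from that. The john and doe records were written with the first schema, which has no age field, so AGE is null for them. You'd need to define your first schema with a default age, then send records without it.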
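
The difference between the two behaviors can be modeled in a few lines. This is a simplified sketch in plain Python, not the actual Avro or KSQL deserializer: the dict-based schemas and both function names are hypothetical and exist only for illustration. The first function mimics what the answer describes (the row is built from the writer's schema only), while the second mimics Avro-style schema resolution, where a field that is missing from the writer's schema takes the default from the reader's schema.

```python
# Simplified model only: hypothetical names, not the real Avro or KSQL code.
WRITER_V1 = {"fields": [{"name": "name", "default": "no-name"}]}
READER_V2 = {"fields": [
    {"name": "name", "default": "no-name"},
    {"name": "age", "default": "ageless-wonder"},
]}


def decode_with_writer_schema(record, writer_schema, columns):
    """Build a row from the writer's fields; columns the writer lacks become None."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    return {c: record[c] if c in writer_fields else None for c in columns}


def resolve_against_reader_schema(record, writer_schema, reader_schema):
    """Avro-style resolution: fields absent from the writer take the reader's default."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    row = {}
    for field in reader_schema["fields"]:
        if field["name"] in writer_fields:
            row[field["name"]] = record[field["name"]]
        else:
            row[field["name"]] = field["default"]
    return row


john = {"name": "john"}  # written with the first schema, before "age" existed
row_as_observed = decode_with_writer_schema(john, WRITER_V1, ["name", "age"])
row_with_defaults = resolve_against_reader_schema(john, WRITER_V1, READER_V2)

print(row_as_observed)    # {'name': 'john', 'age': None}
print(row_with_defaults)  # {'name': 'john', 'age': 'ageless-wonder'}
```

The first row matches the null seen in the query output above; the second is what the question was hoping for, which would require the reader to resolve old records against the newer schema.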
