Avro schema evolution (backward compatibility) returns null with PySpark Structured Streaming
Avro backward compatibility returns null records.
I send records encoded with schema_ver1.avsc and records encoded with schema_ver2.avsc to Kafka. Then I query the PySpark Structured Streaming memory sink named avro_sink3, decoding the value with the ver2 schema (avro_json_schema_ver2).
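(The producer side is not shown in the question; for context, a minimal sketch of what it could look like, assuming fastavro, kafka-python, a topic named avro_topic, and a local broker. The schema_ver header is an optional extra here; it only matters for the header-based approach discussed in the answer below.)

import io
import json

from fastavro import parse_schema, schemaless_writer
from kafka import KafkaProducer

with open("schema_ver1.avsc") as f:
    schema_v1 = parse_schema(json.load(f))
with open("schema_ver2.avsc") as f:
    schema_v2 = parse_schema(json.load(f))

def encode(record, schema):
    # schemaless_writer emits the raw Avro body with no container-file
    # header, which is the binary layout from_avro expects in the Kafka value
    buf = io.BytesIO()
    schemaless_writer(buf, schema, record)
    return buf.getvalue()

producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("avro_topic",
              value=encode({"id": "ver1_yuki_schema", "type": "login", "sendtime": 4}, schema_v1),
              headers=[("schema_ver", b"1")])
producer.send("avro_topic",
              value=encode({"id": "ver2_yuki_schema", "type": "login2", "sendtime": 4, "temp": "1000"}, schema_v2),
              headers=[("schema_ver", b"2")])
producer.flush()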
I expected records like the following.
Expected records
+-----------------------------------+
|from_avro(value) |
+-----------------------------------+
|{ver1_yuki_schema, login, 4, 1} |
|{ver2_yuki_schema, login2, 4, 1000}|
+-----------------------------------+
However, I got the output shown below.
Actual output
+-----------------------------------+
|from_avro(value) |
+-----------------------------------+
|{null, null, null, null}           | -> encoded with schema_ver1.avsc, decoded with schema_ver2.avsc (record No.1)
|{ver2_yuki_schema, login2, 4, 1000}| -> encoded with schema_ver2.avsc, decoded with schema_ver2.avsc (record No.2)
+-----------------------------------+
How should I fix this?
PySpark streaming sink
# write the raw Avro bytes to an in-memory table named avro_sink3
memory_stream_check29 = df \
    .select("value") \
    .writeStream \
    .format("memory") \
    .trigger(processingTime="5 seconds") \
    .option("checkpointLocation", "/tmp/kafka/avro_file11131/") \
    .queryName("avro_sink3") \
    .start()
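(df is not defined in the snippet; presumably it is read from the Kafka source along these lines. A sketch only; the topic name and broker address are assumptions.)

df = spark \
    .readStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", "localhost:9092") \
    .option("subscribe", "avro_topic") \
    .option("startingOffsets", "earliest") \
    .load()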
Query the memory sink
from pyspark.sql.avro.functions import from_avro

spark.sql("select * from avro_sink3") \
    .select(from_avro("value", avro_json_schema_ver2, {"mode": "PERMISSIVE"})) \
    .show(truncate=False)
schema_ver1.avsc
{
  "namespace": "root",
  "type": "record",
  "name": "Device",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "type", "type": "string" },
    { "name": "sendtime", "type": "int" }
  ]
}
schema_ver2.avsc
{
  "namespace": "root",
  "type": "record",
  "name": "Device",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "type", "type": "string" },
    { "name": "sendtime", "type": "int" },
    { "name": "temp", "type": "string", "default": "1" }
  ]
}
Environment: Spark 3.2
from_avro expects the writer's schema to be available for deserialization: Spark uses the schema you pass in as both the writer's and the reader's schema, so no Avro schema resolution takes place. You therefore cannot use avro_json_schema_ver2 to deserialize records that were written with avro_json_schema_ver1; in PERMISSIVE mode those records simply come back as null. An alternative is to deserialize with all of the schemas, by putting a schema identifier in the record headers and then mapping each record to its correct schema.
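A minimal sketch of that header-based approach, assuming the producer attaches a schema_ver header (b"1" or b"2") to every record. It uses fastavro's schemaless_reader in a Python UDF so that real writer/reader schema resolution runs per record and fills in ver2's temp default for ver1 records; the header name, topic, and broker address are assumptions.

import io
import json

from fastavro import parse_schema, schemaless_reader
from pyspark.sql.functions import expr, udf
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

with open("schema_ver1.avsc") as f:
    writer_v1 = parse_schema(json.load(f))
with open("schema_ver2.avsc") as f:
    writer_v2 = parse_schema(json.load(f))

# map the header value to the schema each record was written with
writers = {b"1": writer_v1, b"2": writer_v2}
reader = writer_v2  # decode every record into the ver2 shape

result_type = StructType([
    StructField("id", StringType()),
    StructField("type", StringType()),
    StructField("sendtime", IntegerType()),
    StructField("temp", StringType()),
])

@udf(result_type)
def decode(value, schema_ver):
    # Avro schema resolution supplies ver2's "temp" default for ver1 records;
    # assumes every record carries the schema_ver header
    writer = writers[bytes(schema_ver)]
    return schemaless_reader(io.BytesIO(bytes(value)), writer, reader)

decoded = spark.readStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", "localhost:9092") \
    .option("subscribe", "avro_topic") \
    .option("includeHeaders", "true") \
    .load() \
    .withColumn("schema_ver",
                expr("filter(headers, h -> h.key = 'schema_ver')[0].value")) \
    .select(decode("value", "schema_ver").alias("event"))

With this in place, the ver1 record should decode as {ver1_yuki_schema, login, 4, 1} instead of all nulls, since the reader schema's default for temp is applied during resolution.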