I am new to Avro schemas. I created the following schema based on a reference JSON, but I am not able to create a serializer for it.
{
  "name": "Name",
  "type": "record",
  "namespace": "NameSpace",
  "fields": [
    {
      "name": "discussions",
      "comment": "discussion ID.",
      "type": {
        "type": "array",
        "items": {
          "name": "discussionsRecord",
          "comment": "discussion Identifier.",
          "type": "record",
          "fields": [
            {
              "name": "discussionId",
              "type": "long"
            },
            {
              "name": "channelType",
              "comment": "channel Type Identification.",
              "type": "int"
            },
            {
              "name": "data",
              "comment": "The following block is to capture channel values.",
              "type": {
                "type": "array",
                "items": [
                  {
                    "name": "dataRecord",
                    "type": "record",
                    "fields": [
                      {
                        "name": "pulse",
                        "comment": "Pulse.",
                        "type": "long"
                      },
                      {
                        "name": "communicationName",
                        "comment": "communication Identification.",
                        "type": {
                          "name": "communicationNameEnumType",
                          "comment": "enum for communication Names.",
                          "type": "enum",
                          "symbols": ["cold", "rainIntensity", "heat"]
                        }
                      },
                      {
                        "name": "communicationValue",
                        "comment": "communication Values.",
                        "type": "double"
                      },
                      {
                        "name": "classValue",
                        "comment": "communication class.",
                        "type": {
                          "name": "classValueEnumType",
                          "comment": "enum for Class types.",
                          "type": "enum",
                          "symbols": ["Dark", "Logical"]
                        }
                      }
                    ]
                  }
                ]
              }
            }
          ]
        }
      }
    }
  ]
}
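A quick way to pinpoint where serializer creation fails is to run the schema text through Avro's own parser first. The sketch below is not from the original post; it assumes the plain Avro Java library (`org.apache.avro`) is on the classpath, and the file name `schema.avsc` is hypothetical:

```scala
import org.apache.avro.Schema

// Schema.Parser throws SchemaParseException, with a message naming the
// offending attribute, if the schema definition is invalid.
val schemaJson: String = scala.io.Source.fromFile("schema.avsc").mkString
val schema: Schema =
  try new Schema.Parser().parse(schemaJson)
  catch {
    case e: org.apache.avro.SchemaParseException =>
      println(s"Schema rejected: ${e.getMessage}")
      throw e
  }
println(schema.toString(true)) // pretty-print the canonical form on success
```

One thing worth checking: in Avro, writing `"items": [ ... ]` declares a *union* of the listed types rather than a plain array of records, and the standard attribute for documentation is `"doc"`, not `"comment"`. Either of these could be the point where a serializer balks.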
If you have an AVSC schema, you can derive a Spark SQL schema from it like this (Scala):
import org.apache.avro.Schema
import org.apache.spark.sql._
import org.apache.spark.sql.avro.SchemaConverters

val avroSchema: String = ...
// toSqlType returns a SchemaType; its .dataType is the Spark SQL StructType
val sparkSchema = SchemaConverters.toSqlType(new Schema.Parser().parse(avroSchema)).dataType
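To make the conversion concrete, here is a self-contained sketch with a tiny made-up schema (the record name `Demo` and field `id` are illustrative only; it assumes the `spark-avro` package is on the classpath):

```scala
import org.apache.avro.Schema
import org.apache.spark.sql.avro.SchemaConverters
import org.apache.spark.sql.types.StructType

// A minimal Avro record schema with a single long field.
val demo = """{"type":"record","name":"Demo","fields":[{"name":"id","type":"long"}]}"""

// SchemaConverters.toSqlType wraps the result in a SchemaType;
// .dataType yields the Spark SQL type, here a StructType with one LongType field.
val sqlType = SchemaConverters.toSqlType(new Schema.Parser().parse(demo)).dataType
println(sqlType.asInstanceOf[StructType].treeString)
```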
Otherwise, to_avro() serializes an existing DataFrame, together with its schema, to Avro output.
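For reference, a hedged sketch of that direction (the DataFrame `df` is hypothetical; `to_avro` lives in `org.apache.spark.sql.avro.functions` in Spark 3.x and requires the `spark-avro` package):

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, struct}
import org.apache.spark.sql.avro.functions.to_avro

// Pack all columns of df into a single struct, then serialize that
// struct to Avro binary in a column named "value".
def toAvroBytes(df: DataFrame): DataFrame =
  df.select(to_avro(struct(df.columns.map(col): _*)).as("value"))
```

The resulting binary column is the shape typically written to Kafka or to Avro-encoded files.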