Serializer for Avro Schema

I am new to Avro schemas. I created the following schema from a reference JSON document, but I am not able to create a serializer for it.

{
  "name": "Name",
  "type": "record",
  "namespace": "NameSpace",
  "fields": [
    {
      "name": "discussions",
      "comment": "discussion ID.",
      "type": {
        "type": "array",
        "items": {
          "name": "discussionsRecord",
          "comment": "discussion Identifier.",
          "type": "record",
          "fields": [
            {
              "name": "discussionId",
              "type": "long"
            },
            {
              "name": "channelType",
              "comment": "channel Type Identification.",
              "type": "int"
            },
            {
              "name": "data",
              "comment": "The following block is to capture channel values.",
              "type": {
                "type": "array",
                "items": 
                [
                   {
                      "name": "dataRecord",
                      "type": "record",
                      "fields": [
                        {
                          "name": "pulse",
                          "comment": "Pulse.",
                          "type": "long"
                        },
                        {
                          "name": "communicationName",
                          "comment": "communication Identification.",
                          "type": {
                            "name": "communicationNameEnumType",
                            "comment": "enum for communication Names.",
                            "type": "enum",
                            "symbols": ["cold", "rainIntensity", "heat"]
                          }
                        },
                        {
                          "name": "communicationValue",
                          "comment": "communication Values.",
                          "type": "double"
                        },
                        {
                          "name": "classValue",
                          "comment": "communication class.",
                          "type": {
                            "name": "classValueEnumType",
                            "comment": "enum for Class types.",
                            "type": "enum",
                            "symbols": ["Dark", "Logical"]
                          }
                        }
                      ]
                    }
                ]
              }
            }
          ]
        }
      }
    }
  ]
}

If you already have an AVSC schema, you can convert it to a Spark SQL schema like this (Scala):

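import org.apache.avro.Schema
import org.apache.spark.sql._
import org.apache.spark.sql.avro.SchemaConverters

// The AVSC schema as a JSON string
val avroSchema : String = ...
// toSqlType returns a SchemaType wrapper; its dataType field holds the Spark SQL type
val sparkSchema = SchemaConverters.toSqlType(new Schema.Parser().parse(avroSchema)).dataType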
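To make the conversion concrete, here is a minimal sketch, assuming Spark 3.x with the spark-avro module on the classpath. The cut-down schema (only the two scalar fields of discussionsRecord) and the object name AvroToSparkSchema are made up for illustration; the full schema from the question would be passed in the same way.

```scala
import org.apache.avro.Schema
import org.apache.spark.sql.avro.SchemaConverters
import org.apache.spark.sql.types.StructType

object AvroToSparkSchema {
  // Hypothetical cut-down schema: two scalar fields taken from discussionsRecord.
  val avroJson: String =
    """{
      |  "type": "record",
      |  "name": "discussionsRecord",
      |  "namespace": "NameSpace",
      |  "fields": [
      |    {"name": "discussionId", "type": "long"},
      |    {"name": "channelType", "type": "int"}
      |  ]
      |}""".stripMargin

  // toSqlType returns SchemaType(dataType, nullable); an Avro record maps to a StructType.
  val sparkSchema: StructType =
    SchemaConverters
      .toSqlType(new Schema.Parser().parse(avroJson))
      .dataType
      .asInstanceOf[StructType]

  def main(args: Array[String]): Unit = {
    // Print the converted schema as a tree for inspection.
    sparkSchema.printTreeString()
  }
}
```

The resulting StructType can then be handed to anything that expects a Spark schema, for example spark.read.schema(...).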

Otherwise, to_avro() serializes a column of an existing DataFrame (typically a struct of its fields) to Avro binary, using the schema the DataFrame already has.
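A rough sketch of the to_avro() route, assuming Spark 3.x (where the function lives in org.apache.spark.sql.avro.functions) and the spark-avro module on the classpath. The object name ToAvroSketch, the encode helper, and the sample rows are invented for illustration; only the two column names come from the schema in the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.avro.functions.to_avro
import org.apache.spark.sql.functions.{col, struct}

object ToAvroSketch {
  // Packs the two columns into a struct and encodes each row as Avro binary.
  // The Avro schema is derived from the DataFrame's own column types.
  def encode(df: DataFrame): DataFrame =
    df.select(to_avro(struct(col("discussionId"), col("channelType"))).as("value"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("to-avro-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows, just to have something to serialize.
    val df = Seq((1L, 2), (3L, 4)).toDF("discussionId", "channelType")

    // The output has a single binary column named "value".
    encode(df).printSchema()

    spark.stop()
  }
}
```

The binary column can be written to a sink such as Kafka, and from_avro() with the matching schema string reverses the encoding.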
