Validating an Avro schema that references another schema

I am using the Python 3 avro_validator library.

The schema I want to validate references other schemas in separate Avro files. The files are in the same folder. How do I compile all the referenced schemas using the library?

Python code as follows:

from avro_validator.schema import Schema

schema_file = 'basketEvent.avsc'

schema = Schema(schema_file)
parsed_schema = schema.parse()

data_to_validate = {"test": "test"}

parsed_schema.validate(data_to_validate)

The error I get back:

ValueError: Error parsing the field [contentBasket]: The type [ContentBasket] is not recognized by Avro

Example Avro files are below:


basketEvent.avsc 

{
  "type": "record",
  "name": "BasketEvent",
  "doc": "Indicates that a user action has taken place with a basket",
  "fields": [
    {
      "default": "basket",
      "doc": "Restricts this event to having type = basket",
      "name": "event",
      "type": {
        "name": "BasketEventType",
        "symbols": ["basket"],
        "type": "enum"
      }
    },
    {
      "default": "create",
      "doc": "What is being done with the basket. Note: create / delete / update will always follow a product event",
      "name": "action",
      "type": {
        "name": "BasketEventAction",
        "symbols": ["create","delete","update","view"],
        "type": "enum"
      }
    },
    {
      "default": "ContentBasket",
      "doc": "The set of values that are specific to a Basket event",
      "name": "contentBasket",
      "type": "ContentBasket"
    },
    {
      "default": "ProductDetail",
      "doc": "The set of values that are specific to a Product event",
      "name": "productDetail",
      "type": "ProductDetail"
    },
    {
      "default": "Timestamp",
      "doc": "The time stamp for the event being sent",
      "name": "timestamp",
      "type": "Timestamp"
    }
  ]
}

contentBasket.avsc

{
  "name": "ContentBasket",
  "type": "record",
  "doc": "The set of values that are specific to a Basket event",
  "fields": [
    {
      "default": [],
      "doc": "A range of details about product / basket availability",
      "name": "availability",
      "type": {
        "type": "array",
        "items": "Availability"
      }
    },
    {
      "default": [],
      "doc": "A range of care plans applicable to the basket",
      "name": "carePlan",
      "type": {
        "type": "array",
        "items": "CarePlan"
      }
    },
    {
      "default": "Category",
      "name": "category",
      "type": "Category"
    },
    {
      "default": "",
      "doc": "Unique identifier for this basket",
      "name": "id",
      "type": "string"
    },
    {
      "default": "Price",
      "doc": "Overall pricing info about the basket as a whole - individual product pricings will be dealt with at a product level",
      "name": "price",
      "type": "Price"
    }
  ]
}

availability.avsc

{
  "name": "Availability",
  "type": "record",
  "doc": "A range of values relating to the availability of a product",
  "fields": [
    {
      "default": [],
      "doc": "A list of offers associated with the overall basket - product level offers will be dealt with on an individual product basis",
      "name": "shipping",
      "type": {
        "type": "array",
        "items": "Shipping"
      }
    },
    {
      "default": "",
      "doc": "The status of the product",
      "name": "stockStatus",
      "type": {
        "name": "StockStatus",
        "symbols": ["in stock","out of stock",""],
        "type": "enum"
      }
    },
    {
      "default": "",
      "doc": "The ID for the store when the stock can be collected, if relevant",
      "name": "storeId",
      "type": "string"
    },
    {
      "default": "",
      "doc": "The status of the product",
      "name": "type",
      "type": {
        "name": "AvailabilityType",
        "symbols": ["collection","shipping",""],
        "type": "enum"
      }
    }
  ]
}

maxDate.avsc

{
  "type": "record",
  "name": "MaxDate",
  "doc": "Indicates the timestamp for latest day a delivery should be made",
  "fields": [
    {
      "default": "Timestamp",
      "doc": "The time stamp for the delivery",
      "name": "timestamp",
      "type": "Timestamp"
    }
  ]
}

minDate.avsc

{
  "type": "record",
  "name": "MinDate",
  "doc": "Indicates the timestamp for earliest day a delivery should be made",
  "fields": [
    {
      "default": "Timestamp",
      "doc": "The time stamp for the delivery",
      "name": "timestamp",
      "type": "Timestamp"
    }
  ]
}

shipping.avsc

{
  "name": "Shipping",
  "type": "record",
  "doc": "A range of values relating to shipping a product for delivery",
  "fields": [
    {
      "default": "MaxDate",
      "name": "maxDate",
      "type": "MaxDate"
    },
    {
      "default": "MinDate",
      "name": "minDate",
      "type": "MinDate"
    },
    {
      "default": 0,
      "doc": "Revenue generated from shipping - note, once a specific shipping object is selected, the more detailed revenue data sits within the one of object in pricing - this is more just to define if shipping is free or not",
      "name": "revenue",
      "type": "int"
    },
    {
      "default": "",
      "doc": "The shipping supplier",
      "name": "supplier",
      "type": "string"
    }
  ]
}

timestamp.avsc

{
  "name": "Timestamp",
  "type": "record",
  "doc": "Timestamp for the action taking place",
  "fields": [
    {
      "default": 0,
      "name": "timestampMs",
      "type": "long"
    },
    {
      "default": "",
      "doc": "Timestamp converted to a string in ISO format",
      "name": "isoTimestamp",
      "type": "string"
    }
  ]
}

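I'm not sure whether avro_validator supports references to schemas in other files, but fastavro does.

fastavro's load_schema resolves each referenced named type by looking in the same directory for a file named after that type, so the file names must match the type names exactly, including case. If you put the first schema in a file called BasketEvent.avsc and the second schema in a file called ContentBasket.avsc (and name the remaining files the same way), then you can do the following: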

from fastavro.schema import load_schema
from fastavro import validate

schema = load_schema("BasketEvent.avsc")
validate({"test": "test"}, schema)

Note that when I tried this I got the error fastavro._schema_common.UnknownType: Availability, because it seems the schemas reference other schemas that you haven't posted here.
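
To see which referenced schemas are still missing before loading anything, the references can be walked by hand. The following is a minimal sketch using only the standard library. The helper names referenced_names and missing_schema_files are hypothetical (they are not part of fastavro or avro_validator), and the sketch assumes that the schemas use no namespaces and that each named type lives in a file called <TypeName>.avsc next to the root schema.

```python
import json
from pathlib import Path

# Avro's primitive type names; any other string in a type position
# is treated as a reference to a named type.
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def referenced_names(schema, defined=None, used=None):
    """Walk one schema and collect the names it defines and the names it refers to."""
    defined = set() if defined is None else defined
    used = set() if used is None else used
    if isinstance(schema, str):
        if schema not in PRIMITIVES:
            used.add(schema)
    elif isinstance(schema, list):  # a union: check every branch
        for branch in schema:
            referenced_names(branch, defined, used)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind in ("record", "error"):
            defined.add(schema["name"])
            for field in schema.get("fields", []):
                referenced_names(field["type"], defined, used)
        elif kind in ("enum", "fixed"):
            defined.add(schema["name"])
        elif kind == "array":
            referenced_names(schema["items"], defined, used)
        elif kind == "map":
            referenced_names(schema["values"], defined, used)
        else:
            # e.g. {"type": "string"} or {"type": "SomeRecord"}
            referenced_names(kind, defined, used)
    return defined, used


def missing_schema_files(root_file):
    """Return the referenced type names that have no <Name>.avsc file beside root_file."""
    root = Path(root_file)
    folder = root.parent
    # Compare against the real directory listing so the check stays
    # case-sensitive even on case-insensitive file systems.
    available = {p.name for p in folder.iterdir()}
    pending, seen, missing = [root], set(), set()
    while pending:
        path = pending.pop()
        if path in seen:
            continue
        seen.add(path)
        defined, used = referenced_names(json.loads(path.read_text()))
        for name in used - defined:
            file_name = f"{name}.avsc"
            if file_name in available:
                pending.append(folder / file_name)
            else:
                missing.add(name)
    return sorted(missing)
```

Calling missing_schema_files("BasketEvent.avsc") in the schema folder returns a sorted list of the type names for which a file still has to be added or renamed. An empty list means every reference was matched to a file under the naming assumption above.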
