Avro schema question: TypeError: unhashable type: 'dict'

I need to write an Avro schema for the following data. The exposure field is an array of arrays, each containing 3 numbers.

{
"Response": {
    "status": "",
    "responseDetail": {
        "request_id": "Z618978.R",
        "exposure": [
            [
                372,
                20000000.0,
                31567227140.238808
            ]
            [
                373,
                480000000.0,
                96567227140.238808
            ]
            [
                374,
                23300000.0,
                251567627149.238808
            ]
        ],
        "product": "ABC",
    }
}
}

So I came up with a schema like the following:

{
"name": "Response",
"type":{
    "name": "algoResponseType",
    "type": "record",
    "fields":
    [
            {"name": "status", "type": ["null","string"]},
            {
            "name": "responseDetail",
            "type": {
                    "name": "responseDetailType",
                    "type": "record",
                    "fields":
                    [
                            {"name": "request_id", "type": "string"},
                            {
                            "name": "exposure",
                            "type": {
                                    "type": "array",
                                    "items":
                                    {
                                    "name": "single_exposure",
                                    "type": {
                                            "type": "array",
                                            "items": "string"
                                    }
                                    }
                            }
                            },
                            {"name": "product", "type": ["null","string"]}
                    ]
            }
            }
    ]
   }
}

When I tried to register the schema, I got the following error: TypeError: unhashable type: 'dict'. As far as I understand, this means a dict was used somewhere a hashable value (such as a dictionary key) is required.

Traceback (most recent call last):
  File "sa_publisher_main4test.py", line 28, in <module>
    schema_registry_client)
  File "/usr/local/lib64/python3.6/site-packages/confluent_kafka/schema_registry/avro.py", line 175, in __init__
    parsed_schema = parse_schema(schema_dict)
  File "fastavro/_schema.pyx", line 71, in fastavro._schema.parse_schema
  File "fastavro/_schema.pyx", line 204, in fastavro._schema._parse_schema
TypeError: unhashable type: 'dict'

Can anyone help point out what is causing the error?

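The error occurs because your schema is not valid Avro. According to the traceback, it is raised by fastavro's parse_schema while the serializer is being constructed. The top-level element has to be a record with a "Response" field.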
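As a rough illustration of where a message like this can come from: Python raises it whenever a dict has to be hashed, for example when it is looked up in a set or used as a dictionary key. The sketch below assumes a parser that checks the value of "type" against a set of known type names. The names KNOWN_TYPES and is_known_type are made up for the example and are not fastavro's actual internals.

```python
# Hypothetical stand-in for a parser's table of type names (not fastavro's real code).
KNOWN_TYPES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def is_known_type(schema_type):
    """Return True if schema_type is one of the known type names."""
    # A set membership test hashes the value, so an unhashable value raises TypeError.
    return schema_type in KNOWN_TYPES


# A plain type name is hashable, so the lookup works.
print(is_known_type("string"))

# A dict in the same position cannot be hashed.
try:
    is_known_type({"type": "array", "items": "string"})
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

So a dict sitting where a type name is expected is enough to trigger the error, without any list being involved.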

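The schema below should work. I changed the array item type to float because your message contains numbers, not strings. Note that Avro's float is a 32-bit type; if values such as 31567227140.238808 must keep their full precision, use double instead.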

{
    "type": "record",
    "name": "yourMessage",
    "fields": [
        {
            "name": "Response",
            "type": {
                "name": "AlgoResponseType",
                "type": "record",
                "fields": [
                    {
                        "name": "status",
                        "type": [
                            "null",
                            "string"
                        ]
                    },
                    {
                        "name": "responseDetail",
                        "type": {
                            "name": "ResponseDetailType",
                            "type": "record",
                            "fields": [
                                {
                                    "name": "request_id",
                                    "type": "string"
                                },
                                {
                                    "name": "exposure",
                                    "type": {
                                        "type": "array",
                                        "items": {
                                            "type": "array",
                                            "items": "float"
                                        }
                                    }
                                },
                                {
                                    "name": "product",
                                    "type": [
                                        "null",
                                        "string"
                                    ]
                                }
                            ]
                        }
                    }
                ]
            }
        }
    ]
}
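One thing that may be worth checking before settling on the item type: in Avro, float is a 32-bit IEEE 754 value and double is a 64-bit one. The sketch below uses only the standard struct module to show what a round trip through each width does to one sample row. The helper names are made up for this illustration.

```python
import struct


def roundtrip_float32(value):
    """Pack value as a 32-bit IEEE 754 float and read it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def roundtrip_float64(value):
    """Pack value as a 64-bit IEEE 754 float and read it back."""
    return struct.unpack("<d", struct.pack("<d", value))[0]


# First row of the exposure data from the question.
sample_row = [372, 20000000.0, 31567227140.238808]

for value in sample_row:
    # Print the original value next to its 32-bit and 64-bit round trips.
    print(value, roundtrip_float32(value), roundtrip_float64(value))
```

If the last column has to keep its fractional part, "double" is probably the safer item type.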

Your message is also not valid JSON: the inner arrays must be separated by commas.

{
    "Response": {
        "status": "",
        "responseDetail": {
            "request_id": "Z618978.R",
            "exposure": [
                [
                    372,
                    20000000.0,
                    31567227140.238808
                ],
                [
                    373,
                    480000000.0,
                    96567227140.238808
                ],
                [
                    374,
                    23300000.0,
                    251567627149.238808
                ]
            ],
            "product": "ABC"
        }
    }
}
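A problem like the missing commas can be caught before any Avro tooling is involved by running the text through the standard json module. This is a minimal sketch with a shortened version of the message; try_parse is a hypothetical helper name.

```python
import json

# Two inner arrays with no comma between them, as in the original message.
broken_text = '{"exposure": [[372, 20000000.0] [373, 480000000.0]]}'
# The same data with the separating comma added.
fixed_text = '{"exposure": [[372, 20000000.0], [373, 480000000.0]]}'


def try_parse(text):
    """Return (parsed_value, None) on success or (None, error_text) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        # lineno, colno and msg point at the place where parsing stopped.
        return None, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"


print(try_parse(broken_text))
print(try_parse(fixed_text))
```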

Since you are using fastavro, I recommend running the following code to check that your message conforms to the schema.

import json

from fastavro.validation import validate

# Load the Avro schema (a JSON document) from disk.
with open('schema.avsc', 'r') as schema_file:
    schema = json.load(schema_file)

message = {
    "Response": {
        "status": "",
        "responseDetail": {
            "request_id": "Z618978.R",
            "exposure": [
                [
                    372,
                    20000000.0,
                    31567227140.238808
                ],
                [
                    373,
                    480000000.0,
                    96567227140.238808
                ],
                [
                    374,
                    23300000.0,
                    251567627149.238808
                ]
            ],
            "product": "ABC",
        }
    }
}

try:
    # validate() raises an exception describing the mismatch if the message does not fit.
    validate(message, schema)
    print('Message matches the schema')
except Exception as ex:
    print(ex)

There are a few issues.

First, at the very top level of your schema, you have the following:

{
  "name": "Response",
  "type": {...}
}

But this isn't right. The top level should be a record type with a field called Response. So it should look like this:

{
  "name": "Response",
  "type": "record",
  "fields": [
    {
      "name": "Response",
      "type": {...}
    }
  ]
}
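The same wrapping can be done programmatically when the inner type already exists as a dict. This is only a sketch: wrap_in_record is a hypothetical helper, and the inner type is cut down to a single field to keep the example short.

```python
import json


def wrap_in_record(record_name, field_name, field_type):
    """Build a top-level record schema with a single field of the given type."""
    return {
        "name": record_name,
        "type": "record",
        "fields": [
            {"name": field_name, "type": field_type},
        ],
    }


# Shortened version of the inner record type from the question.
inner_type = {
    "name": "algoResponseType",
    "type": "record",
    "fields": [{"name": "status", "type": ["null", "string"]}],
}

top_level = wrap_in_record("Response", "Response", inner_type)
print(json.dumps(top_level, indent=2))
```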

The second problem is that for the array of arrays, you currently have the following:

{
   "name":"exposure",
   "type":{
      "type":"array",
      "items":{
        "name":"single_exposure",
        "type":{
          "type":"array",
          "items":"string"
        }
     }
   }
}

But instead it should look like this:

{
   "name":"exposure",
   "type":{
      "type":"array",
      "items":{
        "type":"array",
        "items":"string"
     }
   }
}
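To spot this kind of wrapper elsewhere in a schema, a small recursive walk can help. The sketch below is a simplified check and not a full Avro validator. It assumes that a schema object (the top level, array items, map values) should carry a type name string in "type", and that only record fields hold a nested schema there. The function name find_misplaced_wrappers is made up for the example.

```python
def find_misplaced_wrappers(schema, path="$"):
    """Return paths of schema objects whose "type" is itself an object or a list."""
    problems = []
    if isinstance(schema, list):
        # A list is a union: check every branch.
        for index, branch in enumerate(schema):
            problems += find_misplaced_wrappers(branch, f"{path}[{index}]")
    elif isinstance(schema, dict):
        schema_type = schema.get("type")
        if isinstance(schema_type, (dict, list)):
            # Field-style wrapper found where a schema was expected.
            problems.append(path)
            problems += find_misplaced_wrappers(schema_type, f"{path}.type")
        elif schema_type == "record":
            # Fields are the one place where "type" holds a nested schema.
            for field in schema.get("fields", []):
                problems += find_misplaced_wrappers(field["type"], f"{path}.{field['name']}")
        elif schema_type == "array":
            problems += find_misplaced_wrappers(schema["items"], f"{path}.items")
        elif schema_type == "map":
            problems += find_misplaced_wrappers(schema["values"], f"{path}.values")
    return problems


# The type of the exposure field before and after the fix.
before = {
    "type": "array",
    "items": {"name": "single_exposure", "type": {"type": "array", "items": "string"}},
}
after = {"type": "array", "items": {"type": "array", "items": "string"}}

print(find_misplaced_wrappers(before))
print(find_misplaced_wrappers(after))
```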

After fixing those, the schema can be parsed, but your data contains an array of arrays of numbers while your schema declares an array of arrays of strings. Therefore either the item type in the schema needs to be changed to a numeric type such as float, or the data needs to be converted to strings.
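If you go the route of changing the data rather than the schema, the conversion itself is short. A minimal sketch, with hypothetical helper names and only two of the rows:

```python
def exposure_as_strings(rows):
    """Convert every number in an array of arrays to its string form."""
    return [[str(value) for value in row] for row in rows]


def exposure_as_floats(rows):
    """Convert every entry (number or numeric string) to a Python float."""
    return [[float(value) for value in row] for row in rows]


rows = [
    [372, 20000000.0, 31567227140.238808],
    [373, 480000000.0, 96567227140.238808],
]

as_strings = exposure_as_strings(rows)
print(as_strings)

# Converting the strings back gives the same floats as converting the numbers directly.
print(exposure_as_floats(as_strings) == exposure_as_floats(rows))
```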

For reference, here's an example script that works after fixing those issues:

import fastavro

s = {
   "name":"Response",
   "type":"record",
   "fields":[
      {
         "name":"Response",
         "type": {
            "name":"algoResponseType",
            "type":"record",
            "fields":[
               {
                  "name":"status",
                  "type":[
                     "null",
                     "string"
                  ]
               },
               {
                  "name":"responseDetail",
                  "type":{
                     "name":"responseDetailType",
                     "type":"record",
                     "fields":[
                        {
                           "name":"request_id",
                           "type":"string"
                        },
                        {
                           "name":"exposure",
                           "type":{
                              "type":"array",
                              "items":{
                                "type":"array",
                                "items":"string"
                             }
                           }
                        },
                        {
                           "name":"product",
                           "type":[
                              "null",
                              "string"
                           ]
                        }
                     ]
                  }
               }
            ]
         }
      }
   ]
}

data = {
   "Response":{
      "status":"",
      "responseDetail":{
         "request_id":"Z618978.R",
         "exposure":[
            [
               "372",
               "20000000.0",
               "31567227140.238808"
            ],
            [
               "373",
               "480000000.0",
               "96567227140.238808"
            ],
            [
               "374",
               "23300000.0",
               "251567627149.238808"
            ]
         ],
         "product":"ABC"
      }
   }
}

parsed_schema = fastavro.parse_schema(s)
fastavro.validate(data, parsed_schema)
