
Avro schema question: TypeError: unhashable type: 'dict'

I need to write an Avro schema for the following data. The exposure field is an array of arrays, each holding 3 numbers.

{
"Response": {
    "status": "",
    "responseDetail": {
        "request_id": "Z618978.R",
        "exposure": [
            [
                372,
                20000000.0,
                31567227140.238808
            ]
            [
                373,
                480000000.0,
                96567227140.238808
            ]
            [
                374,
                23300000.0,
                251567627149.238808
            ]
        ],
        "product": "ABC",
    }
}
}

So I came up with a schema like the following:

{
"name": "Response",
"type":{
    "name": "algoResponseType",
    "type": "record",
    "fields":
    [
            {"name": "status", "type": ["null","string"]},
            {
            "name": "responseDetail",
            "type": {
                    "name": "responseDetailType",
                    "type": "record",
                    "fields":
                    [
                            {"name": "request_id", "type": "string"},
                            {
                            "name": "exposure",
                            "type": {
                                    "type": "array",
                                    "items":
                                    {
                                    "name": "single_exposure",
                                    "type": {
                                            "type": "array",
                                            "items": "string"
                                    }
                                    }
                            }
                            },
                            {"name": "product", "type": ["null","string"]}
                    ]
            }
            }
    ]
   }
}

When I tried to register the schema, I got the following error: TypeError: unhashable type: 'dict', which I take to mean that a dict was used as a dictionary key.

Traceback (most recent call last):
  File "sa_publisher_main4test.py", line 28, in <module>
    schema_registry_client)
  File "/usr/local/lib64/python3.6/site-packages/confluent_kafka/schema_registry/avro.py", line 175, in __init__
    parsed_schema = parse_schema(schema_dict)
  File "fastavro/_schema.pyx", line 71, in fastavro._schema.parse_schema
  File "fastavro/_schema.pyx", line 204, in fastavro._schema._parse_schema
TypeError: unhashable type: 'dict'

Can anyone help point out what is causing the error?

You get this error because your schema is not accepted as a valid Avro schema. Your top-level element has to be a record with a "Response" field.

This schema should work. I also changed the array item type, because your message contains floats, not strings.

{
    "type": "record",
    "name": "yourMessage",
    "fields": [
        {
            "name": "Response",
            "type": {
                "name": "AlgoResponseType",
                "type": "record",
                "fields": [
                    {
                        "name": "status",
                        "type": [
                            "null",
                            "string"
                        ]
                    },
                    {
                        "name": "responseDetail",
                        "type": {
                            "name": "ResponseDetailType",
                            "type": "record",
                            "fields": [
                                {
                                    "name": "request_id",
                                    "type": "string"
                                },
                                {
                                    "name": "exposure",
                                    "type": {
                                        "type": "array",
                                        "items": {
                                            "type": "array",
                                            "items": "float"
                                        }
                                    }
                                },
                                {
                                    "name": "product",
                                    "type": [
                                        "null",
                                        "string"
                                    ]
                                }
                            ]
                        }
                    }
                ]
            }
        }
    ]
}
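One thing worth checking before settling on "float": Avro's float is a 32-bit IEEE 754 value, while double is 64-bit. The sketch below uses only Python's struct module (not fastavro) to show what 32-bit storage would do to one of the values from the message. Whether that loss matters depends on your use case; if it does, "double" may be the safer item type.

```python
import struct

# One of the exposure values from the message in the question.
value = 31567227140.238808

# Round-trip through a 32-bit float (the width of Avro "float").
as_float32 = struct.unpack("<f", struct.pack("<f", value))[0]

# Round-trip through a 64-bit float (the width of Avro "double").
as_float64 = struct.unpack("<d", struct.pack("<d", value))[0]

print("original:", value)
print("32-bit  :", as_float32)  # differs from the original
print("64-bit  :", as_float64)  # identical to the original
```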

Your message is not correct either, as the array elements must be separated by commas.

{
    "Response": {
        "status": "",
        "responseDetail": {
            "request_id": "Z618978.R",
            "exposure": [
                [
                    372,
                    20000000.0,
                    31567227140.238808
                ],
                [
                    373,
                    480000000.0,
                    96567227140.238808
                ],
                [
                    374,
                    23300000.0,
                    251567627149.238808
                ]
            ],
            "product": "ABC"
        }
    }
}
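A quick way to catch this kind of syntax problem before Avro is involved at all is to run the payload through the standard json module. This is a minimal sketch on a shortened version of the message; is_valid_json is a hypothetical helper name, not part of any library.

```python
import json

# Shortened payloads: the first is missing the comma between inner arrays.
broken = '{"exposure": [[372, 20000000.0] [373, 480000000.0]]}'
fixed = '{"exposure": [[372, 20000000.0], [373, 480000000.0]]}'


def is_valid_json(text):
    """Return True if text parses as JSON, otherwise report where it fails."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError as err:
        print(f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return False


print(is_valid_json(broken))
print(is_valid_json(fixed))
```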

As you are using fastavro, I recommend running this code to check that your message conforms to the schema.

from fastavro.validation import validate
import json

with open('schema.avsc', 'r') as schema_file:
    schema = json.loads(schema_file.read())

message = {
    "Response": {
        "status": "",
        "responseDetail": {
            "request_id": "Z618978.R",
            "exposure": [
                [
                    372,
                    20000000.0,
                    31567227140.238808
                ],
                [
                    373,
                    480000000.0,
                    96567227140.238808
                ],
                [
                    374,
                    23300000.0,
                    251567627149.238808
                ]
            ],
            "product": "ABC",
        }
    }
}

try:
    validate(message, schema)
    print('Message is matching schema')
except Exception as ex:
    print(ex)

There are a few issues.

First, at the very top level of your schema, you have the following:

{
  "name": "Response",
  "type": {...}
}

But this isn't right. The top level should be a record type with a field called Response. So it should look like this:

{
  "name": "Response",
  "type": "record",
  "fields": [
    {
      "name": "Response",
      "type": {...}
    }
  ]
}
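As for why the original top level produces this particular message, one plausible reading of the traceback (not checked against the fastavro source) is that the parser looks up the value of "type" in a table of known type names. A dict is not hashable, so such a lookup fails as soon as "type" holds a nested schema instead of a string. The sketch below reproduces that mechanism in plain Python; KNOWN_TYPES and lookup are hypothetical stand-ins, not fastavro internals.

```python
# Hypothetical stand-in for a parser's table of known type names.
KNOWN_TYPES = {"null", "string", "record", "array"}


def lookup(type_value):
    """Return True if type_value is one of the known type names."""
    return type_value in KNOWN_TYPES


# A string is hashable, so the membership test works.
print(lookup("record"))

# A nested schema (a dict) is not hashable, so the same test raises.
try:
    lookup({"type": "record", "name": "algoResponseType", "fields": []})
except TypeError as err:
    message = str(err)
    print("TypeError:", message)
```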

The second problem is that for the array of arrays, you currently have the following:

{
   "name":"exposure",
   "type":{
      "type":"array",
      "items":{
        "name":"single_exposure",
        "type":{
          "type":"array",
          "items":"string"
        }
     }
   }
}

But instead it should look like this:

{
   "name":"exposure",
   "type":{
      "type":"array",
      "items":{
        "type":"array",
        "items":"string"
     }
   }
}

After fixing those, the schema can be parsed, but your data contains an array of arrays of floats while your schema says it should be an array of arrays of strings. Therefore either the schema needs to be changed to float, or the data needs to be strings.
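If you go the second route and keep "string" as the item type, the numbers have to be converted before serialization and converted back after reading. A minimal sketch in plain Python, using a shortened copy of the exposure data (the variable names are only illustrative):

```python
# Shortened copy of the exposure data from the question.
exposure = [[372, 20000000.0], [373, 480000000.0]]

# Producer side: turn every number into a string to match "items": "string".
as_strings = [[str(v) for v in row] for row in exposure]

# Consumer side: turn the strings back into numbers (all become floats).
as_floats = [[float(v) for v in row] for row in as_strings]

print(as_strings)
print(as_floats)
```

Note that the integer ids come back as floats after the round trip, which is one reason changing the schema to a numeric type may be the simpler option.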

For reference, here's an example script that works after fixing those issues:

import fastavro

s = {
   "name":"Response",
   "type":"record",
   "fields":[
      {
         "name":"Response",
         "type": {
            "name":"algoResponseType",
            "type":"record",
            "fields":[
               {
                  "name":"status",
                  "type":[
                     "null",
                     "string"
                  ]
               },
               {
                  "name":"responseDetail",
                  "type":{
                     "name":"responseDetailType",
                     "type":"record",
                     "fields":[
                        {
                           "name":"request_id",
                           "type":"string"
                        },
                        {
                           "name":"exposure",
                           "type":{
                              "type":"array",
                              "items":{
                                "type":"array",
                                "items":"string"
                             }
                           }
                        },
                        {
                           "name":"product",
                           "type":[
                              "null",
                              "string"
                           ]
                        }
                     ]
                  }
               }
            ]
         }
      }
   ]
}

data = {
   "Response":{
      "status":"",
      "responseDetail":{
         "request_id":"Z618978.R",
         "exposure":[
            [
               "372",
               "20000000.0",
               "31567227140.238808"
            ],
            [
               "373",
               "480000000.0",
               "96567227140.238808"
            ],
            [
               "374",
               "23300000.0",
               "251567627149.238808"
            ]
         ],
         "product":"ABC"
      }
   }
}

parsed_schema = fastavro.parse_schema(s)
fastavro.validate(data, parsed_schema)
