
How do I define a complex type in an Avro Schema?

I have reviewed the Avro documentation as well as several examples online (and similar StackOverflow questions). I then attempted to define an Avro schema, and had to progressively back out fields to determine what my issue was (the error message from the Avro library in Python was not as helpful as one would hope). I have a JSON document that I would like to convert to Avro, and I need a schema for that purpose (using avro-tools to generate the schema from the JSON did not work as expected and yielded an AvroTypeException when attempting to convert the JSON into Avro). I am using Avro version 1.7.7. Here is the JSON document for which I would like to define the Avro schema:

{
  "method": "Do_Thing",
  "code": 200,
  "reason": "OK",
  "siteId": {
    "string": "a1283632-121a-4a3f-9560-7b73830f94j8"
  }
}

I was able to define the schema for the non-complex types but not for the complex "siteId" field:

{
  "namespace" : "com.example",
  "name" : "methodEvent",
  "type" :  "record",
  "fields" : [
    {"name": "method", "type": "string"},
    {"name": "code", "type": "int"},
    {"name": "reason", "type": "string"},
    {"name": "siteId", "type": [ "null", "string" ]}
  ]
}

Attempting to use the previous schema to convert the JSON object to Avro yields an avro.io.AvroTypeException: The datum [See JSON Object above] is not an example of the schema [See Avro Schema Object above]. I only see this error when I define a field in the schema to represent the "siteId" field in the above JSON.

Avro's Python implementation represents unions differently than their JSON encoding: it "unwraps" them, so the siteId field is expected to be just the string, without the wrapping object. See below for a few examples.

Valid JSON encodings

Non-null siteId:

{
  "method": "Do_Thing",
  "code": 200,
  "reason": "OK",
  "siteId": {
    "string": "a1283632-121a-4a3f-9560-7b73830f94j8"
  }
}

Null siteId:

{
  "method": "Do_Thing",
  "code": 200,
  "reason": "OK",
  "siteId": null
}

Valid Python objects (in-memory representation)

Non-null siteId:

{
  "method": "Do_Thing",
  "code": 200,
  "reason": "OK",
  "siteId": "a1283632-121a-4a3f-9560-7b73830f94j8"
}

Null siteId:

{
  "method": "Do_Thing",
  "code": 200,
  "reason": "OK",
  "siteId": None
}

Note that nulls are unwrapped in both cases, which is why your solution isn't working.

Unfortunately, the Python implementation doesn't currently have a JSON decoder/encoder (AFAIK), so there is no easy way to translate between the two representations. Depending on the source of your JSON-encoded data, the simplest option might be to edit it so that union instances are no longer wrapped.
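If editing the data by hand is not practical, the unwrapping can be scripted. The following is a minimal sketch using only the standard library, not part of the Avro package: unwrap_unions is a hypothetical helper name, and it assumes the schema only uses primitives, records, and unions (arrays, maps, and namespaced branch names are not handled).

```python
import json

def unwrap_unions(datum, schema):
    """Turn a JSON-encoded datum into the unwrapped in-memory form.

    Hypothetical helper for illustration; covers only primitives,
    records, and unions.
    """
    if isinstance(schema, list):  # a union is written as a JSON array
        if datum is None:
            return None
        # The JSON encoding wraps non-null values as {"<branch name>": value}.
        if isinstance(datum, dict) and len(datum) == 1:
            branch, value = next(iter(datum.items()))
            for option in schema:
                if isinstance(option, str):
                    name = option
                else:
                    name = option.get("name") or option.get("type")
                if name == branch:
                    return unwrap_unions(value, option)
        return datum
    if isinstance(schema, dict) and schema.get("type") == "record":
        return {
            field["name"]: unwrap_unions(datum.get(field["name"]), field["type"])
            for field in schema["fields"]
        }
    return datum

schema = json.loads("""
{
  "namespace": "com.example",
  "name": "methodEvent",
  "type": "record",
  "fields": [
    {"name": "method", "type": "string"},
    {"name": "code", "type": "int"},
    {"name": "reason", "type": "string"},
    {"name": "siteId", "type": ["null", "string"]}
  ]
}
""")

wrapped = json.loads("""
{
  "method": "Do_Thing",
  "code": 200,
  "reason": "OK",
  "siteId": {"string": "a1283632-121a-4a3f-9560-7b73830f94j8"}
}
""")

unwrapped = unwrap_unions(wrapped, schema)
print(unwrapped["siteId"])
```

The result is a plain dict whose siteId is the bare string, which is the shape the in-memory examples above describe for the original union schema.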

I was able to resolve the issue with the following schema:

{
  "namespace" : "com.example",
  "name" : "methodEvent",
  "type" :  "record",
  "fields" : [
    {"name": "method", "type": "string"},
    {"name": "code", "type": "int"},
    {"name": "reason", "type": "string"},
    {
      "name": "siteId",
      "type": {
        "name" : "siteId",
        "type" : "record",
        "fields" : [
          {
            "name" : "string",
            "type" : [ "null", "string" ],
            "default" : null
          }
        ]
      }
    }
  ]
}
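To see why modelling siteId as a nested record accepts the original document while the plain union does not, here is a rough sketch of a structural check. It is an assumption-laden illustration, not the Avro library's validator: matches is a hypothetical function that handles only null, string, int, records, and unions, and the two schemas are trimmed copies of the ones discussed in this thread.

```python
import json

def matches(datum, schema):
    """Return True if an in-memory datum fits the schema (tiny Avro subset)."""
    if isinstance(schema, list):  # union: any branch may match
        return any(matches(datum, branch) for branch in schema)
    if isinstance(schema, dict):
        if schema["type"] == "record":
            return isinstance(datum, dict) and all(
                matches(datum.get(field["name"]), field["type"])
                for field in schema["fields"]
            )
        return matches(datum, schema["type"])
    if schema == "null":
        return datum is None
    if schema == "string":
        return isinstance(datum, str)
    if schema == "int":
        return isinstance(datum, int) and not isinstance(datum, bool)
    raise ValueError("type not supported by this sketch: %r" % (schema,))

common_fields = [
    {"name": "method", "type": "string"},
    {"name": "code", "type": "int"},
    {"name": "reason", "type": "string"},
]

# siteId as a plain union, as in the question
union_schema = {
    "name": "methodEvent",
    "type": "record",
    "fields": common_fields + [{"name": "siteId", "type": ["null", "string"]}],
}

# siteId as a record with one nullable field literally named "string"
nested_schema = {
    "name": "methodEvent",
    "type": "record",
    "fields": common_fields + [{
        "name": "siteId",
        "type": {
            "name": "siteId",
            "type": "record",
            "fields": [{"name": "string", "type": ["null", "string"]}],
        },
    }],
}

document = json.loads("""
{
  "method": "Do_Thing",
  "code": 200,
  "reason": "OK",
  "siteId": {"string": "a1283632-121a-4a3f-9560-7b73830f94j8"}
}
""")

print(matches(document, union_schema))   # False: siteId is a dict, not null or a string
print(matches(document, nested_schema))  # True: the wrapper object is treated as a record
```

The trade-off is that the nested-record schema bakes the JSON wrapper into the data model, so readers of the Avro data see siteId as a record with a field called "string" rather than as a nullable string.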
